Most Cited 2025 "causal perspective" Papers

22,274 papers found • Page 57 of 112

#11201

DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup

Zhen Qu, Xian Tao, Xinyi Gong et al.

ICCV 2025arXiv:2508.13560
2
citations
#11202

MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling

Yingyue Li, Bencheng Liao, Wenyu Liu et al.

ICCV 2025arXiv:2503.13440
2
citations
#11203

LiON-LoRA: Rethinking LoRA Fusion to Unify Controllable Spatial and Temporal Generation for Video Diffusion

Yisu Zhang, Chenjie Cao, Chaohui Yu et al.

ICCV 2025arXiv:2507.05678
2
citations
#11204

InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes

Zesong Yang, Bangbang Yang, Wenqi Dong et al.

ICCV 2025arXiv:2507.08416
2
citations
#11205

JailbreakDiffBench: A Comprehensive Benchmark for Jailbreaking Diffusion Models

Xiaolong Jin, Zixuan Weng, Hanxi Guo et al.

ICCV 2025
2
citations
#11206

Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Seogkyu Jeon, Kibeom Hong, Hyeran Byun

ICCV 2025arXiv:2512.03508
2
citations
#11207

Semantic versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification

Yuan Tian, Shuo Wang, Rongzhao Zhang et al.

ICCV 2025arXiv:2507.21703
2
citations
#11208

Supercharging Floorplan Localization with Semantic Rays

Yuval Grader, Hadar Averbuch-Elor

ICCV 2025arXiv:2507.09291
2
citations
#11209

G2SF: Geometry-Guided Score Fusion for Multimodal Industrial Anomaly Detection

Chengyu Tao, Xuanming Cao, Juan Du

ICCV 2025
2
citations
#11210

One Polyp Identifies All: One-Shot Polyp Segmentation with SAM via Cascaded Priors and Iterative Prompt Evolution

Xinyu Mao, Xiaohan Xing, Fei MENG et al.

ICCV 2025arXiv:2507.16337
2
citations
#11211

You Are Your Own Best Teacher: Achieving Centralized-level Performance in Federated Learning under Heterogeneous and Long-tailed Data

Shanshan Yan, Zexi Li, Chao Wu et al.

ICCV 2025arXiv:2503.06916
2
citations
#11212

GT-Mean Loss: A Simple Yet Effective Solution for Brightness Mismatch in Low-Light Image Enhancement

Jingxi Liao, Shijie Hao, Richang Hong et al.

ICCV 2025arXiv:2507.20148
2
citations
#11213

Learn2Synth: Learning Optimal Data Synthesis Using Hypergradients for Brain Image Segmentation

Xiaoling Hu, Xiangrui Zeng, Oula Puonti et al.

ICCV 2025arXiv:2411.16719
2
citations
#11214

COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation

Sanghyun Jo, Seo Lee, Seungwoo Lee et al.

ICCV 2025arXiv:2503.11439
2
citations
#11215

Learning Deblurring Texture Prior from Unpaired Data with Diffusion Model

Chengxu Liu, Lu Qi, Jinshan Pan et al.

ICCV 2025arXiv:2507.13599
2
citations
#11216

Identity-aware Language Gaussian Splatting for Open-vocabulary 3D Semantic Segmentation

SungMin Jang, Wonjun Kim

ICCV 2025
2
citations
#11217

Structure-aware Semantic Discrepancy and Consistency for 3D Medical Image Self-supervised Learning

Tan Pan, Zhaorui Tan, Kaiyu Guo et al.

ICCV 2025arXiv:2507.02581
2
citations
#11218

DASH: 4D Hash Encoding with Self-Supervised Decomposition for Real-Time Dynamic Scene Rendering

Jie Chen, Zhangchi Hu, Peixi Wu et al.

ICCV 2025arXiv:2507.19141
2
citations
#11219

EVT: Efficient View Transformation for Multi-Modal 3D Object Detection

Yongjin Lee, Hyeon-Mun Jeong, Yurim Jeon et al.

ICCV 2025arXiv:2411.10715
2
citations
#11220

An Inversion-based Measure of Memorization for Diffusion Models

Zhe Ma, Qingming Li, Xuhong Zhang et al.

ICCV 2025arXiv:2405.05846
2
citations
#11221

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications

Omkar Thawakar, Dmitry Demidov, Ritesh Thawkar et al.

ICCV 2025arXiv:2508.14039
2
citations
#11222

Balanced Sharpness-Aware Minimization for Imbalanced Regression

Yahao Liu, Qin Wang, Lixin Duan et al.

ICCV 2025arXiv:2508.16973
2
citations
#11223

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

Qihui Zhang, Munan Ning, Zheyuan Liu et al.

CVPR 2025arXiv:2503.14941
2
citations
#11224

FICGen: Frequency-Inspired Contextual Disentanglement for Layout-driven Degraded Image Generation

Wenzhuang Wang, Yifan Zhao, Mingcan Ma et al.

ICCV 2025arXiv:2509.01107
2
citations
#11225

GLEAM: Enhanced Transferable Adversarial Attacks for Vision-Language Pre-training Models via Global-Local Transformations

Yunqi Liu, Xiaohui Cui, Ouyang Xue

ICCV 2025
2
citations
#11226

Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation

Yong Liu, Song-Li Wu, Sule Bai et al.

ICCV 2025arXiv:2506.16058
2
citations
#11227

HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

Jiahe Zhao, RuiBing Hou, Zejie Tian et al.

ICCV 2025arXiv:2503.12955
2
citations
#11228

Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining

Qi Fan, Kaiqi Liu, Nian Liu et al.

ICCV 2025arXiv:2504.21414
2
citations
#11229

DOGR: Towards Versatile Visual Document Grounding and Referring

Yinan Zhou, Yuxin Chen, Haokun Lin et al.

ICCV 2025arXiv:2411.17125
2
citations
#11230

ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition

Sanjoy Kundu, Shanmukha Vellamcheti, Sathyanarayanan Aakur

ICCV 2025arXiv:2504.03948
2
citations
#11231

Disentangling Instance and Scene Contexts for 3D Semantic Scene Completion

Enyu Liu, En Yu, Sijia Chen et al.

ICCV 2025arXiv:2507.08555
2
citations
#11232

Adaptive Routing of Text-to-Image Generation Requests Between Large Cloud Model and Light-Weight Edge Model

Zewei Xin, Qinya Li, Chaoyue Niu et al.

ICCV 2025arXiv:2411.13787
2
citations
#11233

AnyPortal: Zero-Shot Consistent Video Background Replacement

Wenshuo Gao, Xicheng Lan, Shuai Yang

ICCV 2025arXiv:2509.07472
2
citations
#11234

LC-Mamba: Local and Continuous Mamba with Shifted Windows for Frame Interpolation

Min Wu Jeong, Chae Eun Rhee

CVPR 2025
2
citations
#11235

SciVid: Cross-Domain Evaluation of Video Models in Scientific Applications

Yana Hasson, Pauline Luc, Liliane Momeni et al.

ICCV 2025arXiv:2507.03578
2
citations
#11236

VSC: Visual Search Compositional Text-to-Image Diffusion Model

Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.

ICCV 2025arXiv:2505.01104
2
citations
#11237

Counting Stacked Objects

Corentin Dumery, Noa Ette, Aoxiang Fan et al.

ICCV 2025arXiv:2411.19149
2
citations
#11238

Bi-Level Optimization for Self-Supervised AI-Generated Face Detection

Mian Zou, Nan Zhong, Baosheng Yu et al.

ICCV 2025highlightarXiv:2507.22824
2
citations
#11239

DCT-Shield: A Robust Frequency Domain Defense against Malicious Image Editing

Aniruddha Bala, Rohit Chowdhury, Rohan Jaiswal et al.

ICCV 2025highlightarXiv:2504.17894
2
citations
#11240

MagShield: Towards Better Robustness in Sparse Inertial Motion Capture Under Magnetic Disturbances

Yunzhe Shao, Xinyu Yi, Lu Yin et al.

ICCV 2025arXiv:2506.22907
2
citations
#11241

Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models

Young Kyun Jang, Ser-Nam Lim

ICCV 2025arXiv:2405.14715
2
citations
#11242

LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing

Achint Soni, Meet Soni, Sirisha Rambhatla

ICCV 2025arXiv:2503.21541
2
citations
#11243

From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning

Yuhui Zeng, Haoxiang Wu, Wenjie Nie et al.

ICCV 2025arXiv:2502.05843
2
citations
#11244

CarGait: Cross-Attention based Re-ranking for Gait recognition

Gavriel Habib, Noa Barzilay, Or Shimshi et al.

ICCV 2025arXiv:2503.03501
2
citations
#11245

PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior

Seunggwan Lee, Hwanhee Jung, ByoungSoo Koh et al.

ICCV 2025arXiv:2503.12834
2
citations
#11246

Preacher: Paper-to-Video Agentic System

Jingwei Liu, Ling Yang, Hao Luo et al.

ICCV 2025arXiv:2508.09632
2
citations
#11247

MR-FIQA: Face Image Quality Assessment with Multi-Reference Representations from Synthetic Data Generation

Fu-Zhao Ou, Chongyi Li, Shiqi Wang et al.

ICCV 2025
2
citations
#11248

Gait-X: Exploring X modality for Generalized Gait Recognition

Zengbin Wang, Saihui Hou, Junjie Li et al.

ICCV 2025
2
citations
#11249

Adaptive Articulated Object Manipulation On The Fly with Foundation Model Reasoning and Part Grounding

Xiaojie Zhang, Yuanfei Wang, Ruihai Wu et al.

ICCV 2025arXiv:2507.18276
2
citations
#11250

Do Your Best and Get Enough Rest for Continual Learning

Hankyul Kang, Gregor Seifer, Donghyun Lee et al.

CVPR 2025arXiv:2503.18371
2
citations
#11251

Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling

Hayeon Kim, Ji Ha Jang, Se Young Chun

ICCV 2025arXiv:2507.11061
2
citations
#11252

VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data

Jian Shi, Peter Wonka

ICCV 2025arXiv:2312.08871
2
citations
#11253

Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding

Huy Ta, Duy Anh Huynh, Yutong Xie et al.

ICCV 2025highlightarXiv:2505.15123
2
citations
#11254

LoRAverse: A Submodular Framework to Retrieve Diverse Adapters for Diffusion Models

Mert Sonmezer, Matthew Zheng, Pinar Yanardag

ICCV 2025arXiv:2510.15022
2
citations
#11255

Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation

Guanyi Qin, Ziyue Wang, Daiyun Shen et al.

ICCV 2025highlightarXiv:2507.18944
2
citations
#11256

Autoregressive Denoising Score Matching is a Good Video Anomaly Detector

Hanwen Zhang, Congqi Cao, Qinyi Lv et al.

ICCV 2025arXiv:2506.23282
2
citations
#11257

SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior

Bo Zhao, Haoran Wang, Jinghui Wang et al.

ICCV 2025highlightarXiv:2510.15749
2
citations
#11258

RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS

Chuanyu Fu, Yuqi Zhang, Kunbin Yao et al.

ICCV 2025arXiv:2506.02751
2
citations
#11259

Denoising Token Prediction in Masked Autoregressive Models

Ting Yao, Yehao Li, Yingwei Pan et al.

ICCV 2025
2
citations
#11260

Consistency Trajectory Matching for One-Step Generative Super-Resolution

Weiyi You, Mingyang Zhang, Leheng Zhang et al.

ICCV 2025arXiv:2503.20349
2
citations
#11261

LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping

Pascal Chang, Sergio Sancho, Jingwei Tang et al.

CVPR 2025arXiv:2504.08902
2
citations
#11262

Supercharged One-step Text-to-Image Diffusion Models with Negative Prompts

Viet Nguyen, Anh Nguyen, Trung Dao et al.

ICCV 2025arXiv:2412.02687
2
citations
#11263

CVPT: Cross Visual Prompt Tuning

Lingyun Huang, Jianxu Mao, Junfei YI et al.

ICCV 2025arXiv:2408.14961
2
citations
#11264

Addressing Text Embedding Leakage in Diffusion-based Image Editing

Sunung Mun, Jinhwan Nam, Sunghyun Cho et al.

ICCV 2025arXiv:2412.04715
2
citations
#11265

InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis

Tao Han, Wanghan Xu, Junchao Gong et al.

ICCV 2025arXiv:2509.10441
2
citations
#11266

Efficient Multi-Person Motion Prediction by Lightweight Spatial and Temporal Interactions

Yuanhong Zheng, Ruixuan Yu, Jian Sun

ICCV 2025arXiv:2507.09446
2
citations
#11267

FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models

Mainak Singha, Subhankar Roy, Sarthak Mehrotra et al.

ICCV 2025arXiv:2504.20860
2
citations
#11268

CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models

Junho Kim, Hyungjin Chung, Byung-Hoon Kim

ICCV 2025arXiv:2411.06869
2
citations
#11269

Subjective Camera 1.0: Bridging Human Cognition and Visual Reconstruction through Sequence-Aware Sketch-Guided Diffusion

Haoyang Chen, Dongfang Sun, Caoyuan Ma et al.

ICCV 2025arXiv:2506.23711
2
citations
#11270

Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions

Liang Xu, Chengqun Yang, Zili Lin et al.

ICCV 2025arXiv:2508.04681
2
citations
#11271

ViLU: Learning Vision-Language Uncertainties for Failure Prediction

Marc Lafon, Yannis Karmim, Julio Silva-Rodríguez et al.

ICCV 2025arXiv:2507.07620
2
citations
#11272

High-Fidelity Lightweight Mesh Reconstruction from Point Clouds

Chen Zhang, Wentao Wang, Ximeng Li et al.

CVPR 2025highlight
2
citations
#11273

HouseTour: A Virtual Real Estate A(I)gent

Ata Çelen, Iro Armeni, Daniel Barath et al.

ICCV 2025arXiv:2510.18054
2
citations
#11274

Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection

Yehao Lu, Minghe Weng, Zekang Xiao et al.

ICCV 2025arXiv:2507.17436
2
citations
#11275

SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting

Zihui Gao, Jia-Wang Bian, Guosheng Lin et al.

ICCV 2025arXiv:2507.15602
2
citations
#11276

Progressive Growing of Video Tokenizers for Temporally Compact Latent Spaces

Aniruddha Mahapatra, Long Mai, David Bourgin et al.

ICCV 2025arXiv:2501.05442
2
citations
#11277

Enhancing Transformers Through Conditioned Embedded Tokens

Hemanth Saratchandran, Simon Lucey

ICCV 2025arXiv:2505.12789
2
citations
#11278

Joint Asymmetric Loss for Learning with Noisy Labels

Jialiang Wang, Xianming Liu, Xiong Zhou et al.

ICCV 2025arXiv:2507.17692
2
citations
#11279

Trade-offs in Image Generation: How Do Different Dimensions Interact?

Sicheng Zhang, Binzhu Xie, Zhonghao Yan et al.

ICCV 2025arXiv:2507.22100
2
citations
#11280

LOTA: Bit-Planes Guided AI-Generated Image Detection

Renxi Cheng, Hongsong Wang, Yang Zhang et al.

ICCV 2025arXiv:2510.14230
2
citations
#11281

Minimal Variance Model Aggregation: A principled, non-intrusive, and versatile integration of black box models

Theo Bourdais, Houman Owhadi

ICLR 2025arXiv:2409.17267
2
citations
#11282

DNF-Intrinsic: Deterministic Noise-Free Diffusion for Indoor Inverse Rendering

Rongjia Zheng, Qing Zhang, Chengjiang Long et al.

ICCV 2025arXiv:2507.03924
2
citations
#11283

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

Hyungjin Kim, Seokho Ahn, Young-Duk Seo

ICCV 2025arXiv:2508.03481
2
citations
#11284

Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing

Chen Liao, Yan Shen, Dan Li et al.

CVPR 2025arXiv:2503.08429
2
citations
#11285

Stylized-Face: A Million-level Stylized Face Dataset for Face Recognition

Zhengyuan Peng, Jianqing Xu, Yuge Huang et al.

ICCV 2025
2
citations
#11286

MoSiC: Optimal-Transport Motion Trajectory for Dense Self-Supervised Learning

Mohammadreza Salehi, Shashanka Venkataramanan, Ioana Simion et al.

ICCV 2025arXiv:2506.08694
2
citations
#11287

Holistic Tokenizer for Autoregressive Image Generation

Anlin Zheng, Haochen Wang, Yucheng Zhao et al.

ICCV 2025arXiv:2507.02358
2
citations
#11288

PLA: Prompt Learning Attack against Text-to-Image Generative Models

Xinqi Lyu, Yihao Liu, Yanjie Li et al.

ICCV 2025arXiv:2508.03696
2
citations
#11289

MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective

Weitian Wang, Shubham Rai, Cecilia De la Parra et al.

ICCV 2025arXiv:2507.19131
2
citations
#11290

CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation

Xiao Lin, Yun Peng, Liuyi Wang et al.

ICCV 2025arXiv:2502.01312
2
citations
#11291

LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering

Xiaohang Zhan, Dingming Liu

ICCV 2025arXiv:2508.07647
2
citations
#11292

Generalized Few-Shot Point Cloud Segmentation via LLM-Assisted Hyper-Relation Matching

Zhaoyang Li, Yuan Wang, Guoxin Xiong et al.

ICCV 2025
2
citations
#11293

MIRACLE 3D: Memory-efficient Integrated Robust Approach for Continual Learning on 3D Point Clouds via Shape Model Construction

Hossein Resani, Behrooz Nasihatkon

ICLR 2025
2
citations
#11294

Monocular Facial Appearance Capture in the Wild

Yingyan Xu, Kate Gadola, Prashanth Chandran et al.

ICCV 2025arXiv:2412.12765
2
citations
#11295

AMD: Adaptive Momentum and Decoupled Contrastive Learning Framework for Robust Long-Tail Trajectory Prediction

Bin Rao, Haicheng Liao, Yanchen Guan et al.

ICCV 2025arXiv:2507.01801
2
citations
#11296

Deep Incomplete Multi-view Clustering with Distribution Dual-Consistency Recovery Guidance

Jiaqi Jin, Siwei Wang, Zhibin Dong et al.

ICCV 2025arXiv:2503.11017
2
citations
#11297

Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval

Ziwei Wang, Sameera Ramasinghe, Chenchen Xu et al.

ICCV 2025arXiv:2411.17490
2
citations
#11298

D2ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition

Wenjie Pei, Qizhong Tan, Guangming Lu et al.

ICCV 2025
2
citations
#11299

FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy

Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.

CVPR 2025arXiv:2503.17197
2
citations
#11300

DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models

Revant Teotia, Candace Ross, Karen Ullrich et al.

ICCV 2025arXiv:2506.05108
2
citations
#11301

DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing

Yang JingYi, Xun Lin, Zitong YU et al.

ICCV 2025arXiv:2503.00429
2
citations
#11302

Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization

Bingqing Zhang, Zhuo Cao, Heming Du et al.

ICCV 2025arXiv:2507.15504
2
citations
#11303

Timestep-Aware Diffusion Model for Extreme Image Rescaling

Ce Wang, Zhenyu Hu, Wanjie Sun et al.

ICCV 2025arXiv:2408.09151
2
citations
#11304

Adversarial Attention Perturbations for Large Object Detection Transformers

Zachary Yahn, Selim Tekin, Fatih Ilhan et al.

ICCV 2025arXiv:2508.02987
2
citations
#11305

HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image

Junyi Guo, Jingxuan Zhang, Fangyu Wu et al.

ICCV 2025arXiv:2505.23186
2
citations
#11306

What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models

Lorenzo Baraldi, Davide Bucciarelli, Federico Betti et al.

ICCV 2025arXiv:2505.20405
2
citations
#11307

DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data

Junjie Wu, Jiangtao Xie, Zhaolin Zhang et al.

ICCV 2025arXiv:2504.01386
2
citations
#11308

Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation

Siyu Chen, Ting Han, Changshe Zhang et al.

ICCV 2025arXiv:2504.12753
2
citations
#11309

Aligning Constraint Generation with Design Intent in Parametric CAD

Evan Casey, Tianyu Zhang, Shu Ishida et al.

ICCV 2025arXiv:2504.13178
2
citations
#11310

CCL-LGS: Contrastive Codebook Learning for 3D Language Gaussian Splatting

Lei Tian, Xiaomin Li, Liqian Ma et al.

ICCV 2025arXiv:2505.20469
2
citations
#11311

SAGI: Semantically Aligned and Uncertainty Guided AI Image Inpainting

Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos et al.

ICCV 2025arXiv:2502.06593
2
citations
#11312

MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion

Fei Peng, Junqiang Wu, Yan Li et al.

ICCV 2025arXiv:2508.14440
2
citations
#11313

StyleKeeper: Prevent Content Leakage using Negative Visual Query Guidance

Jaeseok Jeong, Junho Kim, Youngjung Uh et al.

ICCV 2025arXiv:2510.06827
2
citations
#11314

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

Zeyi Sun, Tong Wu, Pan Zhang et al.

ICCV 2025arXiv:2406.00093
2
citations
#11315

MoGA: 3D Generative Avatar Prior for Monocular Gaussian Avatar Reconstruction

Zijian Dong, Longteng Duan, Jie Song et al.

ICCV 2025highlightarXiv:2507.23597
2
citations
#11316

MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation

Yanchen Liu, Yanan SUN, Zhening Xing et al.

ICCV 2025arXiv:2507.16310
2
citations
#11317

GroundFlow: A Plug-in Module for Temporal Reasoning on 3D Point Cloud Sequential Grounding

Zijun Lin, Shuting He, Cheston Tan et al.

ICCV 2025arXiv:2506.21188
2
citations
#11318

SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation

Shiqi Huang, Shuting He, Huaiyuan Qin et al.

ICCV 2025highlightarXiv:2507.12857
2
citations
#11319

RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions

Bimsara Pathiraja, Maitreya Patel, Shivam Singh et al.

ICCV 2025arXiv:2506.03448
2
citations
#11320

InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild

Yiyi Ma, Yuanzhi Liang, Xiu Li et al.

ICCV 2025arXiv:2508.10297
2
citations
#11321

SPADE: Spatial-Aware Denoising Network for Open-vocabulary Panoptic Scene Graph Generation with Long- and Local-range Context Reasoning

XIN Hu, Ke Qin, Guiduo Duan et al.

ICCV 2025arXiv:2507.05798
2
citations
#11322

Towards a 3D Transfer-based Black-box Attack via Critical Feature Guidance

Shuchao Pang, Zhenghan Chen, Shen Zhang et al.

ICCV 2025arXiv:2508.15650
2
citations
#11323

DepthSync: Diffusion Guidance-Based Depth Synchronization for Scale- and Geometry-Consistent Video Depth Estimation

Yue-Jiang Dong, Wang Zhao, Jiale Xu et al.

ICCV 2025arXiv:2507.01603
2
citations
#11324

PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions

Mahesh Bhosale, Abdul Wasi, Yuanhao Zhai et al.

ICCV 2025arXiv:2506.23440
2
citations
#11325

SDMatte: Grafting Diffusion Models for Interactive Matting

Longfei Huang, Yu Liang, Hao Zhang et al.

ICCV 2025arXiv:2508.00443
2
citations
#11326

Retinex-MEF: Retinex-based Glare Effects Aware Unsupervised Multi-Exposure Image Fusion

Haowen Bai, Jiangshe Zhang, Zixiang Zhao et al.

ICCV 2025arXiv:2503.07235
2
citations
#11327

When Schrödinger Bridge Meets Real-World Image Dehazing with Unpaired Training

Yunwei Lan, Zhigao Cui, Xin Luo et al.

ICCV 2025arXiv:2507.09524
2
citations
#11328

Hawaii: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models

Yimu Wang, Mozhgan Nasr Azadani, Sean Sedwards et al.

NEURIPS 2025arXiv:2506.19072
2
citations
#11329

When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product

Youqi WU, Jingwei Zhang, Farzan Farnia

NEURIPS 2025arXiv:2506.08645
2
citations
#11330

Limitations of Normalization in Attention

Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova et al.

NEURIPS 2025arXiv:2508.17821
2
citations
#11331

MatchDiffusion: Training-free Generation of Match-Cuts

Alejandro Pardo, Fabio Pizzati, Tong Zhang et al.

ICCV 2025arXiv:2411.18677
2
citations
#11332

Information Theoretic Learning for Diffusion Models with Warm Start

Yirong Shen, Lu GAN, Cong Ling

NEURIPS 2025arXiv:2510.20903
2
citations
#11333

Resource-Constrained Federated Continual Learning: What Does Matter?

Yichen Li, Yuying Wang, Jiahua Dong et al.

NEURIPS 2025arXiv:2501.08737
2
citations
#11334

V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation

Hanyue Lou, Jinxiu Liang, Minggui Teng et al.

NEURIPS 2025oralarXiv:2505.16797
2
citations
#11335

IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features

Anand Kumar, Jiteng Mu, Nuno Vasconcelos

ICCV 2025arXiv:2412.14432
2
citations
#11336

AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models

Xinyi Wang, Xun Yang, Yanlong Xu et al.

NEURIPS 2025arXiv:2511.10017
2
citations
#11337

Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples

Suqin Yuan, Lei Feng, Bo Han et al.

NEURIPS 2025arXiv:2502.08227
2
citations
#11338

NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding

Wei Xu, Cheng Wang, Dingkang Liang et al.

NEURIPS 2025arXiv:2510.27481
2
citations
#11339

Physics-informed Reduced Order Modeling of Time-dependent PDEs via Differentiable Solvers

Nima Hosseini Dashtbayaz, Hesam Salehipour, Adrian Butscher et al.

NEURIPS 2025arXiv:2505.14595
2
citations
#11340

LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables

Xunpeng Yi, Yibing Zhang, Xinyu Xiang et al.

ICCV 2025arXiv:2509.00346
2
citations
#11341

The Promise of RL for Autoregressive Image Editing

Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.

NEURIPS 2025arXiv:2508.01119
2
citations
#11342

Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO

Kaiyang Guo, Yinchuan Li, Zhitang Chen

NEURIPS 2025arXiv:2505.23316
2
citations
#11343

PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning

Zongqian Li, Yixuan Su, Nigel Collier

NEURIPS 2025arXiv:2505.09519
2
citations
#11344

Gradient Variance Reveals Failure Modes in Flow-Based Generative Models

Teodora Reu, Sixtine Dromigny, Michael Bronstein et al.

NEURIPS 2025spotlightarXiv:2510.18118
2
citations
#11345

Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool

Jiangtong Li, Dongyi Liu, Kun Zhu et al.

NEURIPS 2025arXiv:2412.17213
2
citations
#11346

SPARKE: Scalable Prompt-Aware Diversity and Novelty Guidance in Diffusion Models via RKE Score

Mohammad Jalali, Haoyu Lei, Amin Gohari et al.

NEURIPS 2025arXiv:2506.10173
2
citations
#11347

Reward-Aware Proto-Representations in Reinforcement Learning

Hon Tik Tse, Siddarth Chandrasekar, Marlos C. Machado

NEURIPS 2025oralarXiv:2505.16217
2
citations
#11348

Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling

LI XIAOJIE, Ronghui Li, Shukai Fang et al.

ICCV 2025arXiv:2507.14915
2
citations
#11349

TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence

Feng Jiang, Mangal Prakash, Hehuan Ma et al.

NEURIPS 2025spotlightarXiv:2506.21028
2
citations
#11350

Towards Self-Refinement of Vision-Language Models with Triangular Consistency

Yunlong Deng, Guangyi Chen, Tianpei Gu et al.

NEURIPS 2025arXiv:2510.10487
2
citations
#11351

A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking

Gal Fadlon, Idan Arbiv, Nimrod Berman et al.

NEURIPS 2025arXiv:2510.06699
2
citations
#11352

Towards Explicit Exoskeleton for the Reconstruction of Complicated 3D Human Avatars

Yifan Zhan, Qingtian Zhu, Muyao Niu et al.

ICCV 2025arXiv:2410.08082
2
citations
#11353

On-Device Diffusion Transformer Policy for Efficient Robot Manipulation

Yiming Wu, Huan Wang, Zhenghao Chen et al.

ICCV 2025arXiv:2508.00697
2
citations
#11354

Metric Convolutions: A Unifying Theory to Adaptive Image Convolutions

Thomas Dagès, Michael Lindenbaum, Alfred Bruckstein

ICCV 2025arXiv:2406.05400
2
citations
#11355

A duality framework for analyzing random feature and two-layer neural networks

Hongrui Chen, Jihao Long, Lei Wu

NEURIPS 2025arXiv:2305.05642
2
citations
#11356

Improved Learning Theory for Kernel Distribution Regression with Two-Stage Sampling

Alberto González-Sanz, François Bachoc, Jean-Michel Loubes et al.

NEURIPS 2025arXiv:2308.14335
2
citations
#11357

AI Testing Should Account for Sophisticated Strategic Behaviour

Vojta Kovarik, Eric Chen, Sami Petersen et al.

NEURIPS 2025arXiv:2508.14927
2
citations
#11358

Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor

Alexandra Olteanu, Su Lin Blodgett, Agathe Balayn et al.

NEURIPS 2025arXiv:2506.14652
2
citations
#11359

VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction

Martin de La Gorce, Charlie Hewitt, Tibor Takács et al.

ICCV 2025arXiv:2507.21311
2
citations
#11360

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

Ziheng Cheng, Yixiao Huang, Hui Xu et al.

NEURIPS 2025arXiv:2505.21347
2
citations
#11361

Struct-Bench: A Benchmark for Differentially Private Structured Text Generation

Shuaiqi Wang, Vikas Raunak, Arturs Backurs et al.

NEURIPS 2025arXiv:2509.10696
2
citations
#11362

Factorio Learning Environment

Jack Hopkins, Mart Bakler, Akbir Khan

NEURIPS 2025arXiv:2503.09617
2
citations
#11363

Dense Backpropagation Improves Training for Sparse Mixture-of-Experts

Ashwinee Panda, Vatsal Baherwani, Zain Sarwar et al.

NEURIPS 2025arXiv:2504.12463
2
citations
#11364

Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data

Shlomi Hod, Lucas Rosenblatt, Julia Stoyanovich

NEURIPS 2025arXiv:2504.14368
2
citations
#11365

Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition

Pulkit Kumar, Shuaiyi Huang, Matthew Walmer et al.

ICCV 2025arXiv:2508.03695
2
citations
#11366

DisenQ: Disentangling Q-Former for Activity-Biometrics

Shehreen Azad, Yogesh Rawat

ICCV 2025highlightarXiv:2507.07262
2
citations
#11367

QUT-DV25: A Dataset for Dynamic Analysis of Next-Gen Software Supply Chain Attacks

Sk Tanzir Mehedi, Raja Jurdak, Chadni Islam et al.

NEURIPS 2025arXiv:2505.13804
2
citations
#11368

NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval

Zengrong Lin, Zheng Wang, Tianwen Qian et al.

CVPR 2025arXiv:2503.10526
2
citations
#11369

SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing

Heyi Sun, Cong Wang, Tian-Xing Xu et al.

ICCV 2025arXiv:2508.09597
2
citations
#11370

ChartCap: Mitigating Hallucination of Dense Chart Captioning

Junyoung Lim, Jaewoo Ahn, Gunhee Kim

ICCV 2025highlightarXiv:2508.03164
2
citations
#11371

AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark

Aruna Gauba, Irene Pi, Yunze Man et al.

NEURIPS 2025arXiv:2504.10568
2
citations
#11372

MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models

Hang Hua, Ziyun Zeng, Yizhi Song et al.

NEURIPS 2025arXiv:2505.19415
2
citations
#11373

Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment

Samuel (Min-Hsuan) Yeh, Sharon Li

NEURIPS 2025arXiv:2509.23564
2
citations
#11374

Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations

Tal Barami, Nimrod Berman, Ilan Naiman et al.

NEURIPS 2025arXiv:2510.17313
2
citations
#11375

CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning

Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.

NEURIPS 2025arXiv:2507.03707
2
citations
#11376

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research

A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen et al.

NEURIPS 2025oralarXiv:2412.06966
2
citations
#11377

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

Tianhao Peng, Haochen Wang, Yuanxing Zhang et al.

NEURIPS 2025arXiv:2511.07250
2
citations
#11378

IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A

Chen Li, Chinthani Sugandhika, Ee Yeo Keat et al.

ICCV 2025arXiv:2508.01984
2
citations
#11379

Blind2Sound: Self-Supervised Image Denoising without Residual Noise

Jiazheng Liu, Zejin Wang, Bohao Chen et al.

ICCV 2025arXiv:2303.05183
2
citations
#11380

Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts

Andrea Pugnana, Riccardo Massidda, Francesco Giannini et al.

NEURIPS 2025arXiv:2503.16199
2
citations
#11381

EngiBench: A Framework for Data-Driven Engineering Design Research

Florian Felten, Gabriel Apaza, Gerhard Bräunlich et al.

NEURIPS 2025arXiv:2508.00831
2
citations
#11382

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Boyang Deng, Kyle Genova, Songyou Peng et al.

ICCV 2025highlightarXiv:2504.08727
2
citations
#11383

MTBBench: A Multimodal Sequential Clinical Decision-Making Benchmark in Oncology

Kiril Vasilev, Alexandre Misrahi, Eeshaan Jain et al.

NEURIPS 2025arXiv:2511.20490
2
citations
#11384

BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks

Anna Sokol, Elizabeth Daly, Michael Hind et al.

NEURIPS 2025arXiv:2410.12974
2
citations
#11385

Bridging the Skeleton-Text Modality Gap: Diffusion-Powered Modality Alignment for Zero-shot Skeleton-based Action Recognition

Jeonghyeok Do, Munchurl Kim

ICCV 2025arXiv:2411.10745
2
citations
#11386

PERSONA: Personalized Whole-Body 3D Avatar with Pose-Driven Deformations from a Single Image

Geonhee Sim, Gyeongsik Moon

ICCV 2025arXiv:2508.09973
2
citations
#11387

LIFEBENCH: Evaluating Length Instruction Following in Large Language Models

Wei Zhang, Zhenhong Zhou, Kun Wang et al.

NEURIPS 2025arXiv:2505.16234
2
citations
#11388

ExAct: A Video-Language Benchmark for Expert Action Analysis

Han Yi, Yulu Pan, Feihong He et al.

NEURIPS 2025arXiv:2506.06277
2
citations
#11389

BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model

Weilin Lin, Nanjun Zhou, Yanyun Wang et al.

NEURIPS 2025arXiv:2502.11798
2
citations
#11390

Measuring Fingerprints of Web-filtered Text Datasets and Fingerprint Propagation Through Training

Youssef Mansour, Reinhard Heckel

NEURIPS 2025spotlightarXiv:2412.02857
2
citations
#11391

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Valerii Startsev, Alexander Ustyuzhanin, Alexey Kirillov et al.

NEURIPS 2025arXiv:2505.19297
2
citations
#11392

Synchronization of Multiple Videos

Avihai Naaman, Ron Shapira Weber, Oren Freifeld

ICCV 2025arXiv:2510.14051
2
citations
#11393

AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

Sebastian Joseph, Syed M. Husain, Stella Offner et al.

NEURIPS 2025arXiv:2505.20538
2
citations
#11394

What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning

Chi-Hsi Kung, Frangil Ramirez, Juhyung Ha et al.

ICCV 2025arXiv:2503.21055
2
citations
#11395

PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors

Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.

NEURIPS 2025arXiv:2507.15550
2
citations
#11396

The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

Bingchen Zhao, Despoina Magka, Minqi Jiang et al.

NEURIPS 2025arXiv:2506.22419
2
citations
#11397

DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios

Yao Huang, Yitong Sun, Yichi Zhang et al.

NEURIPS 2025oralarXiv:2510.15501
2
citations
#11398

A Practical Guide for Incorporating Symmetry in Diffusion Policy

Dian Wang, Boce Hu, Shuran Song et al.

NEURIPS 2025arXiv:2505.13431
2
citations
#11399

SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

Ahmed Heakl, Yahia Salaheldin Shaaban, Salem Lahlou et al.

NEURIPS 2025arXiv:2505.21887
2
citations
#11400

SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference

Yi Zhao, Yajuan Peng, Nguyen Cam-Tu et al.

NEURIPS 2025spotlightarXiv:2508.02751
2
citations