Most Cited 2025 "causal perspective" Papers

22,274 papers found • Page 57 of 112

#11201

DictAS: A Framework for Class-Generalizable Few-Shot Anomaly Segmentation via Dictionary Lookup

Zhen Qu, Xian Tao, Xinyi Gong et al.

ICCV 2025arXiv:2508.13560

citations

#11202

MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling

Yingyue Li, Bencheng Liao, Wenyu Liu et al.

ICCV 2025arXiv:2503.13440

citations

#11203

LiON-LoRA: Rethinking LoRA Fusion to Unify Controllable Spatial and Temporal Generation for Video Diffusion

Yisu Zhang, Chenjie Cao, Chaohui Yu et al.

ICCV 2025arXiv:2507.05678

citations

#11204

InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes

Zesong Yang, Bangbang Yang, Wenqi Dong et al.

ICCV 2025arXiv:2507.08416

citations

#11205

JailbreakDiffBench: A Comprehensive Benchmark for Jailbreaking Diffusion Models

Xiaolong Jin, Zixuan Weng, Hanxi Guo et al.

ICCV 2025

citations

#11206

Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Seogkyu Jeon, Kibeom Hong, Hyeran Byun

ICCV 2025arXiv:2512.03508

citations

#11207

Semantic versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification

Yuan Tian, Shuo Wang, Rongzhao Zhang et al.

ICCV 2025arXiv:2507.21703

citations

#11208

Supercharging Floorplan Localization with Semantic Rays

Yuval Grader, Hadar Averbuch-Elor

ICCV 2025arXiv:2507.09291

citations

#11209

G2SF: Geometry-Guided Score Fusion for Multimodal Industrial Anomaly Detection

Chengyu Tao, Xuanming Cao, Juan Du

ICCV 2025

citations

#11210

One Polyp Identifies All: One-Shot Polyp Segmentation with SAM via Cascaded Priors and Iterative Prompt Evolution

Xinyu Mao, Xiaohan Xing, Fei MENG et al.

ICCV 2025arXiv:2507.16337

citations

#11211

You Are Your Own Best Teacher: Achieving Centralized-level Performance in Federated Learning under Heterogeneous and Long-tailed Data

Shanshan Yan, Zexi Li, Chao Wu et al.

ICCV 2025arXiv:2503.06916

citations

#11212

GT-Mean Loss: A Simple Yet Effective Solution for Brightness Mismatch in Low-Light Image Enhancement

Jingxi Liao, Shijie Hao, Richang Hong et al.

ICCV 2025arXiv:2507.20148

citations

#11213

Learn2Synth: Learning Optimal Data Synthesis Using Hypergradients for Brain Image Segmentation

Xiaoling Hu, Xiangrui Zeng, Oula Puonti et al.

ICCV 2025arXiv:2411.16719

citations

#11214

COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation

Sanghyun Jo, Seo Lee, Seungwoo Lee et al.

ICCV 2025arXiv:2503.11439

citations

#11215

Learning Deblurring Texture Prior from Unpaired Data with Diffusion Model

Chengxu Liu, Lu Qi, Jinshan Pan et al.

ICCV 2025arXiv:2507.13599

citations

#11216

Identity-aware Language Gaussian Splatting for Open-vocabulary 3D Semantic Segmentation

SungMin Jang, Wonjun Kim

ICCV 2025

citations

#11217

Structure-aware Semantic Discrepancy and Consistency for 3D Medical Image Self-supervised Learning

Tan Pan, Zhaorui Tan, Kaiyu Guo et al.

ICCV 2025arXiv:2507.02581

citations

#11218

DASH: 4D Hash Encoding with Self-Supervised Decomposition for Real-Time Dynamic Scene Rendering

Jie Chen, Zhangchi Hu, Peixi Wu et al.

ICCV 2025arXiv:2507.19141

citations

#11219

EVT: Efficient View Transformation for Multi-Modal 3D Object Detection

Yongjin Lee, Hyeon-Mun Jeong, Yurim Jeon et al.

ICCV 2025arXiv:2411.10715

citations

#11220

An Inversion-based Measure of Memorization for Diffusion Models

Zhe Ma, Qingming Li, Xuhong Zhang et al.

ICCV 2025arXiv:2405.05846

citations

#11221

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications

Omkar Thawakar, Dmitry Demidov, Ritesh Thawkar et al.

ICCV 2025arXiv:2508.14039

citations

#11222

Balanced Sharpness-Aware Minimization for Imbalanced Regression

Yahao Liu, Qin Wang, Lixin Duan et al.

ICCV 2025arXiv:2508.16973

citations

#11223

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

Qihui Zhang, Munan Ning, Zheyuan Liu et al.

CVPR 2025arXiv:2503.14941

citations

#11224

FICGen: Frequency-Inspired Contextual Disentanglement for Layout-driven Degraded Image Generation

Wenzhuang Wang, Yifan Zhao, Mingcan Ma et al.

ICCV 2025arXiv:2509.01107

citations

#11225

GLEAM: Enhanced Transferable Adversarial Attacks for Vision-Language Pre-training Models via Global-Local Transformations

Yunqi Liu, Xiaohui Cui, Ouyang Xue

ICCV 2025

citations

#11226

Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation

Yong Liu, Song-Li Wu, Sule Bai et al.

ICCV 2025arXiv:2506.16058

citations

#11227

HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

JIAHE ZHAO, RuiBing Hou, zejie tian et al.

ICCV 2025arXiv:2503.12955

citations

#11228

Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining

Qi Fan, Kaiqi Liu, Nian Liu et al.

ICCV 2025arXiv:2504.21414

citations

#11229

DOGR: Towards Versatile Visual Document Grounding and Referring

Yinan Zhou, Yuxin Chen, Haokun Lin et al.

ICCV 2025arXiv:2411.17125

citations

#11230

ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition

Sanjoy Kundu, Shanmukha Vellamcheti, Sathyanarayanan Aakur

ICCV 2025arXiv:2504.03948

citations

#11231

Disentangling Instance and Scene Contexts for 3D Semantic Scene Completion

Enyu Liu, En Yu, Sijia Chen et al.

ICCV 2025arXiv:2507.08555

citations

#11232

Adaptive Routing of Text-to-Image Generation Requests Between Large Cloud Model and Light-Weight Edge Model

Zewei Xin, Qinya Li, Chaoyue Niu et al.

ICCV 2025arXiv:2411.13787

citations

#11233

AnyPortal: Zero-Shot Consistent Video Background Replacement

Wenshuo Gao, Xicheng Lan, Shuai Yang

ICCV 2025arXiv:2509.07472

citations

#11234

LC-Mamba: Local and Continuous Mamba with Shifted Windows for Frame Interpolation

Min Wu Jeong, Chae Eun Rhee

CVPR 2025

citations

#11235

SciVid: Cross-Domain Evaluation of Video Models in Scientific Applications

Yana Hasson, Pauline Luc, Liliane Momeni et al.

ICCV 2025arXiv:2507.03578

citations

#11236

VSC: Visual Search Compositional Text-to-Image Diffusion Model

Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.

ICCV 2025arXiv:2505.01104

citations

#11237

Counting Stacked Objects

Corentin Dumery, Noa Ette, Aoxiang Fan et al.

ICCV 2025arXiv:2411.19149

citations

#11238

Bi-Level Optimization for Self-Supervised AI-Generated Face Detection

Mian Zou, Nan Zhong, Baosheng Yu et al.

ICCV 2025highlightarXiv:2507.22824

citations

#11239

DCT-Shield: A Robust Frequency Domain Defense against Malicious Image Editing

Aniruddha Bala, Rohit Chowdhury, Rohan Jaiswal et al.

ICCV 2025highlightarXiv:2504.17894

citations

#11240

MagShield: Towards Better Robustness in Sparse Inertial Motion Capture Under Magnetic Disturbances

Yunzhe Shao, Xinyu Yi, Lu Yin et al.

ICCV 2025arXiv:2506.22907

citations

#11241

Towards Cross-modal Backward-compatible Representation Learning for Vision-Language Models

Young Kyun Jang, Ser-Nam Lim

ICCV 2025arXiv:2405.14715

citations

#11242

LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing

Achint Soni, Meet Soni, Sirisha Rambhatla

ICCV 2025arXiv:2503.21541

citations

#11243

From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning

Yuhui Zeng, Haoxiang Wu, Wenjie Nie et al.

ICCV 2025arXiv:2502.05843

citations

#11244

CarGait: Cross-Attention based Re-ranking for Gait recognition

Gavriel Habib, Noa Barzilay, Or Shimshi et al.

ICCV 2025arXiv:2503.03501

citations

#11245

PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior

Seunggwan Lee, Hwanhee Jung, ByoungSoo Koh et al.

ICCV 2025arXiv:2503.12834

citations

#11246

Preacher: Paper-to-Video Agentic System

Jingwei Liu, Ling Yang, Hao Luo et al.

ICCV 2025arXiv:2508.09632

citations

#11247

MR-FIQA: Face Image Quality Assessment with Multi-Reference Representations from Synthetic Data Generation

Fu-Zhao Ou, Chongyi Li, Shiqi Wang et al.

ICCV 2025

citations

#11248

Gait-X: Exploring X modality for Generalized Gait Recognition

Zengbin Wang, Saihui Hou, Junjie Li et al.

ICCV 2025

citations

#11249

Adaptive Articulated Object Manipulation On The Fly with Foundation Model Reasoning and Part Grounding

Xiaojie Zhang, Yuanfei Wang, Ruihai Wu et al.

ICCV 2025arXiv:2507.18276

citations

#11250

Do Your Best and Get Enough Rest for Continual Learning

Hankyul Kang, Gregor Seifer, Donghyun Lee et al.

CVPR 2025arXiv:2503.18371

citations

#11251

Robust 3D-Masked Part-level Editing in 3D Gaussian Splatting with Regularized Score Distillation Sampling

Hayeon Kim, Ji Ha Jang, Se Young Chun

ICCV 2025arXiv:2507.11061

citations

#11252

VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data

Jian Shi, Peter Wonka

ICCV 2025arXiv:2312.08871

citations

#11253

Seeing the Trees for the Forest: Rethinking Weakly-Supervised Medical Visual Grounding

Huy Ta, Duy Anh Huynh, Yutong Xie et al.

ICCV 2025highlightarXiv:2505.15123

citations

#11254

LoRAverse: A Submodular Framework to Retrieve Diverse Adapters for Diffusion Models

Mert Sonmezer, Matthew Zheng, Pinar Yanardag

ICCV 2025arXiv:2510.15022

citations

#11255

Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation

Guanyi Qin, Ziyue Wang, Daiyun Shen et al.

ICCV 2025highlightarXiv:2507.18944

citations

#11256

Autoregressive Denoising Score Matching is a Good Video Anomaly Detector

hanwen Zhang, Congqi Cao, Qinyi Lv et al.

ICCV 2025arXiv:2506.23282

citations

#11257

SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior

Bo Zhao, Haoran Wang, Jinghui Wang et al.

ICCV 2025highlightarXiv:2510.15749

citations

#11258

RobustSplat: Decoupling Densification and Dynamics for Transient-Free 3DGS

Chuanyu Fu, Yuqi Zhang, Kunbin Yao et al.

ICCV 2025arXiv:2506.02751

citations

#11259

Denoising Token Prediction in Masked Autoregressive Models

Ting Yao, Yehao Li, Yingwei Pan et al.

ICCV 2025

citations

#11260

Consistency Trajectory Matching for One-Step Generative Super-Resolution

Weiyi You, Mingyang Zhang, Leheng Zhang et al.

ICCV 2025arXiv:2503.20349

citations

#11261

LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping

Pascal Chang, Sergio Sancho, Jingwei Tang et al.

CVPR 2025arXiv:2504.08902

citations

#11262

Supercharged One-step Text-to-Image Diffusion Models with Negative Prompts

Viet Nguyen, Anh Nguyen, Trung Dao et al.

ICCV 2025arXiv:2412.02687

citations

#11263

CVPT: Cross Visual Prompt Tuning

Lingyun Huang, Jianxu Mao, Junfei YI et al.

ICCV 2025arXiv:2408.14961

citations

#11264

Addressing Text Embedding Leakage in Diffusion-based Image Editing

Sunung Mun, Jinhwan Nam, Sunghyun Cho et al.

ICCV 2025arXiv:2412.04715

citations

#11265

InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis

Tao Han, Wanghan Xu, Junchao Gong et al.

ICCV 2025arXiv:2509.10441

citations

#11266

Efficient Multi-Person Motion Prediction by Lightweight Spatial and Temporal Interactions

Yuanhong Zheng, Ruixuan Yu, Jian Sun

ICCV 2025arXiv:2507.09446

citations

#11267

FedMVP: Federated Multimodal Visual Prompt Tuning for Vision-Language Models

Mainak Singha, Subhankar Roy, Sarthak Mehrotra et al.

ICCV 2025arXiv:2504.20860

citations

#11268

CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models

Junho Kim, Hyungjin Chung, Byung-Hoon Kim

ICCV 2025arXiv:2411.06869

citations

#11269

Subjective Camera 1.0: Bridging Human Cognition and Visual Reconstruction through Sequence-Aware Sketch-Guided Diffusion

Haoyang Chen, Dongfang Sun, Caoyuan Ma et al.

ICCV 2025arXiv:2506.23711

citations

#11270

Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions

Liang Xu, Chengqun Yang, Zili Lin et al.

ICCV 2025arXiv:2508.04681

citations

#11271

ViLU: Learning Vision-Language Uncertainties for Failure Prediction

Marc Lafon, Yannis Karmim, Julio Silva-Rodríguez et al.

ICCV 2025arXiv:2507.07620

citations

#11272

High-Fidelity Lightweight Mesh Reconstruction from Point Clouds

Chen Zhang, Wentao Wang, Ximeng Li et al.

CVPR 2025highlight

citations

#11273

HouseTour: A Virtual Real Estate A(I)gent

Ata Çelen, Iro Armeni, Daniel Barath et al.

ICCV 2025arXiv:2510.18054

citations

#11274

Dynamic-DINO: Fine-Grained Mixture of Experts Tuning for Real-time Open-Vocabulary Object Detection

Yehao Lu, Minghe Weng, Zekang Xiao et al.

ICCV 2025arXiv:2507.17436

citations

#11275

SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting

Zihui Gao, Jia-Wang Bian, Guosheng Lin et al.

ICCV 2025arXiv:2507.15602

citations

#11276

Progressive Growing of Video Tokenizers for Temporally Compact Latent Spaces

Aniruddha Mahapatra, Long Mai, David Bourgin et al.

ICCV 2025arXiv:2501.05442

citations

#11277

Enhancing Transformers Through Conditioned Embedded Tokens

Hemanth Saratchandran, Simon Lucey

ICCV 2025arXiv:2505.12789

citations

#11278

Joint Asymmetric Loss for Learning with Noisy Labels

Jialiang Wang, Xianming Liu, Xiong Zhou et al.

ICCV 2025arXiv:2507.17692

citations

#11279

Trade-offs in Image Generation: How Do Different Dimensions Interact?

Sicheng Zhang, Binzhu Xie, Zhonghao Yan et al.

ICCV 2025arXiv:2507.22100

citations

#11280

LOTA: Bit-Planes Guided AI-Generated Image Detection

Renxi Cheng, Hongsong Wang, Yang Zhang et al.

ICCV 2025arXiv:2510.14230

citations

#11281

Minimal Variance Model Aggregation: A principled, non-intrusive, and versatile integration of black box models

Theo Bourdais, Houman Owhadi

ICLR 2025arXiv:2409.17267

citations

#11282

DNF-Intrinsic: Deterministic Noise-Free Diffusion for Indoor Inverse Rendering

Rongjia Zheng, Qing Zhang, Chengjiang Long et al.

ICCV 2025arXiv:2507.03924

citations

#11283

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

Hyungjin Kim, Seokho Ahn, Young-Duk Seo

ICCV 2025arXiv:2508.03481

citations

#11284

Using Powerful Prior Knowledge of Diffusion Model in Deep Unfolding Networks for Image Compressive Sensing

Chen Liao, Yan Shen, Dan Li et al.

CVPR 2025arXiv:2503.08429

citations

#11285

Stylized-Face: A Million-level Stylized Face Dataset for Face Recognition

Zhengyuan Peng, Jianqing Xu, Yuge Huang et al.

ICCV 2025

citations

#11286

MoSiC: Optimal-Transport Motion Trajectory for Dense Self-Supervised Learning

Mohammadreza Salehi, Shashanka Venkataramanan, Ioana Simion et al.

ICCV 2025arXiv:2506.08694

citations

#11287

Holistic Tokenizer for Autoregressive Image Generation

Anlin Zheng, Haochen Wang, Yucheng Zhao et al.

ICCV 2025arXiv:2507.02358

citations

#11288

PLA: Prompt Learning Attack against Text-to-Image Generative Models

XINQI LYU, Yihao LIU, Yanjie Li et al.

ICCV 2025arXiv:2508.03696

citations

#11289

MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective

Weitian Wang, Shubham rai, Cecilia De la Parra et al.

ICCV 2025arXiv:2507.19131

citations

#11290

CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation

Xiao Lin, Yun Peng, Liuyi Wang et al.

ICCV 2025arXiv:2502.01312

citations

#11291

LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering

Xiaohang Zhan, Dingming Liu

ICCV 2025arXiv:2508.07647

citations

#11292

Generalized Few-Shot Point Cloud Segmentation via LLM-Assisted Hyper-Relation Matching

Zhaoyang Li, Yuan Wang, Guoxin Xiong et al.

ICCV 2025

citations

#11293

MIRACLE 3D: Memory-efficient Integrated Robust Approach for Continual Learning on 3D Point Clouds via Shape Model Construction

Hossein Resani, Behrooz Nasihatkon

ICLR 2025

citations

#11294

Monocular Facial Appearance Capture in the Wild

Yingyan Xu, Kate Gadola, Prashanth Chandran et al.

ICCV 2025arXiv:2412.12765

citations

#11295

AMD: Adaptive Momentum and Decoupled Contrastive Learning Framework for Robust Long-Tail Trajectory Prediction

Bin Rao, Haicheng Liao, Yanchen Guan et al.

ICCV 2025arXiv:2507.01801

citations

#11296

Deep Incomplete Multi-view Clustering with Distribution Dual-Consistency Recovery Guidance

Jiaqi Jin, Siwei Wang, Zhibin Dong et al.

ICCV 2025arXiv:2503.11017

citations

#11297

Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval

Ziwei Wang, Sameera Ramasinghe, Chenchen Xu et al.

ICCV 2025arXiv:2411.17490

citations

#11298

D2ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition

Wenjie Pei, Qizhong Tan, Guangming Lu et al.

ICCV 2025

citations

#11299

FreeUV: Ground-Truth-Free Realistic Facial UV Texture Recovery via Cross-Assembly Inference Strategy

Xingchao Yang, Takafumi Taketomi, Yuki Endo et al.

CVPR 2025arXiv:2503.17197

citations

#11300

DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models

Revant Teotia, Candace Ross, Karen Ullrich et al.

ICCV 2025arXiv:2506.05108

citations

#11301

DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing

Yang JingYi, Xun Lin, Zitong YU et al.

ICCV 2025arXiv:2503.00429

citations

#11302

Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization

Bingqing Zhang, Zhuo Cao, Heming Du et al.

ICCV 2025arXiv:2507.15504

citations

#11303

Timestep-Aware Diffusion Model for Extreme Image Rescaling

Ce Wang, Zhenyu Hu, Wanjie Sun et al.

ICCV 2025arXiv:2408.09151

citations

#11304

Adversarial Attention Perturbations for Large Object Detection Transformers

Zachary Yahn, Selim Tekin, Fatih Ilhan et al.

ICCV 2025arXiv:2508.02987

citations

#11305

HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image

Junyi Guo, Jingxuan Zhang, Fangyu Wu et al.

ICCV 2025arXiv:2505.23186

citations

#11306

What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models

Lorenzo Baraldi, Davide Bucciarelli, Federico Betti et al.

ICCV 2025arXiv:2505.20405

citations

#11307

DALIP: Distribution Alignment-based Language-Image Pre-Training for Domain-Specific Data

Junjie Wu, Jiangtao Xie, Zhaolin Zhang et al.

ICCV 2025arXiv:2504.01386

citations

#11308

Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation

Siyu Chen, Ting Han, Changshe Zhang et al.

ICCV 2025arXiv:2504.12753

citations

#11309

Aligning Constraint Generation with Design Intent in Parametric CAD

Evan Casey, Tianyu Zhang, Shu Ishida et al.

ICCV 2025arXiv:2504.13178

citations

#11310

CCL-LGS: Contrastive Codebook Learning for 3D Language Gaussian Splatting

Lei Tian, Xiaomin Li, Liqian Ma et al.

ICCV 2025arXiv:2505.20469

citations

#11311

SAGI: Semantically Aligned and Uncertainty Guided AI Image Inpainting

Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos et al.

ICCV 2025arXiv:2502.06593

citations

#11312

MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion

Fei Peng, Junqiang Wu, Yan Li et al.

ICCV 2025arXiv:2508.14440

citations

#11313

StyleKeeper: Prevent Content Leakage using Negative Visual Query Guidance

Jaeseok Jeong, Junho Kim, Youngjung Uh et al.

ICCV 2025arXiv:2510.06827

citations

#11314

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

Zeyi Sun, Tong Wu, Pan Zhang et al.

ICCV 2025arXiv:2406.00093

citations

#11315

MoGA: 3D Generative Avatar Prior for Monocular Gaussian Avatar Reconstruction

Zijian Dong, Longteng Duan, Jie Song et al.

ICCV 2025highlightarXiv:2507.23597

citations

#11316

MotionShot: Adaptive Motion Transfer across Arbitrary Objects for Text-to-Video Generation

Yanchen Liu, Yanan SUN, Zhening Xing et al.

ICCV 2025arXiv:2507.16310

citations

#11317

GroundFlow: A Plug-in Module for Temporal Reasoning on 3D Point Cloud Sequential Grounding

Zijun Lin, Shuting He, Cheston Tan et al.

ICCV 2025arXiv:2506.21188

citations

#11318

SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation

Shiqi Huang, Shuting He, Huaiyuan Qin et al.

ICCV 2025highlightarXiv:2507.12857

citations

#11319

RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions

Bimsara Pathiraja, Maitreya Patel, Shivam Singh et al.

ICCV 2025arXiv:2506.03448

citations

#11320

InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild

Yiyi Ma, Yuanzhi Liang, Xiu Li et al.

ICCV 2025arXiv:2508.10297

citations

#11321

SPADE: Spatial-Aware Denoising Network for Open-vocabulary Panoptic Scene Graph Generation with Long- and Local-range Context Reasoning

XIN Hu, Ke Qin, Guiduo Duan et al.

ICCV 2025arXiv:2507.05798

citations

#11322

Towards a 3D Transfer-based Black-box Attack via Critical Feature Guidance

Shuchao Pang, Zhenghan Chen, Shen Zhang et al.

ICCV 2025arXiv:2508.15650

citations

#11323

DepthSync: Diffusion Guidance-Based Depth Synchronization for Scale- and Geometry-Consistent Video Depth Estimation

Yue-Jiang Dong, Wang Zhao, Jiale Xu et al.

ICCV 2025arXiv:2507.01603

citations

#11324

PathDiff: Histopathology Image Synthesis with Unpaired Text and Mask Conditions

Mahesh Bhosale, Abdul Wasi, Yuanhao Zhai et al.

ICCV 2025arXiv:2506.23440

citations

#11325

SDMatte: Grafting Diffusion Models for Interactive Matting

Longfei Huang, Yu Liang, Hao Zhang et al.

ICCV 2025arXiv:2508.00443

citations

#11326

Retinex-MEF: Retinex-based Glare Effects Aware Unsupervised Multi-Exposure Image Fusion

Haowen Bai, Jiangshe Zhang, Zixiang Zhao et al.

ICCV 2025arXiv:2503.07235

citations

#11327

When Schrödinger Bridge Meets Real-World Image Dehazing with Unpaired Training

Yunwei Lan, Zhigao Cui, Xin Luo et al.

ICCV 2025arXiv:2507.09524

citations

#11328

Hawaii: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models

Yimu Wang, Mozhgan Nasr Azadani, Sean Sedwards et al.

NEURIPS 2025arXiv:2506.19072

citations

#11329

When Kernels Multiply, Clusters Unify: Fusing Embeddings with the Kronecker Product

Youqi WU, Jingwei Zhang, Farzan Farnia

NEURIPS 2025arXiv:2506.08645

citations

#11330

Limitations of Normalization in Attention

Timur Mudarisov, Mikhail Burtsev, Tatiana Petrova et al.

NEURIPS 2025arXiv:2508.17821

citations

#11331

MatchDiffusion: Training-free Generation of Match-Cuts

Alejandro Pardo, Fabio Pizzati, Tong Zhang et al.

ICCV 2025arXiv:2411.18677

citations

#11332

Information Theoretic Learning for Diffusion Models with Warm Start

Yirong Shen, Lu GAN, Cong Ling

NEURIPS 2025arXiv:2510.20903

citations

#11333

Resource-Constrained Federated Continual Learning: What Does Matter?

Yichen Li, Yuying Wang, Jiahua Dong et al.

NEURIPS 2025arXiv:2501.08737

citations

#11334

V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation

Hanyue Lou, Jinxiu Liang, Minggui Teng et al.

NEURIPS 2025oralarXiv:2505.16797

citations

#11335

IntroStyle: Training-Free Introspective Style Attribution using Diffusion Features

Anand Kumar, Jiteng Mu, Nuno Vasconcelos

ICCV 2025arXiv:2412.14432

citations

#11336

AffordBot: 3D Fine-grained Embodied Reasoning via Multimodal Large Language Models

Xinyi Wang, Xun Yang, Yanlong Xu et al.

NEURIPS 2025arXiv:2511.10017

citations

#11337

Enhancing Sample Selection Against Label Noise by Cutting Mislabeled Easy Examples

Suqin Yuan, Lei Feng, Bo Han et al.

NEURIPS 2025arXiv:2502.08227

citations

#11338

NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding

Wei Xu, Cheng Wang, Dingkang Liang et al.

NEURIPS 2025arXiv:2510.27481

citations

#11339

Physics-informed Reduced Order Modeling of Time-dependent PDEs via Differentiable Solvers

Nima Hosseini Dashtbayaz, Hesam Salehipour, Adrian Butscher et al.

NEURIPS 2025arXiv:2505.14595

citations

#11340

LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables

Xunpeng Yi, yibing zhang, Xinyu Xiang et al.

ICCV 2025arXiv:2509.00346

citations

#11341

The Promise of RL for Autoregressive Image Editing

Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.

NEURIPS 2025arXiv:2508.01119

citations

#11342

Proximalized Preference Optimization for Diverse Feedback Types: A Decomposed Perspective on DPO

Kaiyang Guo, Yinchuan Li, Zhitang Chen

NEURIPS 2025arXiv:2505.23316

citations

#11343

PT-MoE: An Efficient Finetuning Framework for Integrating Mixture-of-Experts into Prompt Tuning

Zongqian Li, Yixuan Su, Nigel Collier

NEURIPS 2025arXiv:2505.09519

citations

#11344

Gradient Variance Reveals Failure Modes in Flow-Based Generative Models

Teodora Reu, Sixtine Dromigny, Michael Bronstein et al.

NEURIPS 2025spotlightarXiv:2510.18118

citations

#11345

Attack by Yourself: Effective and Unnoticeable Multi-Category Graph Backdoor Attacks with Subgraph Triggers Pool

Jiangtong Li, Dongyi Liu, Kun Zhu et al.

NEURIPS 2025arXiv:2412.17213

citations

#11346

SPARKE: Scalable Prompt-Aware Diversity and Novelty Guidance in Diffusion Models via RKE Score

Mohammad Jalali, Haoyu Lei, Amin Gohari et al.

NEURIPS 2025arXiv:2506.10173

citations

#11347

Reward-Aware Proto-Representations in Reinforcement Learning

Hon Tik Tse, Siddarth Chandrasekar, Marlos C. Machado

NEURIPS 2025oralarXiv:2505.16217

citations

#11348

Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling

LI XIAOJIE, Ronghui Li, Shukai Fang et al.

ICCV 2025arXiv:2507.14915

citations

#11349

TRIDENT: Tri-Modal Molecular Representation Learning with Taxonomic Annotations and Local Correspondence

Feng Jiang, Mangal Prakash, Hehuan Ma et al.

NEURIPS 2025spotlightarXiv:2506.21028

citations

#11350

Towards Self-Refinement of Vision-Language Models with Triangular Consistency

Yunlong Deng, Guangyi Chen, Tianpei Gu et al.

NEURIPS 2025arXiv:2510.10487

citations

#11351

A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking

Gal Fadlon, Idan Arbiv, Nimrod Berman et al.

NEURIPS 2025arXiv:2510.06699

citations

#11352

Towards Explicit Exoskeleton for the Reconstruction of Complicated 3D Human Avatars

Yifan Zhan, Qingtian Zhu, Muyao Niu et al.

ICCV 2025arXiv:2410.08082

citations

#11353

On-Device Diffusion Transformer Policy for Efficient Robot Manipulation

Yiming Wu, Huan Wang, Zhenghao Chen et al.

ICCV 2025arXiv:2508.00697

citations

#11354

Metric Convolutions: A Unifying Theory to Adaptive Image Convolutions

Thomas Dagès, Michael Lindenbaum, Alfred Bruckstein

ICCV 2025arXiv:2406.05400

citations

#11355

A duality framework for analyzing random feature and two-layer neural networks

Hongrui Chen, Jihao Long, Lei Wu

NEURIPS 2025arXiv:2305.05642

citations

#11356

Improved Learning Theory for Kernel Distribution Regression with Two-Stage Sampling

Alberto González-Sanz, François Bachoc, Jean-Michel Loubes et al.

NEURIPS 2025arXiv:2308.14335

citations

#11357

AI Testing Should Account for Sophisticated Strategic Behaviour

Vojta Kovarik, Eric Chen, Sami Petersen et al.

NEURIPS 2025arXiv:2508.14927

citations

#11358

Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor

Alexandra Olteanu, Su Lin Blodgett, Agathe Balayn et al.

NEURIPS 2025arXiv:2506.14652

citations

#11359

VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction

Martin de La Gorce, Charlie Hewitt, Tibor Takács et al.

ICCV 2025arXiv:2507.21311

citations

#11360

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

Ziheng Cheng, Yixiao Huang, Hui Xu et al.

NEURIPS 2025arXiv:2505.21347

citations

#11361

Struct-Bench: A Benchmark for Differentially Private Structured Text Generation

Shuaiqi Wang, Vikas Raunak, Arturs Backurs et al.

NEURIPS 2025arXiv:2509.10696

citations

#11362

Factorio Learning Environment

Jack Hopkins, Mart Bakler, Akbir Khan

NEURIPS 2025arXiv:2503.09617

citations

#11363

Dense Backpropagation Improves Training for Sparse Mixture-of-Experts

Ashwinee Panda, Vatsal Baherwani, Zain Sarwar et al.

NEURIPS 2025arXiv:2504.12463

citations

#11364

Do You Really Need Public Data? Surrogate Public Data for Differential Privacy on Tabular Data

Shlomi Hod, Lucas Rosenblatt, Julia Stoyanovich

NEURIPS 2025arXiv:2504.14368

citations

#11365

Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition

Pulkit Kumar, Shuaiyi Huang, Matthew Walmer et al.

ICCV 2025arXiv:2508.03695

citations

#11366

DisenQ: Disentangling Q-Former for Activity-Biometrics

Shehreen Azad, Yogesh Rawat

ICCV 2025highlightarXiv:2507.07262

citations

#11367

QUT-DV25: A Dataset for Dynamic Analysis of Next-Gen Software Supply Chain Attacks

Sk Tanzir Mehedi, Raja Jurdak, Chadni Islam et al.

NEURIPS 2025arXiv:2505.13804

citations

#11368

NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval

Zengrong Lin, Zheng Wang, Tianwen Qian et al.

CVPR 2025arXiv:2503.10526

citations

#11369

SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing

Heyi Sun, Cong Wang, Tian-Xing Xu et al.

ICCV 2025arXiv:2508.09597

citations

#11370

ChartCap: Mitigating Hallucination of Dense Chart Captioning

Junyoung Lim, Jaewoo Ahn, Gunhee Kim

ICCV 2025highlightarXiv:2508.03164

citations

#11371

AgMMU: A Comprehensive Agricultural Multimodal Understanding Benchmark

Aruna Gauba, Irene Pi, Yunze Man et al.

NEURIPS 2025arXiv:2504.10568

citations

#11372

MMIG-Bench: Towards Comprehensive and Explainable Evaluation of Multi-Modal Image Generation Models

Hang Hua, Ziyun Zeng, Yizhi Song et al.

NEURIPS 2025arXiv:2505.19415

citations

#11373

Clean First, Align Later: Benchmarking Preference Data Cleaning for Reliable LLM Alignment

Samuel (Min-Hsuan) Yeh, Sharon Li

NEURIPS 2025arXiv:2509.23564

citations

#11374

Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations

Tal Barami, Nimrod Berman, Ilan Naiman et al.

NEURIPS 2025arXiv:2510.17313

citations

#11375

CosmoBench: A Multiscale, Multiview, Multitask Cosmology Benchmark for Geometric Deep Learning

Teresa Huang, Richard Stiskalek, Jun-Young Lee et al.

NEURIPS 2025arXiv:2507.03707

citations

#11376

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research

A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen et al.

NEURIPS 2025oralarXiv:2412.06966

citations

#11377

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

Tianhao Peng, Haochen Wang, Yuanxing Zhang et al.

NEURIPS 2025arXiv:2511.07250

citations

#11378

IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A

Chen Li, Chinthani Sugandhika, Ee Yeo Keat et al.

ICCV 2025arXiv:2508.01984

citations

#11379

Blind2Sound: Self-Supervised Image Denoising without Residual Noise

Jiazheng Liu, Zejin Wang, Bohao Chen et al.

ICCV 2025arXiv:2303.05183

citations

#11380

Deferring Concept Bottleneck Models: Learning to Defer Interventions to Inaccurate Experts

Andrea Pugnana, Riccardo Massidda, Francesco Giannini et al.

NEURIPS 2025arXiv:2503.16199

citations

#11381

EngiBench: A Framework for Data-Driven Engineering Design Research

Florian Felten, Gabriel Apaza, Gerhard Bräunlich et al.

NEURIPS 2025arXiv:2508.00831

citations

#11382

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Boyang Deng, Kyle Genova, Songyou Peng et al.

ICCV 2025highlightarXiv:2504.08727

citations

#11383

MTBBench: A Multimodal Sequential Clinical Decision-Making Benchmark in Oncology

Kiril Vasilev, Alexandre Misrahi, Eeshaan Jain et al.

NEURIPS 2025arXiv:2511.20490

citations

#11384

BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks

Anna Sokol, Elizabeth Daly, Michael Hind et al.

NEURIPS 2025arXiv:2410.12974

citations

#11385

Bridging the Skeleton-Text Modality Gap: Diffusion-Powered Modality Alignment for Zero-shot Skeleton-based Action Recognition

Jeonghyeok Do, Munchurl Kim

ICCV 2025arXiv:2411.10745

citations

#11386

PERSONA: Personalized Whole-Body 3D Avatar with Pose-Driven Deformations from a Single Image

Geonhee Sim, Gyeongsik Moon

ICCV 2025arXiv:2508.09973

citations

#11387

LIFEBENCH: Evaluating Length Instruction Following in Large Language Models

Wei Zhang, Zhenhong Zhou, Kun Wang et al.

NEURIPS 2025arXiv:2505.16234

citations

#11388

ExAct: A Video-Language Benchmark for Expert Action Analysis

Han Yi, Yulu Pan, Feihong He et al.

NEURIPS 2025arXiv:2506.06277

citations

#11389

BackdoorDM: A Comprehensive Benchmark for Backdoor Learning on Diffusion Model

Weilin Lin, Nanjun Zhou, Yanyun Wang et al.

NEURIPS 2025arXiv:2502.11798

citations

#11390

Measuring Fingerprints of Web-filtered Text Datasets and Fingerprint Propagation Through Training

Youssef Mansour, Reinhard Heckel

NEURIPS 2025spotlightarXiv:2412.02857

citations

#11391

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Valerii Startsev, Alexander Ustyuzhanin, Alexey Kirillov et al.

NEURIPS 2025arXiv:2505.19297

citations

#11392

Synchronization of Multiple Videos

Avihai Naaman, Ron Shapira Weber, Oren Freifeld

ICCV 2025arXiv:2510.14051

citations

#11393

AstroVisBench: A Code Benchmark for Scientific Computing and Visualization in Astronomy

Sebastian Joseph, Syed M. Husain, Stella Offner et al.

NEURIPS 2025arXiv:2505.20538

citations

#11394

What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning

Chi-Hsi Kung, Frangil Ramirez, Juhyung Ha et al.

ICCV 2025arXiv:2503.21055

citations

#11395

PhysGym: Benchmarking LLMs in Interactive Physics Discovery with Controlled Priors

Yimeng Chen, Piotr Piękos, Mateusz Ostaszewski et al.

NEURIPS 2025arXiv:2507.15550

citations

#11396

The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

Bingchen Zhao, Despoina Magka, Minqi Jiang et al.

NEURIPS 2025arXiv:2506.22419

citations

#11397

DeceptionBench: A Comprehensive Benchmark for AI Deception Behaviors in Real-world Scenarios

Yao Huang, Yitong Sun, Yichi Zhang et al.

NEURIPS 2025oralarXiv:2510.15501

citations

#11398

A Practical Guide for Incorporating Symmetry in Diffusion Policy

Dian Wang, Boce Hu, Shuran Song et al.

NEURIPS 2025arXiv:2505.13431

citations

#11399

SVRPBench: A Realistic Benchmark for Stochastic Vehicle Routing Problem

Ahmed Heakl, Yahia Salaheldin Shaaban, Salem Lahlou et al.

NEURIPS 2025arXiv:2505.21887

citations

#11400

SmallKV: Small Model Assisted Compensation of KV Cache Compression for Efficient LLM Inference

Yi Zhao, Yajuan Peng, Nguyen Cam-Tu et al.

NEURIPS 2025spotlightarXiv:2508.02751

citations
