2025 Highlight Papers

Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction

Edgar Sucar, Zihang Lai, Eldar Insafutdinov et al.

DynFaceRestore: Balancing Fidelity and Quality in Diffusion-Guided Blind Face Restoration with Dynamic Blur-Level Mapping and Guidance

Huu Phu Do, Yu-Wei Chen, Yi-Cheng Liao et al.

Dyn-HaMR: Recovering 4D Interacting Hand Motion from a Dynamic Camera

Zhengdi Yu, Stefanos Zafeiriou, Tolga Birdal

EBS-EKF: Accurate and High Frequency Event-based Star Tracking

Albert Reed, Connor Hashemi, Dennis Melamed et al.

Edit360: 2D Image Edits to 3D Assets from Any Angle

Junchao Huang, Xinting Hu, Shaoshuai Shi et al.

EDM: Efficient Deep Feature Matching

Xi Li, Tong Rao, Cihui Pan

Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation

Shengfang ZHAI, Jiajun Li, Yue Liu et al.

Efficient Motion-Aware Video MLLM

Zijia Zhao, Yuqi Huo, Tongtian Yue et al.

EffiDec3D: An Optimized Decoder for High-Performance and Efficient 3D Medical Image Segmentation

Md Mostafijur Rahman, Radu Marculescu

EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision

Yiming Zhao, Taein Kwon, Paul Streli et al.

Electromyography-Informed Facial Expression Reconstruction for Physiological-Based Synthesis and Analysis

Tim Büchner, Christoph Anders, Orlando Guntinas-Lichius et al.

Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions

Tommaso Galliena, Tommaso Apicella, Stefano Rosa et al.

Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding

Yue Fan, Xiaojian Ma, Rongpeng Su et al.

EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing

Gaoxiang Cong, Jiadong Pan, Liang Li et al.

Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility

Yidi Li, Jun Xiao, Zhengda Lu et al.

Emulating Self-attention with Convolution for Efficient Image Super-Resolution

Dongheon Lee, Seokju Yun, Youngmin Ro

End-to-End HOI Reconstruction Transformer with Graph-based Encoding

Zhenrong Wang, Qi Zheng, Sihan Ma et al.

Enduring, Efficient and Robust Trajectory Prediction Attack in Autonomous Driving via Optimization-Driven Multi-Frame Perturbation Framework

Yi Yu, Weizhen Han, Libing Wu et al.

EnergyMoGen: Compositional Human Motion Generation with Energy-Based Diffusion Model in Latent Space

Jianrong Zhang, Hehe Fan, Yi Yang

Enhanced Visual-Semantic Interaction with Tailored Prompts for Pedestrian Attribute Recognition

Junyi Wu, Yan Huang, Min Gao et al.

Enrich and Detect: Video Temporal Grounding with Multimodal LLMs

Shraman Pramanick, Effrosyni Mavroudi, Yale Song et al.

Ensemble Foreground Management for Unsupervised Object Discovery

Ziling Wu, Armaghan Moemeni, Praminda Caleb-Solly

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

ESC: Erasing Space Concept for Knowledge Deletion

Tae-Young Lee, Sundong Park, Minwoo Jeon et al.

ICCV 2025highlightarXiv:2508.10896

ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning

Jongseo Lee, Kyungho Bae, Kyle Min et al.

Estimating Body and Hand Motion in an Ego‑sensed World

Brent Yi, Vickie Ye, Maya Zheng et al.

ETAP: Event-based Tracking of Any Point

Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto et al.

ICCV 2025highlightarXiv:2503.10624

ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness

Boqian Li, Zeyu Cai, Michael Black et al.

Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras

Hoonhee Cho, Jae-Young Kang, Youngho Kim et al.

Evading Data Provenance in Deep Neural Networks

Hongyu Zhu, Sichu Liang, Wenwen Wang et al.

EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events

Shuoyan Wei, Feng Li, Shengeng Tang et al.

Event Ellipsometer: Event-based Mueller-Matrix Video Imaging

Ryota Maeda, Yunseong Moon, Seung-Hwan Baek

CVPR 2025highlightarXiv:2412.06191

Event Fields: Capturing Light Fields at High Speed, Resolution, and Dynamic Range

Ziyuan Qu, Zihao Zou, Vivek Boominathan et al.

EventPSR: Surface Normal and Reflectance Estimation from Photometric Stereo Using an Event Camera

Bohan Yu, Jin Han, Boxin Shi et al.

EventUPS: Uncalibrated Photometric Stereo Using an Event Camera

Jinxiu Liang, Bohan Yu, Siqi Yang et al.

ICCV 2025highlightarXiv:2502.06788

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

Haiwen Diao, Xiaotong Li, Yufeng Cui et al.

Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation

Hao Zhu, Yan Zhu, Jiayu Xiao et al.

Explaining Human Preferences via Metrics for Structured 3D Reconstruction

Jack Langerman, Denis Rozumny, Yuzhong Huang et al.

Exploring View Consistency for Scene-Adaptive Low-Light Light Field Image Enhancement

Shuo Zhang, Chen Gao, Youfang Lin

Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think

Zhenyi Lu, Xiaoye Qu, Zhenyi Lu et al.

F^3OCUS - Federated Finetuning of Vision-Language Foundation Models with Optimal Client Layer Updating Strategy via Multi-objective Meta-Heuristics

Pramit Saha, Felix Wagner, Divyanshu Mishra et al.

Fast Globally Optimal and Geometrically Consistent 3D Shape Matching

Paul Roetzer, Florian Bernard

F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration

Lu Liu, Huiyu Duan, Qiang Hu et al.

Feature Purification Matters: Suppressing Outlier Propagation for Training-Free Open-Vocabulary Semantic Segmentation

Shuo Jin, Siyue Yu, Bingfeng Zhang et al.

Few-shot Implicit Function Generation via Equivariance

Suizhi Huang, Xingyi Yang, Hongtao Lu et al.

Few-Shot Pattern Detection via Template Matching and Regression

Eunchan Jo, Dahyun Kang, Sanghyun Kim et al.

FIction: 4D Future Interaction Prediction from Video

Kumar Ashutosh, Georgios Pavlakos, Kristen Grauman

Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning

Bardia Safaei, Faizan Siddiqui, Jiacong Xu et al.

CVPR 2025highlightarXiv:2506.11543

FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation

Zhuguanyu Wu, Shihe Wang, Jiayi Zhang et al.