ICCV 2025 Highlight Papers
263 papers found • Page 1 of 6
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang, Hang Zhang, Xin Li et al.
AAA-Gaussians: Anti-Aliased and Artifact-Free 3D Gaussian Rendering
Michael Steiner, Thomas Köhler, Lukas Radl et al.
A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds
Jizong Peng, Tze Ho Elden Tse, Kai Xu et al.
Adaptive Dual Uncertainty Optimization: Boosting Monocular 3D Object Detection under Test-Time Shifts
Zixuan Hu, Dongxiao Li, Xinzhu Ma et al.
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?
Shouwei Ruan, Hanqing Liu, Yao Huang et al.
Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis
Yanzuo Lu, Yuxi Ren, Xin Xia et al.
AIM: Amending Inherent Interpretability via Self-Supervised Masking
Eyad Alshami, Shashank Agnihotri, Bernt Schiele et al.
A Linear N-Point Solver for Structure and Motion from Asynchronous Tracks
Hang Su, Yunlong Feng, Daniel Gehrig et al.
AnimalClue: Recognizing Animals by their Traces
Risa Shinoda, Nakamasa Inoue, Iro Laina et al.
A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions
Youliang Zhang, Ronghui Li, Yachao Zhang et al.
Asynchronous Event Error-Minimizing Noise for Safeguarding Event Dataset
Ruofei WANG, Peiqi Duan, Boxin Shi et al.
ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking
Xiaokun Feng, Shiyu Hu, Xuchen Li et al.
Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder
Wonwoong Cho, Yan-Ying Chen, Matthew Klenk et al.
A Unified Interpretation of Training-Time Out-of-Distribution Detection
Xu Cheng, Xin Jiang, Zechao Li
AutoOcc: Automatic Open-Ended Semantic Occupancy Annotation via Vision-Language Guided Gaussian Splatting
Xiaoyu Zhou, Jingqi Wang, Yongtao Wang et al.
Backdoor Mitigation by Distance-Driven Detoxification
Shaokui Wei, Jiayin Liu, Hongyuan Zha
Benchmarking Egocentric Visual-Inertial SLAM at City Scale
Anusha Krishnan, Shaohui Liu, Paul-Edouard Sarlin et al.
Beyond Losses Reweighting: Empowering Multi-Task Learning via the Generalization Perspective
Hoang Phan, Tung Lam Tran, Quyen Tran et al.
Beyond Spatial Frequency: Pixel-wise Temporal Frequency-based Deepfake Video Detection
Taehoon Kim, Jongwook Choi, Yonghyun Jeong et al.
Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search
Shuyu Yang, Yaxiong Wang, Li Zhu et al.
Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval
Dohwan Ko, Ji Soo Lee, Minhyuk Choi et al.
Bi-Level Optimization for Self-Supervised AI-Generated Face Detection
Mian Zou, Nan Zhong, Baosheng Yu et al.
Blended Point Cloud Diffusion for Localized Text-guided Shape Editing
Etai Sella, Noam Atia, Ron Mokady et al.
Bokehlicious: Photorealistic Bokeh Rendering with Controllable Apertures
Tim Seizinger, Florin-Alexandru Vasluianu, Marcos Conde et al.
Boosting Domain Generalized and Adaptive Detection with Diffusion Models: Fitness, Generalization, and Transferability
Boyong He, Yuxiang Ji, Zhuoyue Tan et al.
Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation
Jiaer Xia, Bingkui Tong, Yuhang Zang et al.
BridgeDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment
Tongfan Guan, Jiaxin Guo, Chen Wang et al.
BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes
Minkyun Seo, Hyungtae Lim, Kanghee Lee et al.
CaptionSmiths: Flexibly Controlling Language Pattern in Image Captioning
Kuniaki Saito, Donghyun Kim, Kwanyong Park et al.
CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance
Peiqi Chen, Lei Yu, Yi Wan et al.
ChartCap: Mitigating Hallucination of Dense Chart Captioning
Junyoung Lim, Jaewoo Ahn, Gunhee Kim
CHROME: Clothed Human Reconstruction with Occlusion-Resilience and Multiview-Consistency from a Single Image
Arindam Dutta, Meng Zheng, Zhongpai Gao et al.
CObL: Toward Zero-Shot Ordinal Layering without User Prompting
Aneel Damaraju, Dean Hazineh, Todd Zickler
CoMatch: Dynamic Covisibility-Aware Transformer for Bilateral Subpixel-Level Semi-Dense Image Matching
Zizhuo Li, Yifan Lu, Linfeng Tang et al.
Combinative Matching for Geometric Shape Assembly
Nahyuk Lee, Juhong Min, Junhong Lee et al.
Confound from All Sides, Distill with Resilience: Multi-Objective Adversarial Paths to Zero-Shot Robustness
Junhao Dong, Jiao Liu, Xinghua Qu et al.
Consensus-Driven Active Model Selection
Justin Kay, Grant Horn, Subhransu Maji et al.
Contact-Aware Amodal Completion for Human-Object Interaction via Multi-Regional Inpainting
Seunggeun Chi, Pin-Hao Huang, Enna Sachdeva et al.
Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation
Tuna Meral, Enis Simsar, Federico Tombari et al.
CoopTrack: Exploring End-to-End Learning for Efficient Cooperative Sequential Perception
Jiaru Zhong, Jiahao Wang, Jiahui Xu et al.
Corvid: Improving Multimodal Large Language Models Towards Chain-of-Thought Reasoning
Jingjing Jiang, Chao Ma, Xurui Song et al.
CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective
Zongheng Tang, Yi Liu, Yifan Sun et al.
CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos
Nikita Karaev, Iurii Makarov, Jianyuan Wang et al.
CounterPC: Counterfactual Feature Realignment for Unsupervised Domain Adaptation on Point Clouds
Feng Yang, Yichao Cao, Xiu Su et al.
CountSE: Soft Exemplar Open-set Object Counting
Shuai Liu, Peng Zhang, Shiwei Zhang et al.
Cross-Architecture Distillation Made Simple with Redundancy Suppression
Weijia Zhang, Yuehao Liu, Wu Ran et al.
CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling
Trong-Thang Pham, AKASH AWASTHI, Saba Khan et al.
DADet: Safeguarding Image Conditional Diffusion Models against Adversarial and Backdoor Attacks via Diffusion Anomaly Detection
Hongwei Yu, Xinlong Ding, Jiawei Li et al.
DAP-MAE: Domain-Adaptive Point Cloud Masked Autoencoder for Effective Cross-Domain Learning
Ziqi Gao, Qiufu Li, Linlin Shen
Dataset Distillation via Vision-Language Category Prototype
YAWEN ZOU, Guang Li, Duo Su et al.