ICCV 2025 Papers
2,701 papers found • Page 3 of 55
Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning
Junming Liu, Siyuan Meng, Yanting Gao et al.
Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation
Congyi Fan, Jian Guan, Xuanjia Zhao et al.
A Linear N-Point Solver for Structure and Motion from Asynchronous Tracks
Hang Su, Yunlong Feng, Daniel Gehrig et al.
Alleviating Textual Reliance in Medical Language-guided Segmentation via Prototype-driven Semantic Approximation
Shuchang Ye, Usman Naseem, Mingyuan Meng et al.
AllGCD: Leveraging All Unlabeled Data for Generalized Category Discovery
Xinzi Cao, Ke Chen, Feidiao Yang et al.
All in One: Visual-Description-Guided Unified Point Cloud Segmentation
Zongyan Han, Mohamed El Amine Boudjoghra, Jiahua Dong et al.
Allowing Oscillation Quantization: Overcoming Solution Space Limitation in Low Bit-Width Quantization
Weiying Xie, Zihan Meng, Jitao Ma et al.
All Parts Matter: A Unified Mask-Free Virtual Try-On Framework
Chenghu Du, Shengwu Xiong, Yi Rong
AllTracker: Efficient Dense Point Tracking at High Resolution
Adam Harley, Yang You, Yang Zheng et al.
ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Predictions
Dubing Chen, Jin Fang, Wencheng Han et al.
Always Skip Attention
Yiping Ji, Hemanth Saratchandran, Peyman Moghadam et al.
AM-Adapter: Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis in-the-Wild
Siyoon Jin, Jisu Nam, Jiyoung Kim et al.
AMD: Adaptive Momentum and Decoupled Contrastive Learning Framework for Robust Long-Tail Trajectory Prediction
Bin Rao, Haicheng Liao, Yanchen Guan et al.
AMDANet: Attention-Driven Multi-Perspective Discrepancy Alignment for RGB-Infrared Image Fusion and Segmentation
Haifeng Zhong, Fan Tang, Zhuo Chen et al.
Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images
Tianhao Wu, Chuanxia Zheng, Frank Guan et al.
Amodal Depth Anything: Amodal Depth Estimation in the Wild
Zhenyu Li, Mykola Lavreniuk, Jian Shi et al.
Analyzing Finetuning Representation Shift for Multimodal LLMs Steering
Pegah KHAYATAN, Mustafa Shukor, Jayneel Parekh et al.
Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing
Taihang Hu, Linxuan Li, Kai Wang et al.
An Efficient Hybrid Vision Transformer for TinyML Applications
Fanhong Zeng, Huanan LI, Juntao Guan et al.
An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval
Jaeseok Byun, Seokhyeon Jeong, Wonjae Kim et al.
An Empirical Study of Autoregressive Pre-training from Videos
Jathushan Rajasegaran, Ilija Radosavovic, Rahul Ravishankar et al.
AnimalClue: Recognizing Animals by their Traces
Risa Shinoda, Nakamasa Inoue, Iro Laina et al.
AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation
zijie wu, Chaohui Yu, Fan Wang et al.
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance
Li Hu, wang yuan, Zhen Shen et al.
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction
Junhao Cheng, Yuying Ge, Yixiao Ge et al.
An Information-Theoretic Regularizer for Lossy Neural Image Compression
ZHANG YINGWEN, Meng Wang, Xihua Sheng et al.
An Inversion-based Measure of Memorization for Diffusion Models
Zhe Ma, Qingming Li, Xuhong Zhang et al.
AnnofreeOD: Detecting All Classes at Low Frame Rates Without Human Annotations
Boyi Sun, Yuhang Liu, Houxin He et al.
Anomaly Detection of Integrated Circuits Package Substrates Using the Large Vision Model SAIC: Dataset Construction, Methodology, and Application
Ruiyun Yu, Bingyang Guo, Haoyuan Li
An OpenMind for 3D Medical Vision Self-supervised Learning
Tassilo Wald, Constantin Ulrich, Jonathan Suprijadi et al.
Anti-Tamper Protection for Unauthorized Individual Image Generation
Zelin Li, Ruohan Zong, Yifan Liu et al.
Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks
Hailong Guo, Bohan Zeng, Yiren Song et al.
AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation
Guanxing Lu, Tengbo Yu, Haoyuan Deng et al.
AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration
Javier Tirado-Garín, Javier Civera
AnyI2V: Animating Any Conditional Image with Motion Control
Ziye Li, Xincheng Shuai, Hao Luo et al.
AnyPortal: Zero-Shot Consistent Video Background Replacement
Wenshuo Gao, Xicheng Lan, Shuai Yang
Any-SSR: How Recursive Least Squares Works in Continual Learning of Large Language Model
Kai Tong, Kang Pan, Xiao Zhang et al.
A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions
Youliang Zhang, Ronghui Li, Yachao Zhang et al.
A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition
Jie Zhu, Yiyang Su, Minchul Kim et al.
AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction
Xuying Zhang, Yupeng Zhou, Kai Wang et al.
ArchiSet: Benchmarking Editable and Consistent Single-View 3D Reconstruction of Buildings with Specific Window-to-Wall Ratios
Jun Yin, Pengyu Zeng, Licheng Shen et al.
A Real-world Display Inverse Rendering Dataset
Seokjun Choi, Hoon-Gyu Chung, Yujin Jeon et al.
A Recipe for Generating 3D Worlds from a Single Image
Katja Schwarz, Denis Rozumny, Samuel Rota Bulò et al.
Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs
Yikang Zhou, Tao Zhang, Shilin Xu et al.
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives
Shaoyuan Xie, Lingdong Kong, Yuhao Dong et al.
ArgMatch: Adaptive Refinement Gathering for Efficient Dense Matching
Yuxin Deng, Kaining Zhang, Linfeng Tang et al.
ArgoTweak: Towards Self-Updating HD Maps through Structured Priors
Lena Wild, Rafael Valencia, Patric Jensfelt
ARGUS: Hallucination and Omission Evaluation in Video-LLMs
Ruchit Rawal, Reza Shirkavand, Heng Huang et al.
ARIG: Autoregressive Interactive Head Generation for Real-time Conversations
Ying Guo, Xi Liu, Cheng Zhen et al.
ARMO: Autoregressive Rigging for Multi-Category Objects
mingze sun, Shiwei Mao, Keyi Chen et al.