ICCV 2025 Papers
2,701 papers found • Page 49 of 55
Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views
Xiangdong Zhang, Shaofeng Zhang, Junchi Yan
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
Kaining Ying, Henghui Ding, Guangquan Jie et al.
Towards Open-World Generation of Stereo Images and Unsupervised Matching
Feng Qiao, Zhexiao Xiong, Eric Xing et al.
Towards Performance Consistency in Multi-Level Model Collaboration
Qi Li, Runpeng Yu, Xinchao Wang
Towards Privacy-preserved Pre-training of Remote Sensing Foundation Models with Federated Mutual-guidance Learning
Jieyi Tan, Chengwei Zhang, Bo Dang et al.
Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning
Muhammad Aqeel, Shakiba Sharifi, Marco Cristani et al.
Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification
Wenkui Yang, Jie Cao, Junxian Duan et al.
Towards Robustness of Person Search against Corruptions
Woojung Son, Yoonki Cho, Guoyuan An et al.
Towards Safer and Understandable Driver Intention Prediction
Mukilan Karuppasamy, Shankar Gangisetty, Shyam Nandan Rai et al.
Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
Xingyu Miao, Haoran Duan, Quanhao Qian et al.
Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints
Guanjie Chen, Xinyu Zhao, Yucheng Zhou et al.
Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding
Yuanhan Zhang, Yunice Chew, Yuhao Dong et al.
Towards Visual Localization Interoperability: Cross-Feature for Collaborative Visual Localization and Mapping
Alberto Jaenal, Paula Carbó Cubero, Jose Araujo et al.
TPG-INR: Target Prior-Guided Implicit 3D CT Reconstruction for Enhanced Sparse-view Imaging
Qinglei Cao, Ziyao Tang, Xiaoqin Tang
Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing
Hongyu Shen, Junfeng Ni, Weishuo Li et al.
TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos
Jinxi Li, Ziyang Song, Bo Yang
Tracing Copied Pixels and Regularizing Patch Affinity in Copy Detection
Yichen Lu, Siwei Nie, Minlong Lu et al.
TrackAny3D: Transferring Pretrained 3D Models for Category-unified 3D Point Cloud Tracking
Mengmeng Wang, Haonan Wang, Yulong Li et al.
Tracking Tiny Drones against Clutter: Large-Scale Infrared Benchmark with Motion-Centric Adaptive Algorithm
Jiahao Zhang, Zongli Jiang, Gang Wang et al.
TrackVerse: A Large-Scale Object-Centric Video Dataset for Image-Level Representation Learning
Yibing Wei, Samuel Church, Victor Suciu et al.
Trade-offs in Image Generation: How Do Different Dimensions Interact?
Sicheng Zhang, Binzhu Xie, Zhonghao Yan et al.
TrafficLoc: Localizing Traffic Surveillance Cameras in 3D Scenes
Yan Xia, Yunxiang Lu, Rui Song et al.
Training-free and Adaptive Sparse Attention for Efficient Long Video Generation
Yifei Xia, Suhan Ling, Fangcheng Fu et al.
Training-Free Class Purification for Open-Vocabulary Semantic Segmentation
Qi Chen, Lingxiao Yang, Yun Chen et al.
Training-Free Generation of Temporally Consistent Rewards from VLMs
Yinuo Zhao, Jiale Yuan, Zhiyuan Xu et al.
Training-free Geometric Image Editing on Diffusion Models
Hanshen Zhu, Zhen Zhu, Kaile Zhang et al.
Training-Free Industrial Defect Generation with Diffusion Models
Ruyi Xu, Yen-Tzu Chiu, Tai-I Chen et al.
Training-Free Personalization via Retrieval and Reasoning on Fingerprints
Deepayan Das, Davide Talon, Yiming Wang et al.
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
Yufei Wang, Lanqing Guo, Zhihao Li et al.
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Mark Yu, Wenbo Hu, Jinbo Xing et al.
Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting
Yuekun Dai, Haitian Li, Shangchen Zhou et al.
Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models
Zerui Tao, Yuhta Takida, Naoki Murata et al.
Transformer-based Tooth Alignment Prediction with Occlusion and Collision Constraints
Zhenxing Dong, Jiazhou Chen
TransiT: Transient Transformer for Non-line-of-sight Videography
Ruiqian Li, Siyuan Shen, Suan Xia et al.
Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models
Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.
Transparent Vision: A Theory of Hierarchical Invariant Representations
Shuren Qi, Yushu Zhang, Chao Wang et al.
TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
Ruidong Chen, Honglin Guo, Lanjun Wang et al.
TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training
Felix Krause, Timy Phan, Ming Gui et al.
Tree-NeRV: Efficient Non-Uniform Sampling for Neural Video Representation via Tree-Structured Feature Grids
Jiancheng Zhao, Yifan Zhan, Qingtian Zhu et al.
Tree Skeletonization from 3D Point Clouds by Denoising Diffusion
Elias Marks, Lucas Nunes, Federico Magistri et al.
Triad: Empowering LMM-based Anomaly Detection with Expert-guided Region-of-Interest Tokenizer and Manufacturing Process
Yuanze Li, Shihao Yuan, Haolin Wang et al.
Trial-Oriented Visual Rearrangement
Yuyi Liu, Xinhang Song, Tianliang Qi et al.
TriDi: Trilateral Diffusion of 3D Humans, Objects, and Interactions
Ilya A. Petrov, Riccardo Marin, Julian Chibane et al.
TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring
Zhu Xu, Ting Lei, Zhimin Li et al.
TRNAS: A Training-Free Robust Neural Architecture Search
Yeming Yang, Qingling Zhu, Jianping Luo et al.
Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition
Pulkit Kumar, Shuaiyi Huang, Matthew Walmer et al.
TR-PTS: Task-Relevant Parameter and Token Selection for Efficient Tuning
Siqi Luo, Haoran Yang, Yi Xin et al.
Trust but Verify: Programmatic VLM Evaluation in the Wild
Viraj Prabhu, Senthil Purushwalkam, An Yan et al.
TrustMark: Robust Watermarking and Watermark Removal for Arbitrary Resolution Images
Tu Bui, Shruti Agarwal, John Collomosse
TruthPrInt: Mitigating Large Vision-Language Models Object Hallucination Via Latent Truthful-Guided Pre-Intervention
Jinhao Duan, Fei Kong, Hao Cheng et al.