CVPR Papers

COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training

Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.

CVPR 2025 • poster • arXiv:2412.01814 • 7 citations

CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models

Yiqi Zhu, Ziyue Wang, Can Zhang et al.

CVPR 2025 • poster • arXiv:2503.14161 • 3 citations

Co-Speech Gesture Video Generation with Implicit Motion-Audio Entanglement

Xinjie Li, Ziyi Chen, Xinlu Yu et al.

CVPR 2025 • poster

CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI

Siyuan Cheng, Lingjuan Lyu, Zhenting Wang et al.

CVPR 2025 • poster • arXiv:2503.18286 • 14 citations

CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models

Qingqing Zhao, Yao Lu, Moo Jin Kim et al.

CVPR 2025 • poster • 203 citations

CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model

Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.

CVPR 2025 • poster

COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts

Jiansheng Li, Xingxuan Zhang, Hao Zou et al.

CVPR 2025 • highlight • arXiv:2504.10158 • 1 citation

CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.

CVPR 2025 • poster • arXiv:2412.12077 • 22 citations

Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation

Henghui Du, Guangyao Li, Chang Zhou et al.

CVPR 2025 • poster

CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner

Weiyu Li, Jiarui Liu, Hongyu Yan et al.

CVPR 2025 • poster

Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting

Hanxi Liu, Yifang Men, Zhouhui Lian

CVPR 2025 • highlight • arXiv:2504.20403 • 1 citation

CRISP: Object Pose and Shape Estimation with Test-Time Adaptation

Jingnan Shi, Rajat Talak, Harry Zhang et al.

CVPR 2025 • highlight

Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning

Di Zhang, Jingdi Lei, Junxian Li et al.

CVPR 2025 • poster • arXiv:2411.18203 • 30 citations

CroCoDL: Cross-device Collaborative Dataset for Localization

Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.

CVPR 2025 • poster • 1 citation

Cropper: Vision-Language Model for Image Cropping through In-Context Learning

Seung Hyun Lee, Jijun Jiang, Yiran Xu et al.

CVPR 2025 • poster • arXiv:2408.07790 • 5 citations

Cross-Modal 3D Representation with Multi-View Images and Point Clouds

Ziyang Zhou, Pinghui Wang, Zi Liang et al.

CVPR 2025 • poster

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding

Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.

CVPR 2025 • poster • arXiv:2503.16707 • 7 citations

Cross-modal Causal Relation Alignment for Video Question Grounding

Weixing Chen, Yang Liu, Binglin Chen et al.

CVPR 2025 • highlight • arXiv:2503.07635 • 7 citations

Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D Motion

Saad Lahlali, Sandra Kara, Hejer AMMAR et al.

CVPR 2025 • poster • arXiv:2503.15022

Cross-modal Information Flow in Multimodal Large Language Models

Zhi Zhang, Srishti Yadav, Fengze Han et al.

CVPR 2025 • poster • arXiv:2411.18620

Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images

Jie Mei, Chenyu Lin, Yu Qiu et al.

CVPR 2025 • poster

CrossOver: 3D Scene Cross-Modal Alignment

Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys et al.

CVPR 2025 • highlight • arXiv:2502.15011 • 7 citations

Cross-Rejective Open-Set SAR Image Registration

Shasha Mao, Shiming Lu, Zhaolong Du et al.

CVPR 2025 • poster

CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections

Thomas Walker, Salvatore Esposito, Daniel Rebain et al.

CVPR 2025 • poster • arXiv:2412.04120

Cross-View Completion Models are Zero-shot Correspondence Estimators

Honggyu An, Jin Hyeon Kim, Seonghoon Park et al.

CVPR 2025 • highlight

CryptoFace: End-to-End Encrypted Face Recognition

Wei Ao, Vishnu Naresh Boddeti

CVPR 2025 • poster • arXiv:2509.00332

CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation

Zhenhui Ding, Guilian Chen, Qin Zhang et al.

CVPR 2025 • poster • 1 citation

CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion

Kai He, Chin-Hsuan Wu, Igor Gilitschenski

CVPR 2025 • poster • arXiv:2412.01792 • 5 citations

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Aniket Rajiv Didolkar, Andrii Zadaianchuk, Rabiul Awal et al.

CVPR 2025 • poster

Cubify Anything: Scaling Indoor 3D Object Detection

Justin Lazarow, David Griffiths, Gefen Kohavi et al.

CVPR 2025 • highlight • arXiv:2412.04458 • 18 citations

Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation

Yanda Chen, Gongwei Chen, Miao Zhang et al.

CVPR 2025 • poster

Curriculum Direct Preference Optimization for Diffusion and Consistency Models

Florinel Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.

CVPR 2025 • poster • arXiv:2405.13637 • 21 citations

CustAny: Customizing Anything from A Single Example

Lingjie Kong, Kai WU, Chengming Xu et al.

CVPR 2025 • poster • arXiv:2406.11643 • 2 citations

Customized Condition Controllable Generation for Video Soundtrack

Fan Qi, KunSheng Ma, Changsheng Xu

CVPR 2025 • poster • 1 citation

CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation

Jungsoo Lee, Debasmit Das, Munawar Hayat et al.

CVPR 2025 • poster • arXiv:2503.18244 • 3 citations

CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset

Xiao Wang, Fuling Wang, Yuehang Li et al.

CVPR 2025 • poster • arXiv:2410.00379 • 16 citations

D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Weinan Jia, Mengqi Huang, Nan Chen et al.

CVPR 2025 • poster • 6 citations

D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-based Affective Recognition.

Haoran Wang, Xinji Mai, Zeng Tao et al.

CVPR 2025 • poster • arXiv:2406.16473

D^3CTTA: Domain-Dependent Decorrelation for Continual Test-Time Adaption of 3D LiDAR Segmentation

Jichun Zhao, Haiyong Jiang, Haoxuan Song et al.

CVPR 2025 • poster

D^3-Human: Dynamic Disentangled Digital Human from Monocular Video

Honghu Chen, Bo Peng, Yunfan Tao et al.

CVPR 2025 • poster • arXiv:2501.01589 • 5 citations

D^3: Scaling Up Deepfake Detection by Learning from Discrepancy

Yongqi Yang, Zhihao Qian, Ye Zhu et al.

CVPR 2025 • poster • arXiv:2404.04584 • 19 citations

DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing

Yufei Huang, Bangyan Liao, Yuqi Hu et al.

CVPR 2025 • poster • 4 citations

DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh

Jingyu Zhuang, Di Kang, Linchao Bao et al.

CVPR 2025 • poster

DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction

Junjie Zhou, Shouju Wang, Yuxia Tang et al.

CVPR 2025 • highlight • arXiv:2503.09491 • 1 citation

DarkIR: Robust Low-Light Image Restoration

Daniel Feijoo, Juan C. Benito, Alvaro Garcia et al.

CVPR 2025 • poster • arXiv:2412.13443

DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation

Sang-Jun Park, Keun-Soo Heo, Dong-Hee Shin et al.

CVPR 2025 • poster • arXiv:2504.11786 • 1 citation

DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds

Youyu Chen, Junjun Jiang, Kui Jiang et al.

CVPR 2025 • highlight • arXiv:2503.18402 • 16 citations

Data Distributional Properties As Inductive Bias for Systematic Generalization

Felipe del Rio, Alain Raymond, Daniel Florea et al.

CVPR 2025 • poster • arXiv:2502.20499 • 1 citation

Data-Free Group-Wise Fully Quantized Winograd Convolution via Learnable Scales

Shuokai Pan, Gerti Tuzi, Sudarshan Sreeram et al.

CVPR 2025 • poster • arXiv:2412.19867

Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior

Chanhui Lee, Yeonghwan Song, Jeany Son

CVPR 2025 • poster • arXiv:2502.21048 • 1 citation