CVPR Papers
5,589 papers found • Page 10 of 112
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.
CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models
Yiqi Zhu, Ziyue Wang, Can Zhang et al.
Co-Speech Gesture Video Generation with Implicit Motion-Audio Entanglement
Xinjie Li, Ziyi Chen, Xinlu Yu et al.
CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI
Siyuan Cheng, Lingjuan Lyu, Zhenting Wang et al.
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
Qingqing Zhao, Yao Lu, Moo Jin Kim et al.
CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model
Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.
COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts
Jiansheng Li, Xingxuan Zhang, Hao Zou et al.
CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology
Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
Henghui Du, Guangyao Li, Chang Zhou et al.
CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner
Weiyu Li, Jiarui Liu, Hongyu Yan et al.
Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting
Hanxi Liu, Yifang Men, Zhouhui Lian
CRISP: Object Pose and Shape Estimation with Test-Time Adaptation
Jingnan Shi, Rajat Talak, Harry Zhang et al.
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Di Zhang, Jingdi Lei, Junxian Li et al.
CroCoDL: Cross-device Collaborative Dataset for Localization
Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.
Cropper: Vision-Language Model for Image Cropping through In-Context Learning
Seung Hyun Lee, Jijun jiang, Yiran Xu et al.
Cross-Modal 3D Representation with Multi-View Images and Point Clouds
Ziyang Zhou, Pinghui Wang, Zi Liang et al.
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.
Cross-modal Causal Relation Alignment for Video Question Grounding
weixing chen, Yang Liu, Binglin Chen et al.
Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D Motion
Saad Lahlali, Sandra Kara, Hejer AMMAR et al.
Cross-modal Information Flow in Multimodal Large Language Models
Zhi Zhang, Srishti Yadav, Fengze Han et al.
Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
Jie Mei, Chenyu Lin, Yu Qiu et al.
CrossOver: 3D Scene Cross-Modal Alignment
Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys et al.
Cross-Rejective Open-Set SAR Image Registration
Shasha Mao, Shiming Lu, Zhaolong Du et al.
CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections
Thomas Walker, Salvatore Esposito, Daniel Rebain et al.
Cross-View Completion Models are Zero-shot Correspondence Estimators
Honggyu An, Jin Hyeon Kim, Seonghoon Park et al.
CryptoFace: End-to-End Encrypted Face Recognition
Wei Ao, Vishnu Naresh Boddeti
CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation
Zhenhui Ding, Guilian Chen, Qin Zhang et al.
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He, Chin-Hsuan Wu, Igor Gilitschenski
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Rajiv Didolkar, Andrii Zadaianchuk, Rabiul Awal et al.
Cubify Anything: Scaling Indoor 3D Object Detection
Justin Lazarow, David Griffiths, Gefen Kohavi et al.
Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation
Yanda Chen, Gongwei Chen, Miao Zhang et al.
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Florinel Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.
CustAny: Customizing Anything from A Single Example
Lingjie Kong, Kai WU, Chengming Xu et al.
Customized Condition Controllable Generation for Video Soundtrack
Fan Qi, KunSheng Ma, Changsheng Xu
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
Jungsoo Lee, Debasmit Das, Munawar Hayat et al.
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset
Xiao Wang, Fuling Wang, Yuehang Li et al.
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia, Mengqi Huang, Nan Chen et al.
D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-based Affective Recognition.
Haoran Wang, Xinji Mai, Zeng Tao et al.
D^3CTTA: Domain-Dependent Decorrelation for Continual Test-Time Adaption of 3D LiDAR Segmentation
Jichun Zhao, Haiyong Jiang, Haoxuan Song et al.
D^3-Human: Dynamic Disentangled Digital Human from Monocular Video
Honghu Chen, Bo Peng, Yunfan Tao et al.
D^3: Scaling Up Deepfake Detection by Learning from Discrepancy
Yongqi Yang, Zhihao Qian, Ye Zhu et al.
DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing
Yufei Huang, Bangyan Liao, Yuqi Hu et al.
DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh
Jingyu Zhuang, Di Kang, Linchao Bao et al.
DAMM-Diffusion: Learning Divergence-Aware Multi-Modal Diffusion Model for Nanoparticles Distribution Prediction
Junjie Zhou, Shouju Wang, Yuxia Tang et al.
DarkIR: Robust Low-Light Image Restoration
Daniel Feijoo, Juan C. Benito, Alvaro Garcia et al.
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation
Sang-Jun Park, Keun-Soo Heo, Dong-Hee Shin et al.
DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds
Youyu Chen, Junjun Jiang, Kui Jiang et al.
Data Distributional Properties As Inductive Bias for Systematic Generalization
Felipe del Rio, Alain Raymond, Daniel Florea et al.
Data-Free Group-Wise Fully Quantized Winograd Convolution via Learnable Scales
Shuokai Pan, Gerti Tuzi, Sudarshan Sreeram et al.
Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior
Chanhui Lee, Yeonghwan Song, Jeany Son