CVPR Poster Papers
4,874 papers found
CoSDH: Communication-Efficient Collaborative Perception via Supply-Demand Awareness and Intermediate-Late Hybridization
Junhao Xu, Yanan Zhang, Zhi Cai et al.
COSMIC: Clique-Oriented Semantic Multi-space Integration for Robust CLIP Test-Time Adaptation
Fanding Huang, Jingyan Jiang, Qinting Jiang et al.
COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training
Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.
CoSpace: Benchmarking Continuous Space Perception Ability for Vision-Language Models
Yiqi Zhu, Ziyue Wang, Can Zhang et al.
Co-Speech Gesture Video Generation with Implicit Motion-Audio Entanglement
Xinjie Li, Ziyi Chen, Xinlu Yu et al.
CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI
Siyuan Cheng, Lingjuan Lyu, Zhenting Wang et al.
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
Qingqing Zhao, Yao Lu, Moo Jin Kim et al.
CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model
Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.
CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology
Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.
Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation
Henghui Du, Guangyao Li, Chang Zhou et al.
CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner
Weiyu Li, Jiarui Liu, Hongyu Yan et al.
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Di Zhang, Jingdi Lei, Junxian Li et al.
CroCoDL: Cross-device Collaborative Dataset for Localization
Hermann Blum, Alessandro Mercurio, Joshua O'Reilly et al.
Cropper: Vision-Language Model for Image Cropping through In-Context Learning
Seung Hyun Lee, Jijun Jiang, Yiran Xu et al.
Cross-Modal 3D Representation with Multi-View Images and Point Clouds
Ziyang Zhou, Pinghui Wang, Zi Liang et al.
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.
Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D Motion
Saad Lahlali, Sandra Kara, Hejer Ammar et al.
Cross-modal Information Flow in Multimodal Large Language Models
Zhi Zhang, Srishti Yadav, Fengze Han et al.
Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images
Jie Mei, Chenyu Lin, Yu Qiu et al.
Cross-Rejective Open-Set SAR Image Registration
Shasha Mao, Shiming Lu, Zhaolong Du et al.
CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections
Thomas Walker, Salvatore Esposito, Daniel Rebain et al.
CryptoFace: End-to-End Encrypted Face Recognition
Wei Ao, Vishnu Naresh Boddeti
CSC-PA: Cross-image Semantic Correlation via Prototype Attentions for Single-network Semi-supervised Breast Tumor Segmentation
Zhenhui Ding, Guilian Chen, Qin Zhang et al.
CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion
Kai He, Chin-Hsuan Wu, Igor Gilitschenski
CTRL-O: Language-Controllable Object-Centric Visual Representation Learning
Aniket Rajiv Didolkar, Andrii Zadaianchuk, Rabiul Awal et al.
Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation
Yanda Chen, Gongwei Chen, Miao Zhang et al.
Curriculum Direct Preference Optimization for Diffusion and Consistency Models
Florinel Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.
CustAny: Customizing Anything from A Single Example
Lingjie Kong, Kai Wu, Chengming Xu et al.
Customized Condition Controllable Generation for Video Soundtrack
Fan Qi, KunSheng Ma, Changsheng Xu
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
Jungsoo Lee, Debasmit Das, Munawar Hayat et al.
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset
Xiao Wang, Fuling Wang, Yuehang Li et al.
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation
Weinan Jia, Mengqi Huang, Nan Chen et al.
D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-based Affective Recognition
Haoran Wang, Xinji Mai, Zeng Tao et al.
D^3CTTA: Domain-Dependent Decorrelation for Continual Test-Time Adaption of 3D LiDAR Segmentation
Jichun Zhao, Haiyong Jiang, Haoxuan Song et al.
D^3-Human: Dynamic Disentangled Digital Human from Monocular Video
Honghu Chen, Bo Peng, Yunfan Tao et al.
D^3: Scaling Up Deepfake Detection by Learning from Discrepancy
Yongqi Yang, Zhihao Qian, Ye Zhu et al.
DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing
Yufei Huang, Bangyan Liao, Yuqi Hu et al.
DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh
Jingyu Zhuang, Di Kang, Linchao Bao et al.
DarkIR: Robust Low-Light Image Restoration
Daniel Feijoo, Juan C. Benito, Alvaro Garcia et al.
DART: Disease-aware Image-Text Alignment and Self-correcting Re-alignment for Trustworthy Radiology Report Generation
Sang-Jun Park, Keun-Soo Heo, Dong-Hee Shin et al.
Data Distributional Properties As Inductive Bias for Systematic Generalization
Felipe del Rio, Alain Raymond, Daniel Florea et al.
Data-Free Group-Wise Fully Quantized Winograd Convolution via Learnable Scales
Shuokai Pan, Gerti Tuzi, Sudarshan Sreeram et al.
Data-free Universal Adversarial Perturbation with Pseudo-semantic Prior
Chanhui Lee, Yeonghwan Song, Jeany Son
Data Synthesis with Diverse Styles for Face Recognition via 3DMM-Guided Diffusion
Yuxi Mi, Zhizhou Zhong, Yuge Huang et al.
DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers
Li Ren, Chen Chen, Liqiang Wang et al.
DCEvo: Discriminative Cross-Dimensional Evolutionary Learning for Infrared and Visible Image Fusion
Jinyuan Liu, Bowei Zhang, Qingyun Mei et al.
De^2Gaze: Deformable and Decoupled Representation Learning for 3D Gaze Estimation
Yunfeng Xiao, Xiaowei Bai, Baojun Chen et al.
DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging
Zhu Liu, Zijun Wang, Jinyuan Liu et al.
Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization
Zefeng Zhang, Hengzhu Tang, Jiawei Sheng et al.
DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos
Zijia Lu, ASM Iftekhar, Gaurav Mittal et al.