CVPR Papers
5,589 papers found • Page 7 of 112
Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis
Hanbin Ko, Chang Min Park
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs
Zeyi Huang, Yuyang Ji, Xiaofang Wang et al.
Building Vision Models upon Heat Conduction
Zhaozhi Wang, Yue Liu, Yunjie Tian et al.
BWFormer: Building Wireframe Reconstruction from Airborne LiDAR Point Cloud with Transformer
Yuzhou Liu, Lingjie Zhu, Hanqiao Ye et al.
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
Jiazi Bu, Pengyang Ling, Pan Zhang et al.
CacheQuant: Comprehensively Accelerated Diffusion Models
Xuewen Liu, Zhikai Li, Qingyi Gu
CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images
Chen Cheng, Jiacheng Wei, Tianrun Chen et al.
CADDreamer: CAD Object Generation from Single-view Images
Yuan Li, Cheng Lin, Yuan Liu et al.
CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation
Jiahao Li, Weijian Ma, Xueyang Li et al.
CADRef: Robust Out-of-Distribution Detection via Class-Aware Decoupled Relative Feature Leveraging
Zhiwei Ling, Yachen Chang, Hailiang Zhao et al.
Calibrated Multi-Preference Optimization for Aligning Diffusion Models
Kyungmin Lee, Xiaohang Li, Qifei Wang et al.
CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models
Kiet A. Nguyen, Adheesh Juvekar, Tianjiao Yu et al.
Camera Resection from Known Line Pencils and a Radially Distorted Scanline
Juan Carlos Dibene Simental, Enrique Dunn
CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model
Xiaoding Yuan, Shitao Tang, Kejie Li et al.
Camouflage Anything: Learning to Hide using Controlled Out-painting and Representation Engineering
Biplab Das, Viswanath Gopalakrishnan
CamPoint: Boosting Point Cloud Segmentation with Virtual Camera
Jianhui Zhang, Luo Yizhi, Zicheng Zhang et al.
CaMuViD: Calibration-Free Multi-View Detection
Amir Etefaghi Daryani, M. Usman Maqbool Bhutta, Byron Hernandez et al.
Can Generative Video Models Help Pose Estimation?
Ruojin Cai, Jason Y. Zhang, Philipp Henzler et al.
Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves?
Yuan-Hong Liao, Rafid Mahmood, Sanja Fidler et al.
Can Machines Understand Composition? Dataset and Benchmark for Photographic Image Composition Embedding and Understanding
Zhaoran Zhao, Peng Lu, Anran Zhang et al.
Can Text-to-Video Generation help Video-Language Alignment?
Luca Zanella, Massimiliano Mancini, Willi Menapace et al.
Can't Slow Me Down: Learning Robust and Hardware-Adaptive Object Detectors against Latency Attacks for Edge Devices
Tianyi Wang, Zichen Wang, Cong Wang et al.
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models
Felix Taubner, Ruihang Zhang, Mathieu Tuli et al.
CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image
Jingshun Huang, Haitao Lin, Tianyu Wang et al.
CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction
Yuan Zhou, Qingshan Xu, Jiequan Cui et al.
CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth
Zhiyu Qu, Yunqi Miao, Zhensong Zhang et al.
CARL: A Framework for Equivariant Image Registration
Hastings Greer, Lin Tian, François-Xavier Vialard et al.
CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-Scale Reinforcement Learning in Autonomous Driving
Dongkun Zhang, Jiaming Liang, Ke Guo et al.
CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design
Weitao Feng, Hang Zhou, Jing Liao et al.
CASP: Compression of Large Multimodal Models Based on Attention Sparsity
Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.
CASP: Consistency-aware Audio-induced Saliency Prediction Model for Omnidirectional Video
Zhaolin Wan, Han Qin, Zhiyang Li et al.
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
Rundi Wu, Ruiqi Gao, Ben Poole et al.
CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution
Xin Liu, Jie Liu, Jie Tang et al.
Category-Agnostic Neural Object Rigging
Guangzhao He, Chen Geng, Shangzhe Wu et al.
Causal Composition Diffusion Model for Closed-loop Traffic Generation
Haohong Lin, Xin Huang, Tung Phan-Minh et al.
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
Edson Araujo, Andrew Rouditchenko, Yuan Gong et al.
CCIN: Compositional Conflict Identification and Neutralization for Composed Image Retrieval
Likai Tian, Jian Zhao, Zechao Hu et al.
CDI: Copyrighted Data Identification in Diffusion Models
Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.
Certified Human Trajectory Prediction
Mohammadhossein Bahari, Saeed Saadatnejad, Amirhossein Askari Farsangi et al.
CGMatch: A Different Perspective of Semi-supervised Learning
Bo Cheng, Jueqing Lu, Yuan Tian et al.
CH3Depth: Efficient and Flexible Depth Foundation Model with Flow Matching
Jiaqi Li, Yiran Wang, Jinghong Zheng et al.
ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation
Ling-An Zeng, Guohong Huang, Yi-Lin Wei et al.
Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks
Peng Xie, Yequan Bie, Jianda Mao et al.
Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding
Jiaxin Shi, Mingyue Xiang, Hao Sun et al.
Change3D: Revisiting Change Detection and Captioning from A Video Modeling Perspective
Duowang Zhu, Xiaohu Huang, Haiyan Huang et al.
Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining
Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao et al.
Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes
JunYong Choi, Min-Cheol Sagong, SeokYeong Lee et al.
Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs
Lucas Ventura, Antoine Yang, Cordelia Schmid et al.
Charm: The Missing Piece in ViT Fine-Tuning for Image Aesthetic Assessment
Fatemeh Behrad, Tinne Tuytelaars, Johan Wagemans