CVPR Highlight Papers
712 papers found • Page 9 of 15
Amodal Completion via Progressive Mixed Context Diffusion
Katherine Xu, Lingzhi Zhang, Jianbo Shi
A Pedestrian is Worth One Prompt: Towards Language Guidance Person Re-Identification
Zexian Yang, Dayan Wu, Chenming Wu et al.
Are Conventional SNNs Really Efficient? A Perspective from Network Quantization
Guobin Shen, Dongcheng Zhao, Tenglong Li et al.
Atlantis: Enabling Underwater Depth Estimation with Stable Diffusion
Fan Zhang, Shaodi You, Yu Li et al.
Attention-Propagation Network for Egocentric Heatmap to 3D Pose Lifting
Taeho Kang, Youngki Lee
AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation
Jeongsoo Choi, Se Jin Park, Minsu Kim et al.
A Vision Check-up for Language Models
Pratyusha Sharma, Tamar Rott Shaham, Manel Baradad et al.
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning
Siyuan Liang, Mingli Zhu, Aishan Liu et al.
Bayes' Rays: Uncertainty Quantification for Neural Radiance Fields
Leili Goli, Cody Reading, Silvia Sellán et al.
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
Yunhao Ge, Yihe Tang, Jiashu Xu et al.
Boosting Neural Representations for Videos with a Conditional Decoder
XINJIE ZHANG, Ren Yang, Dailan He et al.
Brain Decodes Deep Nets
Huzheng Yang, James Gee, Jianbo Shi
Breathing Life Into Sketches Using Text-to-Video Priors
Rinon Gal, Yael Vinker, Yuval Alaluf et al.
C2KD: Bridging the Modality Gap for Cross-Modal Knowledge Distillation
Fushuo Huo, Wenchao Xu, Jingcai Guo et al.
CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention
Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali et al.
CADTalk: An Algorithm and Benchmark for Semantic Commenting of CAD Programs
Haocheng Yuan, Jing Xu, Hao Pan et al.
Can I Trust Your Answer? Visually Grounded Video Question Answering
Junbin Xiao, Angela Yao, Yicong Li et al.
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
Seokju Cho, Heeseong Shin, Sunghwan Hong et al.
Causal-CoG: A Causal-Effect Look at Context Generation for Boosting Multi-modal Language Models
Shitian Zhao, Zhuowan Li, YadongLu et al.
CFAT: Unleashing Triangular Windows for Image Super-resolution
Abhisek Ray, Gaurav Kumar, Maheshkumar Kolekar
CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing
Ajian Liu, Shuai Xue, Gan Jianwen et al.
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Peng Jin, Ryuichi Takanobu, Cai Zhang et al.
CLiC: Concept Learning in Context
Mehdi Safaee, Aryan Mikaeili, Or Patashnik et al.
CLIP-Driven Open-Vocabulary 3D Scene Graph Generation via Cross-Modality Contrastive Learning
Lianggangxu Chen, Xuejiao Wang, Jiale Lu et al.
Clockwork Diffusion: Efficient Generation With Model-Step Distillation
Amirhossein Habibian, Amir Ghodrati, Noor Fathima et al.
Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis
Yanzuo Lu, Manlin Zhang, Jinhua Ma et al.
CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
Hao Ouyang, Qiuyu Wang, Yuxi Xiao et al.
CoDi-2: In-Context Interleaved and Interactive Any-to-Any Generation
Zineng Tang, Ziyi Yang, MAHMOUD KHADEMI et al.
CogAgent: A Visual Language Model for GUI Agents
Wenyi Hong, Weihan Wang, Qingsong Lv et al.
Coherence As Texture – Passive Textureless 3D Reconstruction by Self-interference
Wei-Yu Chen, Aswin C. Sankaranarayanan, Anat Levin et al.
COLMAP-Free 3D Gaussian Splatting
Yang Fu, Sifei Liu, Amey Kulkarni et al.
Compact 3D Gaussian Representation for Radiance Field
Joo Chan Lee, Daniel Rho, Xiangyu Sun et al.
Continual Self-supervised Learning: Towards Universal Multi-modal Medical Data Representation Learning
Yiwen Ye, Yutong Xie, Jianpeng Zhang et al.
Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
Qi Yang, Xing Nie, Tong Li et al.
CoralSCOP: Segment any COral Image on this Planet
Zheng Ziqiang, Liang Haixin, Binh-Son Hua et al.
Correcting Diffusion Generation through Resampling
Yujian Liu, Yang Zhang, Tommi Jaakkola et al.
Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis
Mingyang Zhao, Jiang Jingen, Lei Ma et al.
CosmicMan: A Text-to-Image Foundation Model for Humans
Shikai Li, Jianglin Fu, Kaiyuan Liu et al.
Cross-spectral Gated-RGB Stereo Depth Estimation
Samuel Brucker, Stefanie Walz, Mario Bijelic et al.
Data-Efficient Multimodal Fusion on a Single GPU
Noël Vouitsis, Zhaoyan Liu, Satya Krishna Gorti et al.
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations
Tianhao Qi, Shancheng Fang, Yanze Wu et al.
Deciphering ‘What’ and ‘Where’ Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Xiao Zhang, David Yunis, Michael Maire
Dense Optical Tracking: Connecting the Dots
Guillaume Le Moing, Jean Ponce, Cordelia Schmid
Detours for Navigating Instructional Videos
Kumar Ashutosh, Zihui Xue, Tushar Nagarajan et al.
Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation
Dong Lao, Congli Wang, Alex Wong et al.
DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis
Yuming Gu, Hongyi Xu, You Xie et al.
Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation
Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.
Diffusion Handles Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D
Karran Pandey, Paul Guerrero, Matheus Gadelha et al.
Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance
Yuto Enyo, Ko Nishino
Discovering and Mitigating Visual Biases through Keyword Explanation
Younghyun Kim, Sangwoo Mo, Minkyu Kim et al.