CVPR Papers

Visual Concept Connectome (VCC): Open World Concept Discovery and their Interlayer Connections in Deep Models

Matthew Kowal, Richard P. Wildes, Kosta Derpanis

CVPR 2024 highlight
16 citations

Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

Young Kyun Jang, Donghyun Kim, Zihang Meng et al.

CVPR 2024 poster
21 citations

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Yunhao Ge, Xiaohui Zeng, Jacob Huffman et al.

CVPR 2024 poster
33 citations

Visual In-Context Prompting

Feng Li, Qing Jiang, Hao Zhang et al.

CVPR 2024 poster
52 citations

Visual Layout Composer: Image-Vector Dual Diffusion Model for Design Layout Generation

Mohammad Amin Shabani, Zhaowen Wang, Difan Liu et al.

CVPR 2024 poster

Visual Objectification in Films: Towards a New AI Task for Video Interpretation

Julie Tores, Lucile Sassatelli, Hui-Yin Wu et al.

CVPR 2024 highlight
5 citations

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

Zetong Yang, Li Chen, Yanan Sun et al.

CVPR 2024 highlight

Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models

Yushi Hu, Otilia Stretcu, Chun-Ta Lu et al.

CVPR 2024 poster

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.

CVPR 2024 poster

Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach

Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal et al.

CVPR 2024 poster
21 citations

ViTamin: Designing Scalable Vision Models in the Vision-Language Era

Jieneng Chen, Qihang Yu, Xiaohui Shen et al.

CVPR 2024 poster

ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions

Chunlong Xia, Xinliang Wang, Feng Lv et al.

CVPR 2024 highlight
131 citations

ViT-Lens: Towards Omni-modal Representations

Stan Weixian Lei, Yixiao Ge, Kun Yi et al.

CVPR 2024 poster

ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models

Jeong-gi Kwak, Erqun Dong, Yuhe Jin et al.

CVPR 2024 highlight

VkD: Improving Knowledge Distillation using Orthogonal Projections

Roy Miles, Ismail Elezi, Jiankang Deng

CVPR 2024 poster
24 citations

Vlogger: Make Your Dream A Vlog

Shaobin Zhuang, Kunchang Li, Xinyuan Chen et al.

CVPR 2024 poster

VLP: Vision Language Planning for Autonomous Driving

Chenbin Pan, Burhan Yaman, Tommaso Nesti et al.

CVPR 2024 poster
127 citations

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye

CVPR 2024 poster

VMINer: Versatile Multi-view Inverse Rendering with Near- and Far-field Light Sources

Fan Fei, Jiajun Tang, Ping Tan et al.

CVPR 2024 highlight

VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis

Linshan Wu, Jia-Xin Zhuang, Hao Chen

CVPR 2024 poster
70 citations

Volumetric Environment Representation for Vision-Language Navigation

Liu, Wenguan Wang, Yi Yang

CVPR 2024 highlight

VOODOO 3D: Volumetric Portrait Disentanglement For One-Shot 3D Head Reenactment

Phong Tran, Egor Zakharov, Long Nhat Ho et al.

CVPR 2024 poster
29 citations

VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation

Yang Chen, Yingwei Pan, Haibo Yang et al.

CVPR 2024 poster
30 citations

VRetouchEr: Learning Cross-frame Feature Interdependence with Imperfection Flow for Face Retouching in Videos

Wen Xue, Le Jiang, Lianxin Xie et al.

CVPR 2024 poster
1 citation

VRP-SAM: SAM with Visual Reference Prompt

Yanpeng Sun, Jiahui Chen, Shan Zhang et al.

CVPR 2024 poster

VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

Ziyang Luo, Nian Liu, Wangbo Zhao et al.

CVPR 2024 poster

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

Zihua Liu, Hiroki Sakuma, Masatoshi Okutomi

CVPR 2024 poster

VS: Reconstructing Clothed 3D Human from Single Image via Vertex Shift

Leyuan Liu, Yuhan Li, Yunqi Gao et al.

CVPR 2024 poster

VTimeLLM: Empower LLM to Grasp Video Moments

Bin Huang, Xin Wang, Hong Chen et al.

CVPR 2024 highlight
244 citations

VTQA: Visual Text Question Answering via Entity Alignment and Cross-Media Reasoning

Kang Chen, Xiangqian Wu

CVPR 2024 poster
19 citations

WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects Under Occlusion

Khiem Vuong, N. Dinesh Reddy, Robert Tamburo et al.

CVPR 2024 poster
3 citations

WANDR: Intention-guided Human Motion Generation

Markos Diomataris, Nikos Athanasiou, Omid Taheri et al.

CVPR 2024 poster

WateRF: Robust Watermarks in Radiance Fields for Protection of Copyrights

Youngdong Jang, Dong In Lee, MinHyuk Jang et al.

CVPR 2024 poster
25 citations

Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models

Peifei Zhu, Tsubasa Takahashi, Hirokatsu Kataoka

CVPR 2024 poster
34 citations

WaveFace: Authentic Face Restoration with Efficient Frequency Recovery

Yunqi Miao, Jiankang Deng, Jungong Han

CVPR 2024 poster

Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration

Chen Zhao, Weiling Cai, Chenyu Dong et al.

CVPR 2024 poster

WaveMo: Learning Wavefront Modulations to See Through Scattering

Mingyang Xie, Haiyun Guo, Brandon Y. Feng et al.

CVPR 2024 poster
7 citations

Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection

Chen Chen, Jiahao Qi, Xingyue Liu et al.

CVPR 2024 poster

Weakly-Supervised Audio-Visual Video Parsing with Prototype-based Pseudo-Labeling

Kranthi Kumar Rachavarapu, Kalyan Ramakrishnan, A. N. Rajagopalan

CVPR 2024 poster

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation

Xingqun Qi, Jiahao Pan, Peng Li et al.

CVPR 2024 poster

Weakly Supervised Monocular 3D Detection with a Single-View Image

Xueying Jiang, Sheng Jin, Lewei Lu et al.

CVPR 2024 poster
12 citations

Weakly Supervised Point Cloud Semantic Segmentation via Artificial Oracle

Hyeokjun Kweon, Jihun Kim, Kuk-Jin Yoon

CVPR 2024 poster

Weakly Supervised Video Individual Counting

Xinyan Liu, Guorong Li, Yuankai Qi et al.

CVPR 2024 poster

Weak-to-Strong 3D Object Detection with X-Ray Distillation

Alexander Gambashidze, Aleksandr Dadukin, Maksim Golyadkin et al.

CVPR 2024 poster
6 citations

WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion

Soyong Shin, Juyong Kim, Eni Halilaj et al.

CVPR 2024 poster

What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation

Yihua Cheng, Yaning Zhu, Zongji Wang et al.

CVPR 2024 poster

What, How, and When Should Object Detectors Update in Continually Changing Test Domains?

Jayeon Yoo, Dongkwan Lee, Inseop Chung et al.

CVPR 2024 poster
15 citations

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.

CVPR 2024 poster

What Sketch Explainability Really Means for Downstream Tasks?

Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia et al.

CVPR 2024 poster

What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions

Brian Chen, Nina Shvetsova, Andrew Rouditchenko et al.

CVPR 2024 poster