CVPR Poster Papers

4,874 papers found • Page 96 of 98

UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement

yaofeng xie, Lingwei Kong, Kai Chen et al.

CVPR 2024posterarXiv:2404.14542
21
citations

UV-IDM: Identity-Conditioned Latent Diffusion Model for Face UV-Texture Generation

Hong Li, Yutang Feng, Song Xue et al.

CVPR 2024poster

Validating Privacy-Preserving Face Recognition under a Minimum Assumption

Hui Zhang, Xingbo Dong, YenLungLai et al.

CVPR 2024poster

VAREN: Very Accurate and Realistic Equine Network

Silvia Zuffi, Ylva Mellbin, Ci Li et al.

CVPR 2024poster
23
citations

VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction

Jiaqi Lin, Zhihao Li, Xiao Tang et al.

CVPR 2024posterarXiv:2402.17427

VCoder: Versatile Vision Encoders for Multimodal Large Language Models

Jitesh Jain, Jianwei Yang, Humphrey Shi

CVPR 2024posterarXiv:2312.14233
48
citations

Vector Graphics Generation via Mutually Impulsed Dual-domain Diffusion

Zhongyin Zhao, Ye Chen, Zhangli Hu et al.

CVPR 2024poster

Versatile Medical Image Segmentation Learned from Multi-Source Datasets via Model Self-Disambiguation

Xiaoyang Chen, Hao Zheng, Yuemeng LI et al.

CVPR 2024posterarXiv:2311.10696
15
citations

Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy

Gengyu Zhang, Hao Tang, Yan Yan

CVPR 2024posterarXiv:2404.02176

V?: Guided Visual Search as a Core Mechanism in Multimodal LLMs

Penghao Wu, Saining Xie

CVPR 2024posterarXiv:2312.14135
327
citations

VicTR: Video-conditioned Text Representations for Activity Recognition

Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani et al.

CVPR 2024posterarXiv:2304.02560
36
citations

Video2Game: Real-time Interactive Realistic and Browser-Compatible Environment from a Single Video

Hongchi Xia, Chih-Hao Lin, Wei-Chiu Ma et al.

CVPR 2024poster

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024posterarXiv:2403.19926

VideoBooth: Diffusion-based Video Generation with Image Prompts

Yuming Jiang, Tianxing Wu, Shuai Yang et al.

CVPR 2024posterarXiv:2312.00777
118
citations

VideoCon: Robust Video-Language Alignment via Contrast Captions

Hritik Bansal, Yonatan Bitton, Idan Szpektor et al.

CVPR 2024posterarXiv:2311.10111
28
citations

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Haoxin Chen, Yong Zhang, Xiaodong Cun et al.

CVPR 2024posterarXiv:2401.09047

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

XuDong Wang, Ishan Misra, Ziyun Zeng et al.

CVPR 2024posterarXiv:2308.14710
36
citations

Video Frame Interpolation via Direct Synthesis with the Event-based Reference

Yuhan Liu, Yongjian Deng, Hao Chen et al.

CVPR 2024poster

VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding

Syed Talal Wasim, Muzammal Naseer, Salman Khan et al.

CVPR 2024poster

Video Harmonization with Triplet Spatio-Temporal Variation Patterns

Zonghui Guo, XinYu Han, Jie Zhang et al.

CVPR 2024poster

Video Interpolation with Diffusion Models

Siddhant Jain, Daniel Watson, Aleksander Holynski et al.

CVPR 2024posterarXiv:2404.01203
63
citations

VideoLLM-online: Online Video Large Language Model for Streaming Video

Joya Chen, Zhaoyang Lv, Shiwei Wu et al.

CVPR 2024posterarXiv:2406.11816
109
citations

VideoMAC: Video Masked Autoencoders Meet ConvNets

Gensheng Pei, Tao Chen, Xiruo Jiang et al.

CVPR 2024posterarXiv:2402.19082
20
citations

Video-P2P: Video Editing with Cross-attention Control

Shaoteng Liu, Yuechen Zhang, Wenbo Li et al.

CVPR 2024posterarXiv:2303.04761
309
citations

Video Prediction by Modeling Videos as Continuous Multi-Dimensional Processes

Gaurav Shrivastava, Abhinav Shrivastava

CVPR 2024poster
16
citations

Video ReCap: Recursive Captioning of Hour-Long Videos

Md Mohaiminul Islam, Vu Bao Ngan Ho, Xitong Yang et al.

CVPR 2024posterarXiv:2402.13250
82
citations

Video Recognition in Portrait Mode

Mingfei Han, Linjie Yang, Xiaojie Jin et al.

CVPR 2024posterarXiv:2312.13746

VideoRF: Rendering Dynamic Radiance Fields as 2D Feature Video Streams

Liao Wang, Kaixin Yao, Chengcheng Guo et al.

CVPR 2024posterarXiv:2312.01407
21
citations

Video Super-Resolution Transformer with Masked Inter&Intra-Frame Attention

Xingyu Zhou, Leheng Zhang, Xiaorui Zhao et al.

CVPR 2024posterarXiv:2401.06312

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

Yuchao Gu, Yipin Zhou, Bichen Wu et al.

CVPR 2024posterarXiv:2312.02087
63
citations

VidLA: Video-Language Alignment at Scale

Mamshad Nayeem Rizve, Fan Fei, Jayakrishnan Unnikrishnan et al.

CVPR 2024posterarXiv:2403.14870
8
citations

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024posterarXiv:2403.13347

VidToMe: Video Token Merging for Zero-Shot Video Editing

Xirui Li, Chao Ma, Xiaokang Yang et al.

CVPR 2024posterarXiv:2312.10656
89
citations

View-decoupled Transformer for Person Re-identification under Aerial-ground Camera Network

Quan Zhang, Lei Wang, Vishal M. Patel et al.

CVPR 2024posterarXiv:2403.14513
32
citations

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Lukas Höllein, Aljaž Božič, Norman Müller et al.

CVPR 2024posterarXiv:2403.01807

View From Above: Orthogonal-View aware Cross-view Localization

Shan Wang, Chuong Nguyen, Jiawei Liu et al.

CVPR 2024poster

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

Xianghui Yang, Gil Avraham, Yan Zuo et al.

CVPR 2024posterarXiv:2402.18842

Viewpoint-Aware Visual Grounding in 3D Scenes

Xiangxi Shi, Zhonghua Wu, Stefan Lee

CVPR 2024poster

ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification

Jiangbo Shi, Chen Li, Tieliang Gong et al.

CVPR 2024posterarXiv:2502.08391

VILA: On Pre-training for Visual Language Models

Ji Lin, Danny Yin, Wei Ping et al.

CVPR 2024posterarXiv:2312.07533
685
citations

VINECS: Video-based Neural Character Skinning

Zhouyingcheng Liao, Vladislav Golyanik, Marc Habermann et al.

CVPR 2024posterarXiv:2307.00842

ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Mu Cai, Haotian Liu, Siva Mustikovela et al.

CVPR 2024posterarXiv:2312.00784
153
citations

Virtual Immunohistochemistry Staining for Histological Images Assisted by Weakly-supervised Learning

Jiahan Li, Jiuyang Dong, Shenjin Huang et al.

CVPR 2024poster

Vision-and-Language Navigation via Causal Learning

Liuyi Wang, Zongtao He, Ronghao Dang et al.

CVPR 2024posterarXiv:2404.10241
41
citations

VISTA-LLAMA: Reducing Hallucination in Video Language Models via Equal Distance to Visual Tokens

Fan Ma, Xiaojie Jin, Heng Wang et al.

CVPR 2024posterarXiv:2312.08870

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024posterarXiv:2311.17919

Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning

Wenjin Hou, Shiming Chen, Shuhuang Chen et al.

CVPR 2024posterarXiv:2404.14808

Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval

Young Kyun Jang, Donghyun Kim, Zihang Meng et al.

CVPR 2024posterarXiv:2404.15516
21
citations

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Yunhao Ge, Xiaohui Zeng, Jacob Huffman et al.

CVPR 2024posterarXiv:2404.19752
33
citations

Visual In-Context Prompting

Feng Li, Qing Jiang, Hao Zhang et al.

CVPR 2024posterarXiv:2311.13601
52
citations