CVPR Papers


Visual Persona: Foundation Model for Full-Body Human Customization

Jisu Nam, Soowon Son, Zhan Xu et al.

CVPR 2025 • poster • arXiv:2503.15406 • 6 citations

Visual Prompting for One-shot Controllable Video Editing without Inversion

Zhengbo Zhang, Yuxi Zhou, Duo Peng et al.

CVPR 2025 • poster • arXiv:2504.14335

Visual Representation Learning through Causal Intervention for Controllable Image Editing

Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.

CVPR 2025 • highlight

VITED: Video Temporal Evidence Distillation

Yujie Lu, Yale Song, Lorenzo Torresani et al.

CVPR 2025 • poster • arXiv:2503.12855 • 2 citations

ViUniT: Visual Unit Tests for More Robust Visual Programming

Artemis Panagopoulou, Honglu Zhou, Silvio Savarese et al.

CVPR 2025 • poster • arXiv:2412.08859 • 2 citations

VL2Lite: Task-Specific Knowledge Distillation from Large Vision-Language Models to Lightweight Networks

Jinseong Jang, Chunfei Ma, Byeongwon Lee

CVPR 2025 • poster

VladVA: Discriminative Fine-tuning of LVLMs

Yassine Ouali, Adrian Bulat, Alexandros Xenos et al.

CVPR 2025 • poster • arXiv:2412.04378 • 11 citations

VLMs-Guided Representation Distillation for Efficient Vision-Based Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2025 • poster

VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis

Enric Corona, Andrei Zanfir, Eduard Gabriel Bazavan et al.

CVPR 2025 • poster • arXiv:2403.08764 • 46 citations

VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary

Kevin Qinghong Lin, Mike Zheng Shou

CVPR 2025 • poster • arXiv:2503.09402

VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Lei Li, Wei Yuancheng, Zhihui Xie et al.

CVPR 2025 • highlight • arXiv:2411.17451

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Byung-Kwan Lee, Ryo Hachiuma, Yu-Chiang Frank Wang et al.

CVPR 2025 • poster • arXiv:2412.01822

VoCo-LLaMA: Towards Vision Compression with Large Language Models

Xubing Ye, Yukang Gan, Xiaoke Huang et al.

CVPR 2025 • poster • arXiv:2406.12275

VODiff: Controlling Object Visibility Order in Text-to-Image Generation

Dong Liang, Jinyuan Jia, Yuhao Liu et al.

CVPR 2025 • poster • 3 citations

VolFormer: Explore More Comprehensive Cube Interaction for Hyperspectral Image Restoration and Beyond

Dabing Yu, Zheng Gao

CVPR 2025 • poster

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

Zelin Li, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight • arXiv:2503.02261

Volumetrically Consistent 3D Gaussian Rasterization

Chinmay Talegaonkar, Yash Belhe, Ravi Ramamoorthi et al.

CVPR 2025 • highlight • arXiv:2412.03378

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes

Stefano Esposito, Anpei Chen, Christian Reiser et al.

CVPR 2025 • poster • arXiv:2409.02482

VoteFlow: Enforcing Local Rigidity in Self-Supervised Scene Flow

Yancong Lin, Shiming Wang, Liangliang Nan et al.

CVPR 2025 • poster • arXiv:2503.22328 • 4 citations

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025 • poster • arXiv:2506.05563

VSNet: Focusing on the Linguistic Characteristics of Sign Language

Yuhao Li, Xinyue Chen, Hongkai Li et al.

CVPR 2025 • poster • 1 citation

V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.

CVPR 2025 • poster • arXiv:2503.12077 • 5 citations

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction

Zijian He, Yuwei Ning, Yipeng Qin et al.

CVPR 2025 • poster • arXiv:2503.12165 • 10 citations

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

Yujie Liang, Xiaobin Hu, Boyuan Jiang et al.

CVPR 2025 • poster • arXiv:2408.12340 • 10 citations

Watermarking One for All: A Robust Watermarking Scheme Against Partial Image Theft

Gaozhi Liu, Silu Cao, Zhenxing Qian et al.

CVPR 2025 • poster • 3 citations

Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation

Hao Li, Ju Dai, Xin Zhao et al.

CVPR 2025 • poster • arXiv:2505.23290 • 3 citations

Wavelet and Prototype Augmented Query-based Transformer for Pixel-level Surface Defect Detection

Feng Yan, Xiaoheng Jiang, Yang Lu et al.

CVPR 2025 • poster • 3 citations

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

Fu Feng, Yucheng Xie, Jing Wang et al.

CVPR 2025 • poster • arXiv:2406.17503

Weakly Supervised Contrastive Adversarial Training for Learning Robust Features from Semi-supervised Data

Lilin Zhang, Chengpei Wu, Ning Yang

CVPR 2025 • poster • arXiv:2503.11032

Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion

Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.

CVPR 2025 • poster

Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models

Quan Zhang, Jinwei Fang, Rui Yuan et al.

CVPR 2025 • poster • arXiv:2411.08466 • 6 citations

WeakMCN: Multi-task Collaborative Network for Weakly Supervised Referring Expression Comprehension and Segmentation

Silin Cheng, Yang Liu, Xinwei He et al.

CVPR 2025 • poster • arXiv:2505.18686 • 3 citations

WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion

Yang Wu, Yun Zhu, Kaihua Zhang et al.

CVPR 2025 • poster • arXiv:2504.13561

WeGen: A Unified Model for Interactive Multimodal Generation as We Chat

Zhipeng Huang, Shaobin Zhuang, Canmiao Fu et al.

CVPR 2025 • poster • arXiv:2503.01115

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

Zongjian Li, Bin Lin, Yang Ye et al.

CVPR 2025 • poster • arXiv:2411.17459

What Makes a Good Dataset for Knowledge Distillation?

Logan Frank, Jim Davis

CVPR 2025 • poster • arXiv:2411.12817 • 3 citations

What’s in the Image? A Deep-Dive into the Vision of Vision Language Models

Omri Kaduri, Shai Bagon, Tali Dekel

CVPR 2025 • poster • arXiv:2411.17491

When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach

Vaibhav Rathore, Shubhranil B, Saikat Dutta et al.

CVPR 2025 • poster • arXiv:2503.14897

When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning

Yang Liu, Qianqian Xu, Peisong Wen et al.

CVPR 2025 • poster • arXiv:2503.15096 • 11 citations

Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content

Haoyue Bai, Yiyou Sun, Wei Cheng et al.

CVPR 2025 • poster • arXiv:2505.01008

Where the Devil Hides: Deepfake Detectors Can No Longer Be Trusted

Shuaiwei Yuan, Junyu Dong, Yuezun Li

CVPR 2025 • poster • arXiv:2505.08255 • 2 citations

Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos

Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.

CVPR 2025 • highlight • arXiv:2411.08753 • 3 citations

WildAvatar: Learning In-the-wild 3D Avatars from the Web

Zihao Huang, Shoukang Hu, Guangcong Wang et al.

CVPR 2025 • poster • arXiv:2407.02165 • 1 citation

WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments

Jianhao Zheng, Zihan Zhu, Valentin Bieri et al.

CVPR 2025 • poster • arXiv:2504.03886 • 29 citations

WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild

Rolandos Alexandros Potamias, Jinglei Zhang, Jiankang Deng et al.

CVPR 2025 • poster • arXiv:2409.12259 • 53 citations

WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression

Yu Mao, Jun Wang, Nan Guan et al.

CVPR 2025 • poster • arXiv:2503.18074 • 4 citations

WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2025 • highlight

WISNet: Pseudo Label Generation on Unbalanced and Patch Annotated Waste Images

Shifan Zhang, Hongzi Zhu, Yinan He et al.

CVPR 2025 • poster • 1 citation

Wonderland: Navigating 3D Scenes from a Single Image

Hanwen Liang, Junli Cao, Vidit Goel et al.

CVPR 2025 • poster • arXiv:2412.12091 • 54 citations

WonderWorld: Interactive 3D Scene Generation from a Single Image

Hong-Xing Yu, Haoyi Duan, Charles Herrmann et al.

CVPR 2025 • highlight • arXiv:2406.09394 • 120 citations