CVPR Highlight Papers

712 papers found • Page 8 of 15

Tuning the Frequencies: Robust Training for Sinusoidal Neural Networks

Tiago Novello, Diana Aldana Moreno, André Araujo et al.

CVPR 2025 • highlight • arXiv:2407.21121 • 10 citations

Type-R: Automatically Retouching Typos for Text-to-Image Generation

Wataru Shimoda, Naoto Inoue, Daichi Haraguchi et al.

CVPR 2025 • highlight • 1 citation

UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning

Weiqi Yan, Lvhai Chen, Huaijia Kou et al.

CVPR 2025 • highlight • 2 citations

UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models

Yuning Han, Bingyin Zhao, Rui Chu et al.

CVPR 2025 • highlight • arXiv:2412.11441 • 5 citations

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Zixuan Chen, Yujin Wang, Xin Cai et al.

CVPR 2025 • highlight • 8 citations

UMotion: Uncertainty-driven Human Motion Estimation from Inertial and Ultra-wideband Units

Huakun Liu, Hiroki Ota, Xin Wei et al.

CVPR 2025 • highlight • arXiv:2505.09393 • 1 citation

Understanding Multi-layered Transmission Matrices

Marina Alterman, Anat Levin

CVPR 2025 • highlight • 1 citation

Understanding Multi-Task Activities from Single-Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2025 • highlight • 4 citations

Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video

David Yifan Yao, Albert J. Zhai, Shenlong Wang

CVPR 2025 • highlight • arXiv:2503.21761

Unified Reconstruction of Static and Dynamic Scenes from Events

Qiyao Gao, Peiqi Duan, Hanyue Lou et al.

CVPR 2025 • highlight

UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing

Yiheng Li, RuiBing Hou, Hong Chang et al.

CVPR 2025 • highlight • arXiv:2411.16781 • 14 citations

UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

Xi Chen, Zhifei Zhang, He Zhang et al.

CVPR 2025 • highlight • 70 citations

UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

CVPR 2025 • highlight • arXiv:2501.13134 • 16 citations

Universal Scene Graph Generation

Shengqiong Wu, Hao Fei, Tat-seng Chua

CVPR 2025 • highlight • arXiv:2503.15005 • 3 citations

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Bolin Lai, Felix Juefei-Xu, Miao Liu et al.

CVPR 2025 • highlight • arXiv:2412.01027

Unlocking Generalization Power in LiDAR Point Cloud Registration

Zhenxuan Zeng, Qiao Wu, Xiyu Zhang et al.

CVPR 2025 • highlight • arXiv:2503.10149

Unsupervised Continual Domain Shift Learning with Multi-Prototype Modeling

Haopeng Sun, Yingwei Zhang, Lumin Xu et al.

CVPR 2025 • highlight • 2 citations

Unveiling Differences in Generative Models: A Scalable Differential Clustering Approach

Jingwei Zhang, Mohammad Jalali, Cheuk Ting Li et al.

CVPR 2025 • highlight • 8 citations

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

Kang Chen, Jiyuan Zhang, Zecheng Hao et al.

CVPR 2025 • highlight • arXiv:2411.10504 • 3 citations

v-CLR: View-Consistent Learning for Open-World Instance Segmentation

Chang-Bin Zhang, Jinhong Ni, Yujie Zhong et al.

CVPR 2025 • highlight • arXiv:2504.01383 • 2 citations

VEU-Bench: Towards Comprehensive Understanding of Video Editing

Bozheng Li, Yongliang Wu, Yi Lu et al.

CVPR 2025 • highlight • 1 citation

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Sili Chen, Hengkai Guo, Shengnan Zhu et al.

CVPR 2025 • highlight

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Chaoyou Fu, Yuhan Dai, Yongdong Luo et al.

CVPR 2025 • highlight • arXiv:2405.21075 • 858 citations

VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Hanyang Wang, Fangfu Liu, Jiawei Chi et al.

CVPR 2025 • highlight • arXiv:2504.01956 • 11 citations

VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge

Vishwesh Nath, Wenqi Li, Dong Yang et al.

CVPR 2025 • highlight • 29 citations

Visual Representation Learning through Causal Intervention for Controllable Image Editing

Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.

CVPR 2025 • highlight

VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Lei Li, Wei Yuancheng, Zhihui Xie et al.

CVPR 2025 • highlight

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

Zelin Li, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight

Volumetrically Consistent 3D Gaussian Rasterization

Chinmay Talegaonkar, Yash Belhe, Ravi Ramamoorthi et al.

CVPR 2025 • highlight

Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos

Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.

CVPR 2025 • highlight • arXiv:2411.08753 • 3 citations

WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2025 • highlight

WonderWorld: Interactive 3D Scene Generation from a Single Image

Hong-Xing Yu, Haoyi Duan, Charles Herrmann et al.

CVPR 2025 • highlight • 120 citations

World-consistent Video Diffusion with Explicit 3D Modeling

Qihang Zhang, Shuangfei Zhai, Miguel Ángel Bautista et al.

CVPR 2025 • highlight

X-Dyna: Expressive Dynamic Human Image Animation

Di Chang, Hongyi Xu, You Xie et al.

CVPR 2025 • highlight • arXiv:2501.10021 • 14 citations

XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?

Fengxiang Wang, Hongzhen Wang, Zonghao Guo et al.

CVPR 2025 • highlight • arXiv:2503.23771 • 24 citations

Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding

Seil Kang, Jinyeong Kim, Junhyeok Kim et al.

CVPR 2025 • highlight • arXiv:2503.06287 • 31 citations

Your ViT is Secretly an Image Segmentation Model

Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans et al.

CVPR 2025 • highlight • arXiv:2503.19108 • 24 citations

You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale

Baorui Ma, Huachen Gao, Haoge Deng et al.

CVPR 2025 • highlight • arXiv:2412.06699 • 49 citations

3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation

Zidu Wang, Xiangyu Zhu, Tianshuo Zhang et al.

CVPR 2024 • highlight • arXiv:2312.00311

3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos

Jiakai Sun, Han Jiao, Guangyuan Li et al.

CVPR 2024 • highlight

3D Human Pose Perception from Egocentric Stereo Videos

Hiroyasu Akada, Jian Wang, Vladislav Golyanik et al.

CVPR 2024 • highlight

3DInAction: Understanding Human Actions in 3D Point Clouds

Yizhak Ben-Shabat, Oren Shrout, Stephen Gould

CVPR 2024 • highlight

4D-DRESS: A 4D Dataset of Real-World Human Clothing With Semantic Annotations

Wenbo Wang, Hsuan-I Ho, Chen Guo et al.

CVPR 2024 • highlight • 43 citations

Abductive Ego-View Accident Video Understanding for Safe Driving Perception

Jianwu Fang, Lei-lei Li, Junfei Zhou et al.

CVPR 2024 • highlight

Absolute Pose from One or Two Scaled and Oriented Features

Jonathan Ventura, Zuzana Kukelova, Torsten Sattler et al.

CVPR 2024 • highlight

Accept the Modality Gap: An Exploration in the Hyperbolic Space

Sameera Ramasinghe, Violetta Shevchenko, Gil Avraham et al.

CVPR 2024 • highlight

Active Domain Adaptation with False Negative Prediction for Object Detection

Yuzuru Nakamura, Yasunori Ishii, Takayoshi Yamashita

CVPR 2024 • highlight

Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images

Chaoqin Huang, Aofan Jiang, Jinghao Feng et al.

CVPR 2024 • highlight

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

Tao Tang, Guangrun Wang, Yixing Lao et al.

CVPR 2024 • highlight • arXiv:2402.17483 • 20 citations

Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models

Huan Ling, Seung Wook Kim, Antonio Torralba et al.

CVPR 2024 • highlight