CVPR 2025 Highlight Papers

388 papers found • Page 1 of 8

3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes

Jan Held, Renaud Vandeghen, Abdullah J Hamdi et al.

CVPR 2025highlightarXiv:2411.14974

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

Zhaoxi Chen, Jiaxiang Tang, Yuhao Dong et al.

CVPR 2025highlightarXiv:2409.12957

4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion

Chaoyang Wang, Peiye Zhuang, Tuan Duc Ngo et al.

CVPR 2025highlightarXiv:2412.04462
18
citations

Active Hyperspectral Imaging Using an Event Camera

Bohan Yu, Jinxiu Liang, Zhuofeng Wang et al.

CVPR 2025highlight

AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction

Yuanbin Man, Ying Huang, Chengming Zhang et al.

CVPR 2025highlightarXiv:2411.12593

Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging

Xianrui Li, Yufei Cui, Jun Li et al.

CVPR 2025highlightarXiv:2505.10649

AIpparel: A Multimodal Foundation Model for Digital Garments

Kiyohiro Nakayama, Jan Ackermann, Timur Levent Kesdogan et al.

CVPR 2025highlightarXiv:2412.03937
5
citations

ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency

Dong Wei, Xiaoning Sun, Xizhan Gao et al.

CVPR 2025highlight

Align3R: Aligned Monocular Depth Estimation for Dynamic Videos

Edward LOO, Tianyu HUANG, Peng Li et al.

CVPR 2025highlightarXiv:2412.03079
57
citations

All-directional Disparity Estimation for Real-world QPD Images

Hongtao Yu, Shaohui Song, Lihu Sun et al.

CVPR 2025highlight

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

Ashmal Vayani, Dinura Dissanayake, Hasindri Watawana et al.

CVPR 2025highlightarXiv:2411.16508
42
citations

All-Optical Nonlinear Diffractive Deep Network for Ultrafast Image Denoising

Xiaoling Zhou, Zhemg Lee, Wei Ye et al.

CVPR 2025highlight

Annotation Ambiguity Aware Semi-Supervised Medical Image Segmentation

Suruchi Kumari, Pravendra Singh

CVPR 2025highlight
3
citations

AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities

Guillaume Astruc, Nicolas Gonthier, Clement Mallet et al.

CVPR 2025highlightarXiv:2412.14123
34
citations

A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition

Duosheng Chen, Shihao Zhou, Jinshan Pan et al.

CVPR 2025highlight
5
citations

ArcPro: Architectural Programs for Structured 3D Abstraction of Sparse Points

Qirui Huang, Runze Zhang, Kangjun Liu et al.

CVPR 2025highlightarXiv:2503.02745
3
citations

ARM: Appearance Reconstruction Model for Relightable 3D Generation

Xiang Feng, Chang Yu, Zoubin Bi et al.

CVPR 2025highlightarXiv:2411.10825
7
citations

Assessing and Learning Alignment of Unimodal Vision and Language Models

Le Zhang, Qian Yang, Aishwarya Agrawal

CVPR 2025highlightarXiv:2412.04616
14
citations

Augmented Deep Contexts for Spatially Embedded Video Coding

Yifan Bian, Chuanbo Tang, Li Li et al.

CVPR 2025highlightarXiv:2505.05309
6
citations

A Unified Approach to Interpreting Self-supervised Pre-training Methods for 3D Point Clouds via Interactions

Qiang Li, Jian Ruan, Fanghao Wu et al.

CVPR 2025highlight

BADGR: Bundle Adjustment Diffusion Conditioned by Gradients for Wide-Baseline Floor Plan Reconstruction

Yuguang Li, Ivaylo Boyadzhiev, Zixuan Liu et al.

CVPR 2025highlightarXiv:2503.19340

Balanced Rate-Distortion Optimization in Learned Image Compression

Yichi Zhang, Zhihao Duan, Yuning Huang et al.

CVPR 2025highlightarXiv:2502.20161
3
citations

BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance

Xin Ye, Burhan Yaman, Sheng Cheng et al.

CVPR 2025highlightarXiv:2502.19694

BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing

Yunqi Gu, Ian Huang, Jihyeon Je et al.

CVPR 2025highlightarXiv:2504.01786

Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB

Nikhil Behari, Aaron Young, Siddharth Somasundaram et al.

CVPR 2025highlightarXiv:2411.19474

Boost Your Human Image Generation Model via Direct Preference Optimization

Sanghyeon Na, Yonggyu Kim, Hyunjoon Lee

CVPR 2025highlightarXiv:2405.20216
8
citations

Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy

Zesen Cheng, Hang Zhang, Kehan Li et al.

CVPR 2025highlight

BWFormer: Building Wireframe Reconstruction from Airborne LiDAR Point Cloud with Transformer

Yuzhou Liu, Lingjie Zhu, Hanqiao Ye et al.

CVPR 2025highlight
3
citations

CADDreamer: CAD Object Generation from Single-view Images

Yuan Li, Cheng Lin, Yuan Liu et al.

CVPR 2025highlightarXiv:2502.20732
10
citations

Can Generative Video Models Help Pose Estimation?

Ruojin Cai, Jason Y. Zhang, Philipp Henzler et al.

CVPR 2025highlightarXiv:2412.16155

Can Machines Understand Composition? Dataset and Benchmark for Photographic Image Composition Embedding and Understanding

Zhaoran Zhao, Peng Lu, Anran Zhang et al.

CVPR 2025highlight

CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image

Jingshun Huang, Haitao Lin, Tianyu Wang et al.

CVPR 2025highlightarXiv:2504.11230

CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction

Yuan Zhou, Qingshan Xu, Jiequan Cui et al.

CVPR 2025highlightarXiv:2411.16170

CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design

Weitao Feng, Hang Zhou, Jing Liao et al.

CVPR 2025highlightarXiv:2504.19478
4
citations

CASP: Compression of Large Multimodal Models Based on Attention Sparsity

Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.

CVPR 2025highlightarXiv:2503.05936
2
citations

CCIN: Compositional Conflict Identification and Neutralization for Composed Image Retrieval

Likai Tian, Jian Zhao, Zechao Hu et al.

CVPR 2025highlight
10
citations

CH3Depth: Efficient and Flexible Depth Foundation Model with Flow Matching

Jiaqi Li, Yiran Wang, Jinghong Zheng et al.

CVPR 2025highlight

Change3D: Revisiting Change Detection and Captioning from A Video Modeling Perspective

Duowang Zhu, Xiaohu Huang, Haiyan Huang et al.

CVPR 2025highlightarXiv:2503.18803
13
citations

CheckManual: A New Challenge and Benchmark for Manual-based Appliance Manipulation

Yuxing Long, Jiyao Zhang, Mingjie Pan et al.

CVPR 2025highlightarXiv:2506.09343
2
citations

Circumventing Shortcuts in Audio-visual Deepfake Detection Datasets with Unsupervised Learning

Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata et al.

CVPR 2025highlightarXiv:2412.00175
9
citations

ClimbingCap: Multi-Modal Dataset and Method for Rock Climbing in World Coordinate

Ming Yan, Xincheng Lin, Yuhua Luo et al.

CVPR 2025highlightarXiv:2503.21268

CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering

Tianyu Huai, Jie Zhou, Xingjiao Wu et al.

CVPR 2025highlightarXiv:2503.00413
10
citations

Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models

Zichen Miao, WEI CHEN, Qiang Qiu

CVPR 2025highlightarXiv:2503.18337

CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation

Wei Chen, Lin Li, Yongqi Yang et al.

CVPR 2025highlightarXiv:2406.10462
12
citations

Compositional Caching for Training-free Open-vocabulary Attribute Detection

Marco Garosi, Alessandro Conti, Gaowen Liu et al.

CVPR 2025highlightarXiv:2503.19145
3
citations

Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers

Jung-Ho Hong, Ho-Joong Kim, Kyu-Sung Jeon et al.

CVPR 2025highlightarXiv:2507.04388

Context-Aware Multimodal Pretraining

Karsten Roth, Zeynep Akata, Dima Damen et al.

CVPR 2025highlightarXiv:2411.15099
4
citations

CoSER: Towards Consistent Dense Multiview Text-to-Image Generator for 3D Creation

Bonan Li, Zicheng Zhang, Xingyi Yang et al.

CVPR 2025highlight
1
citations

COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts

Jiansheng Li, Xingxuan Zhang, Hao Zou et al.

CVPR 2025highlightarXiv:2504.10158
1
citations

Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting

Hanxi Liu, Yifang Men, Zhouhui Lian

CVPR 2025highlightarXiv:2504.20403
1
citations
Previous
123...8
Next