Highlight Papers

975 papers found • Page 13 of 20

Unified Multimodal Understanding via Byte-Pair Visual Encoding

Wanpeng Zhang, Yicheng Feng, Hao Luo et al.

ICCV 2025highlight
7
citations

Unified Reconstruction of Static and Dynamic Scenes from Events

Qiyao Gao, Peiqi Duan, Hanyue Lou et al.

CVPR 2025highlight

UniPhys: Unified Planner and Controller with Diffusion for Flexible Physics-Based Character Control

Yan Wu, Korrawe Karunratanakul, Zhengyi Luo et al.

ICCV 2025highlight

UniPose: A Unified Multimodal Framework for Human Pose Comprehension, Generation and Editing

Yiheng Li, RuiBing Hou, Hong Chang et al.

CVPR 2025highlight
14
citations

UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

Xi Chen, Zhifei Zhang, He Zhang et al.

CVPR 2025highlight
70
citations

UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

CVPR 2025highlightarXiv:2501.13134
16
citations

Universal Scene Graph Generation

Shengqiong Wu, Hao Fei, Tat-seng Chua

CVPR 2025highlight
3
citations

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Bolin Lai, Felix Juefei-Xu, Miao Liu et al.

CVPR 2025highlight

Unleashing Vecset Diffusion Model for Fast Shape Generation

Zeqiang Lai, Zhao Yunfei, Zibo Zhao et al.

ICCV 2025highlightarXiv:2503.16302
14
citations

Unlocking Generalization Power in LiDAR Point Cloud Registration

Zhenxuan Zeng, Qiao Wu, Xiyu Zhang et al.

CVPR 2025highlight

UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI

Fangwei Zhong, Kui Wu, Churan Wang et al.

ICCV 2025highlight

Unsupervised Continual Domain Shift Learning with Multi-Prototype Modeling

Haopeng Sun, Yingwei Zhang, Lumin Xu et al.

CVPR 2025highlight
2
citations

Unsupervised Joint Learning of Optical Flow and Intensity with Event Cameras

Shuang Guo, Friedhelm Hamann, Guillermo Gallego

ICCV 2025highlight

Unveiling Differences in Generative Models: A Scalable Differential Clustering Approach

Jingwei Zhang, Mohammad Jalali, Cheuk Ting Li et al.

CVPR 2025highlight
8
citations

UnZipLoRA: Separating Content and Style from a Single Image

Chang Liu, Viraj Shah, Aiyu Cui et al.

ICCV 2025highlight

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

Kang Chen, Jiyuan Zhang, Zecheng Hao et al.

CVPR 2025highlight
3
citations

v-CLR: View-Consistent Learning for Open-World Instance Segmentation

Chang-Bin Zhang, Jinhong Ni, Yujie Zhong et al.

CVPR 2025highlightarXiv:2504.01383
2
citations

VEU-Bench: Towards Comprehensive Understanding of Video Editing

Bozheng Li, Yongliang Wu, YI LU et al.

CVPR 2025highlight
1
citations

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos

Sili Chen, Hengkai Guo, Shengnan Zhu et al.

CVPR 2025highlight

Video Individual Counting for Moving Drones

Yaowu Fan, Jia Wan, Tao Han et al.

ICCV 2025highlight
3
citations

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Chaoyou Fu, Yuhan Dai, Yongdong Luo et al.

CVPR 2025highlight
858
citations

Video Motion Graphs

Haiyang Liu, Zhan Xu, Fating Hong et al.

ICCV 2025highlightarXiv:2503.20218
4
citations

VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Hanyang Wang, Fangfu Liu, Jiawei Chi et al.

CVPR 2025highlight
11
citations

VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge

Vishwesh Nath, Wenqi Li, Dong Yang et al.

CVPR 2025highlight
29
citations

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Boyang Deng, Kyle Genova, Songyou Peng et al.

ICCV 2025highlight

Visual Representation Learning through Causal Intervention for Controllable Image Editing

Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.

CVPR 2025highlight

Visual Test-time Scaling for GUI Agent Grounding

Tiange Luo, Lajanugen Logeswaran, Justin Johnson et al.

ICCV 2025highlight
10
citations

VL-RewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models

Lei Li, wei yuancheng, Zhihui Xie et al.

CVPR 2025highlight

VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory

Runjia Li, Philip Torr, Andrea Vedaldi et al.

ICCV 2025highlight
20
citations

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

ZELIN LI, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025highlight

Volumetrically Consistent 3D Gaussian Rasterization

Chinmay Talegaonkar, Yash Belhe, Ravi Ramamoorthi et al.

CVPR 2025highlight

VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions

Marko Mihajlovic, Siwei Zhang, Gen Li et al.

ICCV 2025highlightarXiv:2506.23236
3
citations

VRM: Knowledge Distillation via Virtual Relation Matching

Weijia Zhang, Fei Xie, Weidong Cai et al.

ICCV 2025highlight

Wasserstein Style Distribution Analysis and Transform for Stylized Image Generation

Xi Yu, Xiang Gu, Zhihao Shi et al.

ICCV 2025highlight
1
citations

WeaveSeg: Iterative Contrast-weaving and Spectral Feature-refining for Nuclei Instance Segmentation

Jiajia Li, Huisi Wu, Jing Qin

ICCV 2025highlight

What to Distill? Fast Knowledge Distillation with Adaptive Sampling

Byungchul Chae, Seonyeong Heo

ICCV 2025highlight

When Confidence Fails: Revisiting Pseudo-Label Selection in Semi-supervised Semantic Segmentation

Pan Liu, Jinshi Liu

ICCV 2025highlightarXiv:2509.16704
1
citations

Where, What, Why: Towards Explainable Driver Attention Prediction

Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao et al.

ICCV 2025highlight
6
citations

Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos

Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.

CVPR 2025highlightarXiv:2411.08753
3
citations

WINS: Winograd Structured Pruning for Fast Winograd Convolution

Cheonjun Park, Hyunjae Oh, Mincheol Park et al.

ICCV 2025highlight

WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2025highlight

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

Zizhang Li, Hong-Xing Yu, Wei Liu et al.

ICCV 2025highlight

WonderWorld: Interactive 3D Scene Generation from a Single Image

Hong-Xing Yu, Haoyi Duan, Charles Herrmann et al.

CVPR 2025highlight
120
citations

World-consistent Video Diffusion with Explicit 3D Modeling

Qihang Zhang, Shuangfei Zhai, Miguel Ángel Bautista et al.

CVPR 2025highlight

X-Dancer: Expressive Music to Human Dance Video Generation

Zeyuan Chen, Hongyi Xu, Guoxian Song et al.

ICCV 2025highlight
9
citations

X-Dyna: Expressive Dynamic Human Image Animation

Di Chang, Hongyi Xu, You Xie et al.

CVPR 2025highlightarXiv:2501.10021
14
citations

XLRS-Bench: Could Your Multimodal LLMs Understand Extremely Large Ultra-High-Resolution Remote Sensing Imagery?

Fengxiang Wang, hongzhen wang, Zonghao Guo et al.

CVPR 2025highlight
24
citations

Your Large Vision-Language Model Only Needs A Few Attention Heads For Visual Grounding

seil kang, Jinyeong Kim, Junhyeok Kim et al.

CVPR 2025highlightarXiv:2503.06287
31
citations

Your ViT is Secretly an Image Segmentation Model

Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans et al.

CVPR 2025highlight
24
citations

You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale

Baorui Ma, Huachen Gao, Haoge Deng et al.

CVPR 2025highlightarXiv:2412.06699
49
citations