2025 Papers

VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?

Xize Cheng, Ruofan Hu, Xiaoda Yang et al.

ICLR 2025 • poster

VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data

Jian Shi, Peter Wonka

ICCV 2025 • poster

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025 • poster • arXiv:2506.05563

Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling

Haoran Li, Xingjian Li, Jiahua Shi et al.

AAAI 2025 • paper • arXiv:2406.18610

Voyaging into Perpetual Dynamic Scenes from a Single View

Fengrui Tian, Tianjiao Ding, Jinqi Luo et al.

ICCV 2025 • poster • arXiv:2507.04183

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.

ICCV 2025 • poster • arXiv:2503.20491 • 13 citations

VPO: Reasoning Preferences Optimization Based on $\mathcal{V}$-Usable Information

Zecheng Wang, Chunshan Li, Yupeng Zhang et al.

NeurIPS 2025 • spotlight

VPR-Cloak: A First Look at Privacy Cloak Against Visual Place Recognition

Shuting Dong, Mingzhi Chen, Feng Lu et al.

ICCV 2025 • poster

VProChart: Answering Chart Question Through Visual Perception Alignment Agent and Programmatic Solution Reasoning

Muye Huang, Lingling Zhang, Han Lai et al.

AAAI 2025 • paper • arXiv:2409.01667 • 5 citations

VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers

Juncan Deng, Shuaiting Li, Zeyu Wang et al.

AAAI 2025 • paper • arXiv:2408.17131 • 11 citations

VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering

Chun-Mei Feng, Yang Bai, Tao Luo et al.

AAAI 2025 • paper • arXiv:2312.12273

VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation

Sicheng Yang, Zhaohu Xing, Lei Zhu

NeurIPS 2025 • poster • arXiv:2601.10124

VQ-SGen: A Vector Quantized Stroke Representation for Creative Sketch Generation

Jiawei Wang, Zhiming Cui, Changjian Li

ICCV 2025 • poster • arXiv:2411.16446 • 1 citation

VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization

Tao Liu, Ziyang Ma, Qi Chen et al.

AAAI 2025 • paper • arXiv:2412.09892

VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models

Haichao Zhang, Yun Fu

NeurIPS 2025 • oral • arXiv:2503.16980 • 3 citations

VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers

Yating Wang, Haoyi Zhu, Mingyu Liu et al.

ICCV 2025 • poster • arXiv:2507.01016 • 16 citations

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

Qiuchen Wang, Ruixue Ding, Yu Zeng et al.

NeurIPS 2025 • poster

VR as a "Drop-In" Well-being Tool for Knowledge Workers

Sophia Ppali, Haris Psallidopoulos, Marios Constantinides et al.

ISMAR 2025 • paper • arXiv:2510.02836

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Jiashuo Yu, Yue Wu, Meng Chu et al.

ICCV 2025 • poster • arXiv:2506.10857 • 8 citations

VR-Drive: Viewpoint-Robust End-to-End Driving with Feed-Forward 3D Gaussian Splatting

Hoonhee Cho, Jae-Young Kang, Giwon Lee et al.

NeurIPS 2025 • oral • arXiv:2510.23205

VRM: Knowledge Distillation via Virtual Relation Matching

Weijia Zhang, Fei Xie, Weidong Cai et al.

ICCV 2025 • highlight • arXiv:2502.20760

VR Onboarding Procedures for Multiple Collocated Users: See-Through Tutorials and Group Transitions

Ephraim Schott, Tony Jan Zoeppig, Pramoch Viriyathomrongul et al.

ISMAR 2025 • paper

VRtalk: Real-time Interactive Intelligent Anime Avatars in Virtual Reality

Yuan Yu, Chunlei Xu, Shirao Yang et al.

ISMAR 2025 • paper

VRTennis: Forehand Training in Virtual Reality with Rule-Based Motion Analysis and Multimodal Feedback

Anna Sebernegg, Peter Kán

ISMAR 2025 • paper

VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression

Qiang Hu, Houqiang Zhong, Zihan Zheng et al.

AAAI 2025 • paper • arXiv:2412.11362

VSC: Visual Search Compositional Text-to-Image Diffusion Model

Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.

ICCV 2025 • poster • arXiv:2505.01104 • 2 citations

VSNet: Focusing on the Linguistic Characteristics of Sign Language

Yuhao Li, Xinyue Chen, Hongkai Li et al.

CVPR 2025 • poster • 1 citation

VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs

Qiucheng Wu, Handong Zhao, Michael Saxon et al.

ICCV 2025 • poster • 18 citations

VSRM: A Robust Mamba-Based Framework for Video Super-Resolution

Phu Tran Dinh, Hung Dao, Daeyoung Kim

ICCV 2025 • poster • arXiv:2506.22762 • 1 citation

VSSD: Vision Mamba with Non-Causal State Space Duality

Yuheng Shi, Mingjia Li, Minjing Dong et al.

ICCV 2025 • poster • arXiv:2407.18559 • 24 citations

VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis

Yumeng Li, William H Beluch, Margret Keuper et al.

ICLR 2025 • oral • arXiv:2403.13501 • 10 citations

V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.

CVPR 2025 • poster • arXiv:2503.12077 • 5 citations

VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning

Qingtao Liu, Yu Cui, Zhengnan Sun et al.

ICLR 2025 • poster • 11 citations

VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning

Wenhao Li, Qiangchang Wang, Xianjing Meng et al.

NeurIPS 2025 • poster • arXiv:2509.25033 • 2 citations

VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians

Pengchong Hu, Zhizhong Han

ICML 2025 • poster • arXiv:2506.02741

VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding

Yongxin Guo, Jingyu Liu, Mingda Li et al.

AAAI 2025 • paper • arXiv:2405.13382

VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning

Jinglei Zhang, Yuanfan Guo, Rolandos Alexandros Potamias et al.

ICCV 2025 • poster • arXiv:2510.14672

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction

Zijian He, Yuwei Ning, Yipeng Qin et al.

CVPR 2025 • poster • arXiv:2503.12165 • 10 citations

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

Yujie Liang, Xiaobin Hu, Boyuan Jiang et al.

CVPR 2025 • poster • arXiv:2408.12340 • 10 citations

VTON-VLLM: Aligning Virtual Try-On Models with Human Preferences

Siqi Wan, Jingwen Chen, Qi Cai et al.

NeurIPS 2025 • poster

Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning

Liang CHEN, Xueting Han, Li Shen et al.

ICML 2025 • poster • arXiv:2506.03850 • 6 citations

Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection

Dat NGUYEN, Marcella Astrid, Anis Kacem et al.

ICCV 2025 • poster • arXiv:2501.01184

Vulnerable Data-Aware Adversarial Training

Yuqi Feng, Jiahao Fan, Yanan Sun

NeurIPS 2025 • poster

VVC-Gym: A Fixed-Wing UAV Reinforcement Learning Environment for Multi-Goal Long-Horizon Problems

Xudong Gong, Feng Dawei, Kele Xu et al.

ICLR 2025 • oral

VVRec: Reconstruction Attacks on DL-based Volumetric Video Upstreaming via Latent Diffusion Model with Gamma Distribution

Rui Lu, Bihai Zhang, Dan Wang

AAAI 2025 • paper • arXiv:2502.17880

Wait-Less Offline Tuning and Re-solving for Online Decision Making

Jingruo Sun, Wenzhi Gao, Ellen Vitercik et al.

ICML 2025 • poster • arXiv:2412.09594

Walking the Schrödinger Bridge: A Direct Trajectory for Text-to-3D Generation

Ziying Li, Xuequan Lu, Xinkui Zhao et al.

NeurIPS 2025 • poster • arXiv:2511.05609 • 1 citation

Walking the Tightrope: Autonomous Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning

Xiaoyu Yang, Jie Lu, En Yu

NeurIPS 2025 • oral

Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models

Susmit Agrawal, Deepika Vemuri, Sri Siddarth Chakaravarthy P et al.

AAAI 2025 • paper • arXiv:2502.20393

Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations

Katie Matton, Robert Ness, John Guttag et al.

ICLR 2025 • poster • arXiv:2504.14150