2025 Papers
VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?
Xize Cheng, Ruofan Hu, Xiaoda Yang et al.
VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data
Jian Shi, Peter Wonka
VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction
Ziyue Zhu, Shenlong Wang, Jin Xie et al.
Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo-Labeling
Haoran Li, Xingjian Li, Jiahua Shi et al.
Voyaging into Perpetual Dynamic Scenes from a Single View
Fengrui Tian, Tianjiao Ding, Jinqi Luo et al.
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.
VPO: Reasoning Preferences Optimization Based on $\mathcal{V}$-Usable Information
Zecheng Wang, Chunshan Li, Yupeng Zhang et al.
VPR-Cloak: A First Look at Privacy Cloak Against Visual Place Recognition
Shuting Dong, Mingzhi Chen, Feng Lu et al.
VProChart: Answering Chart Question Through Visual Perception Alignment Agent and Programmatic Solution Reasoning
Muye Huang, Lingling Zhang, Han Lai et al.
VQ4DiT: Efficient Post-Training Vector Quantization for Diffusion Transformers
Juncan Deng, Shuaiting Li, Zeyu Wang et al.
VQA4CIR: Boosting Composed Image Retrieval with Visual Question Answering
Chun-Mei Feng, Yang Bai, Tao Luo et al.
VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation
Sicheng Yang, Zhaohu Xing, Lei Zhu
VQ-SGen: A Vector Quantized Stroke Representation for Creative Sketch Generation
Jiawei Wang, Zhiming Cui, Changjian Li
VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization
Tao Liu, Ziyang Ma, Qi Chen et al.
VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
Haichao Zhang, Yun Fu
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
Yating Wang, Haoyi Zhu, Mingyu Liu et al.
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
Qiuchen Wang, Ruixue Ding, Yu Zeng et al.
VR as a "Drop-In" Well-being Tool for Knowledge Workers
Sophia Ppali, Haris Psallidopoulos, Marios Constantinides et al.
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
Jiashuo Yu, Yue Wu, Meng Chu et al.
VR-Drive: Viewpoint-Robust End-to-End Driving with Feed-Forward 3D Gaussian Splatting
Hoonhee Cho, Jae-Young Kang, Giwon Lee et al.
VRM: Knowledge Distillation via Virtual Relation Matching
Weijia Zhang, Fei Xie, Weidong Cai et al.
VR Onboarding Procedures for Multiple Collocated Users: See-Through Tutorials and Group Transitions
Ephraim Schott, Tony Jan Zoeppig, Pramoch Viriyathomrongul et al.
VRtalk: Real-time Interactive Intelligent Anime Avatars in Virtual Reality
Yuan Yu, Chunlei Xu, Shirao Yang et al.
VRTennis: Forehand Training in Virtual Reality with Rule-Based Motion Analysis and Multimodal Feedback
Anna Sebernegg, Peter Kán
VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression
Qiang Hu, Houqiang Zhong, Zihan Zheng et al.
VSC: Visual Search Compositional Text-to-Image Diffusion Model
Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.
VSNet: Focusing on the Linguistic Characteristics of Sign Language
Yuhao Li, Xinyue Chen, Hongkai Li et al.
VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs
Qiucheng Wu, Handong Zhao, Michael Saxon et al.
VSRM: A Robust Mamba-Based Framework for Video Super-Resolution
Phu Tran Dinh, Hung Dao, Daeyoung Kim
VSSD: Vision Mamba with Non-Causal State Space Duality
Yuheng Shi, Mingjia Li, Minjing Dong et al.
VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis
Yumeng Li, William H Beluch, Margret Keuper et al.
V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents
Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.
VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
Qingtao Liu, Yu Cui, Zhengnan Sun et al.
VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning
Wenhao Li, Qiangchang Wang, Xianjing Meng et al.
VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians
Pengchong Hu, Zhizhong Han
VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
Yongxin Guo, Jingyu Liu, Mingda Li et al.
VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning
Jinglei Zhang, Yuanfan Guo, Rolandos Alexandros Potamias et al.
VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction
Zijian He, Yuwei Ning, Yipeng Qin et al.
VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding
Yujie Liang, Xiaobin Hu, Boyuan Jiang et al.
VTON-VLLM: Aligning Virtual Try-On Models with Human Preferences
Siqi Wan, Jingwen Chen, Qi Cai et al.
Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning
Liang CHEN, Xueting Han, Li Shen et al.
Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection
Dat NGUYEN, Marcella Astrid, Anis Kacem et al.
Vulnerable Data-Aware Adversarial Training
Yuqi Feng, Jiahao Fan, Yanan Sun
VVC-Gym: A Fixed-Wing UAV Reinforcement Learning Environment for Multi-Goal Long-Horizon Problems
Xudong Gong, Feng Dawei, Kele Xu et al.
VVRec: Reconstruction Attacks on DL-based Volumetric Video Upstreaming via Latent Diffusion Model with Gamma Distribution
Rui Lu, Bihai Zhang, Dan Wang
Wait-Less Offline Tuning and Re-solving for Online Decision Making
Jingruo Sun, Wenzhi Gao, Ellen Vitercik et al.
Walking the Schrödinger Bridge: A Direct Trajectory for Text-to-3D Generation
Ziying Li, Xuequan Lu, Xinkui Zhao et al.
Walking the Tightrope: Autonomous Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning
Xiaoyu Yang, Jie Lu, En Yu
Walking the Web of Concept-Class Relationships in Incrementally Trained Interpretable Models
Susmit Agrawal, Deepika Vemuri, Sri Siddarth Chakaravarthy P et al.
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
Katie Matton, Robert Ness, John Guttag et al.