2025 Poster Papers
VMDT: Decoding the Trustworthiness of Video Foundation Models
Yujin Potter, Zhun Wang, Nicholas Crispino et al.
Vocabulary-Guided Gait Recognition
Panjian Huang, Saihui Hou, Chunshui Cao et al.
Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding
Qian Ma, Ruoxiang Xu, Yongqiang Cai
VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions
Yash Garg, Saketh Bachu, Arindam Dutta et al.
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye, Yukang Gan, Xiaoke Huang et al.
VODiff: Controlling Object Visibility Order in Text-to-Image Generation
Dong Liang, Jinyuan Jia, Yuhao Liu et al.
VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models
Kim Sung-Bin, Jeongsoo Choi, Puyuan Peng et al.
VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning
Nilay Yilmaz, Maitreya Patel, Lawrence Luo et al.
VolFormer: Explore More Comprehensive Cube Interaction for Hyperspectral Image Restoration and Beyond
Dabing Yu, Zheng Gao
VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play
Zelai Xu, Ruize Zhang, Chao Yu et al.
VoluMe – Authentic 3D Video Calls from Live Gaussian Splat Prediction
Martin de La Gorce, Charlie Hewitt, Tibor Takács et al.
Volume-Aware Distance for Robust Similarity Learning
Shuo Chen, Chen Gong, Jun Li et al.
Volume Optimality in Conformal Prediction with Structured Prediction Sets
Chao Gao, Liren Shan, Vaidehi Srinivas et al.
Volume Transmission Implements Context Factorization to Target Online Credit Assignment and Enable Compositional Generalization
Matthew Bull, Po-Chen Kuo, Andrew Smith et al.
Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes
Stefano Esposito, Anpei Chen, Christian Reiser et al.
Voronoi-grid-based Pareto Front Learning and Its Application to Collaborative Federated Learning
Mengmeng Chen, Xiaohu Wu, QIQI LIU et al.
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.
VoteFlow: Enforcing Local Rigidity in Self-Supervised Scene Flow
Yancong Lin, Shiming Wang, Liangliang Nan et al.
VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding
Minchao Jiang, Shunyu Jia, Jiaming Gu et al.
VOVTrack: Exploring the Potentiality in Raw Videos for Open-Vocabulary Multi-Object Tracking
Zekun Qian, Ruize Han, Junhui Hou et al.
VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?
Xize Cheng, Ruofan Hu, Xiaoda Yang et al.
VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data
Jian Shi, Peter Wonka
VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction
Ziyue Zhu, Shenlong Wang, Jin Xie et al.
Voyaging into Perpetual Dynamic Scenes from a Single View
Fengrui Tian, Tianjiao Ding, Jinqi Luo et al.
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.
VPR-Cloak: A First Look at Privacy Cloak Against Visual Place Recognition
Shuting Dong, Mingzhi Chen, Feng Lu et al.
VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation
Sicheng Yang, Zhaohu Xing, Lei Zhu
VQ-SGen: A Vector Quantized Stroke Representation for Creative Sketch Generation
Jiawei Wang, Zhiming Cui, Changjian Li
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
Yating Wang, Haoyi Zhu, Mingyu Liu et al.
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
Qiuchen Wang, Ruixue Ding, Yu Zeng et al.
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
Jiashuo Yu, Yue Wu, Meng Chu et al.
VSC: Visual Search Compositional Text-to-Image Diffusion Model
Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.
VSNet: Focusing on the Linguistic Characteristics of Sign Language
Yuhao Li, Xinyue Chen, Hongkai Li et al.
VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs
Qiucheng Wu, Handong Zhao, Michael Saxon et al.
VSRM: A Robust Mamba-Based Framework for Video Super-Resolution
Phu Tran Dinh, Hung Dao, Daeyoung Kim
VSSD: Vision Mamba with Non-Causal State Space Duality
Yuheng Shi, Mingjia Li, Minjing Dong et al.
V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents
Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.
VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning
Qingtao Liu, Yu Cui, Zhengnan Sun et al.
VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning
Wenhao Li, Qiangchang Wang, Xianjing Meng et al.
VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians
Pengchong Hu, Zhizhong Han
VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning
Jinglei Zhang, Yuanfan Guo, Rolandos Alexandros Potamias et al.
VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction
Zijian He, Yuwei Ning, Yipeng Qin et al.
VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding
Yujie Liang, Xiaobin Hu, Boyuan Jiang et al.
VTON-VLLM: Aligning Virtual Try-On Models with Human Preferences
Siqi Wan, Jingwen Chen, Qi Cai et al.
Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning
Liang CHEN, Xueting Han, Li Shen et al.
Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection
Dat NGUYEN, Marcella Astrid, Anis Kacem et al.
Vulnerable Data-Aware Adversarial Training
Yuqi Feng, Jiahao Fan, Yanan Sun
Wait-Less Offline Tuning and Re-solving for Online Decision Making
Jingruo Sun, Wenzhi Gao, Ellen Vitercik et al.
Walking the Schrödinger Bridge: A Direct Trajectory for Text-to-3D Generation
Ziying Li, Xuequan Lu, Xinkui Zhao et al.
Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations
Katie Matton, Robert Ness, John Guttag et al.