NeurIPS 2025 Papers
5,858 papers found
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang, Chao Qu, Zuming Huang et al.
VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set
Shufan Shen, Junshu Sun, Qingming Huang et al.
VL-SAM-V2: Open-World Object Detection with General and Specific Query Fusion
Zhiwei Lin, Yongtao Wang
VMDT: Decoding the Trustworthiness of Video Foundation Models
Yujin Potter, Zhun Wang, Nicholas Crispino et al.
Vocabulary-Guided Gait Recognition
Panjian Huang, Saihui Hou, Chunshui Cao et al.
Vocabulary In-Context Learning in Transformers: Benefits of Positional Encoding
Qian Ma, Ruoxiang Xu, Yongqiang Cai
VolleyBots: A Testbed for Multi-Drone Volleyball Game Combining Motion Control and Strategic Play
Zelai Xu, Ruize Zhang, Chao Yu et al.
Volume Transmission Implements Context Factorization to Target Online Credit Assignment and Enable Compositional Generalization
Matthew Bull, Po-Chen Kuo, Andrew Smith et al.
VORTA: Efficient Video Diffusion via Routing Sparse Attention
Wenhao Sun, Rong-Cheng Tu, Yifu Ding et al.
VoxDet: Rethinking 3D Semantic Scene Completion as Dense Object Detection
Wuyang Li, Zhu Yu, Alexandre Alahi
VPO: Reasoning Preferences Optimization Based on $\mathcal{V}$-Usable Information
Zecheng Wang, Chunshan Li, Yupeng Zhang et al.
VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation
Sicheng Yang, Zhaohu Xing, Lei Zhu
VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models
Haichao Zhang, Yun Fu
VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning
Qiuchen Wang, Ruixue Ding, Yu Zeng et al.
VR-Drive: Viewpoint-Robust End-to-End Driving with Feed-Forward 3D Gaussian Splatting
Hoonhee Cho, Jae-Young Kang, Giwon Lee et al.
VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning
Wenhao Li, Qiangchang Wang, Xianjing Meng et al.
VTON-VLLM: Aligning Virtual Try-On Models with Human Preferences
Siqi Wan, Jingwen Chen, Qi Cai et al.
Vulnerable Data-Aware Adversarial Training
Yuqi Feng, Jiahao Fan, Yanan Sun
Walking the Schrödinger Bridge: A Direct Trajectory for Text-to-3D Generation
Ziying Li, Xuequan Lu, Xinkui Zhao et al.
Walking the Tightrope: Autonomous Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning
Xiaoyu Yang, Jie Lu, En Yu
WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
Siyu Zhou, Tianyi Zhou, Yijun Yang et al.
WaLRUS: Wavelets for Long range Representation Using State Space Methods
Hossein Babaei, Mel White, Sina Alemohammad et al.
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance
Ruihang Chu, Yefei He, Zhekai Chen et al.
WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting
Kaitao Huang, Yan Yan, Jing-Hao Xue et al.
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov, Arman Zharmagambetov, Aaron Grattafiori et al.
Wasserstein Convergence of Critically Damped Langevin Diffusions
Stanislas Strasman, Sobihan Surendran, Claire Boyer et al.
Wasserstein Transfer Learning
Kaicheng Zhang, Sinian Zhang, Doudou Zhou et al.
Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM
Zinuo Li, Xian Zhang, Yongxin Guo et al.
Watermarking Autoregressive Image Generation
Nikola Jovanović, Ismail Labiad, Tomas Soucek et al.
WaveAR: Wavelet-Aware Continuous Autoregressive Diffusion for Accurate Human Motion Prediction
Shengchuan Gao, Shuo Wang, Yabiao Wang et al.
Wavelet Canonical Coherence for Nonstationary Signals
Haibo Wu, Marina Knight, Keiland Cooper et al.
Wavy Transformer
Satoshi Noguchi, Yoshinobu Kawahara
Weak-shot Keypoint Estimation via Keyness and Correspondence Transfer
Junjie Chen, Zeyu Luo, Zezheng Liu et al.
Weak-to-Strong Generalization under Distribution Shifts
Myeongho Jeon, Jan Sobotka, Suhwan Choi et al.
WearVQA: A Visual Question Answering Benchmark for Wearables in Egocentric Authentic Real-world scenarios
Eun Chang, Zhuangqun Huang, Yiwei Liao et al.
WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
Jiahao Wen, Hang Yu, Zhedong Zheng
Weaver: Shrinking the Generation-Verification Gap by Scaling Compute for Verification
Jon Saad-Falcon, Estefany Kelly Buchanan, Mayee Chen et al.
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu, Baixuan Li, Runnan Fang et al.
WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch
Zimu Lu, Yunqiao Yang, Houxing Ren et al.
Web-Scale Collection of Video Data for 4D Animal Reconstruction
Brian Nlong Zhao, Jiajun Wu, Shangzhe Wu
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents
Hyungjoo Chae, Seonghwan Kim, Junhee Cho et al.
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Xiaoxi Li, Jiajie Jin, Guanting Dong et al.
We Should Chart an Atlas of All the World's Models
Eliahu Horwitz, Nitzan Kurer, Jonathan Kahana et al.
What are you sinking? A geometric approach on attention sink
Valeria Ruscio, Umberto Nanni, Fabrizio Silvestri
What Can RL Bring to VLA Generalization? An Empirical Study
Jijia Liu, Feng Gao, Bingwen Wei et al.
What Data Enables Optimal Decisions? An Exact Characterization for Linear Optimization
Omar Bennouna, Amine Bennouna, Saurabh Amin et al.
What Does It Take to Build a Performant Selective Classifier?
Stephan Rabanser, Nicolas Papernot
What Do Latent Action Models Actually Learn?
Chuheng Zhang, Tim Pearce, Pushi Zhang et al.
What do you know? Bayesian knowledge inference for navigating agents
Matthias Schultheis, Jana-Sophie Schönfeld, Constantin Rothkopf et al.
What Expressivity Theory Misses: Message Passing Complexity for GNNs
Niklas Kemper, Tom Wollschläger, Stephan Günnemann