NEURIPS "vision language models" Papers
10 papers found
Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization
Kaiyuan Li, Xiaoyue Chen, Chen Gao et al.
NEURIPS 2025, poster, arXiv:2505.22038
4 citations
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.
NEURIPS 2025, oral, arXiv:2506.03517
11 citations
GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning
Haolong Yan, Yeqing Shen, Xin Huang et al.
NEURIPS 2025, poster, arXiv:2512.02423
Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search
Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo et al.
NEURIPS 2025, poster, arXiv:2501.19252
20 citations
MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation
Bohan Zhou, Yi Zhan, Zhongbin Zhang et al.
NEURIPS 2025, oral, arXiv:2505.16602
3 citations
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
Yicheng Xiao, Lin Song, Yukang Chen et al.
NEURIPS 2025, poster, arXiv:2505.13031
18 citations
MuSLR: Multimodal Symbolic Logical Reasoning
Jundong Xu, Hao Fei, Yuhui Zhang et al.
NEURIPS 2025, poster, arXiv:2509.25851
OOD-Barrier: Build a Middle-Barrier for Open-Set Single-Image Test Time Adaptation via Vision Language Models
Boyang Peng, Sanqing Qu, Tianpei Zou et al.
NEURIPS 2025, poster
SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes
Yifan Yang, Zhen Zhang, Rupak Vignesh Swaminathan et al.
NEURIPS 2025, poster, arXiv:2506.20990
1 citation
Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
Lehan He, Zeren Chen, Zhelun Shi et al.
NEURIPS 2025, poster, arXiv:2411.17265
3 citations