NEURIPS "vision language models" Papers

10 papers found

Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization

kaiyuan Li, Xiaoyue Chen, Chen Gao et al.

NEURIPS 2025posterarXiv:2505.22038
4
citations

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.

NEURIPS 2025oralarXiv:2506.03517
11
citations

GUI Exploration Lab: Enhancing Screen Navigation in Agents via Multi-Turn Reinforcement Learning

Haolong Yan, Yeqing Shen, Xin Huang et al.

NEURIPS 2025posterarXiv:2512.02423

Inference-Time Text-to-Video Alignment with Diffusion Latent Beam Search

Yuta Oshima, Masahiro Suzuki, Yutaka Matsuo et al.

NEURIPS 2025posterarXiv:2501.19252
20
citations

MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation

Bohan Zhou, Yi Zhan, Zhongbin Zhang et al.

NEURIPS 2025oralarXiv:2505.16602
3
citations

MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO

Yicheng Xiao, Lin Song, Yukang Chen et al.

NEURIPS 2025posterarXiv:2505.13031
18
citations

MuSLR: Multimodal Symbolic Logical Reasoning

Jundong Xu, Hao Fei, Yuhui Zhang et al.

NEURIPS 2025posterarXiv:2509.25851

OOD-Barrier: Build a Middle-Barrier for Open-Set Single-Image Test Time Adaptation via Vision Language Models

Boyang Peng, Sanqing Qu, Tianpei Zou et al.

NEURIPS 2025poster

SharpZO: Hybrid Sharpness-Aware Vision Language Model Prompt Tuning via Forward-Only Passes

Yifan Yang, Zhen Zhang, Rupak Vignesh Swaminathan et al.

NEURIPS 2025posterarXiv:2506.20990
1
citations

Systematic Reward Gap Optimization for Mitigating VLM Hallucinations

Lehan He, Zeren Chen, Zhelun Shi et al.

NEURIPS 2025posterarXiv:2411.17265
3
citations