Spotlight "vision-language models" Papers
15 papers found
Approximate Domain Unlearning for Vision-Language Models
Kodai Kawamura, Yuta Goto, Rintaro Yanagi et al.
NeurIPS 2025 · Spotlight · arXiv:2510.08132
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment
Bryan Sangwoo Kim, Jeongsol Kim, Jong Chul Ye
NeurIPS 2025 · Spotlight · arXiv:2505.18600 · 4 citations
Conditional Representation Learning for Customized Tasks
Honglin Liu, Chao Sun, Peng Hu et al.
NeurIPS 2025 · Spotlight · arXiv:2510.04564 · 1 citation
CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays
Hyungyung Lee, Geon Choi, Jung-Oh Lee et al.
NeurIPS 2025 · Spotlight · arXiv:2505.18087 · 3 citations
LaViDa: A Large Diffusion Model for Vision-Language Understanding
Shufan Li, Konstantinos Kallidromitis, Hritik Bansal et al.
NeurIPS 2025 · Spotlight
OpenCUA: Open Foundations for Computer-Use Agents
Xinyuan Wang, Bowen Wang, Dunjie Lu et al.
NeurIPS 2025 · Spotlight · arXiv:2508.09123 · 37 citations
OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts
Shiting (Ginny) Xiao, Rishabh Kabra, Yuhang Li et al.
NeurIPS 2025 · Spotlight · arXiv:2507.05427 · 2 citations
QSVD: Efficient Low-rank Approximation for Unified Query-Key-Value Weight Compression in Low-Precision Vision-Language Models
Yutong Wang, Haiyu Wang, Sai Qian Zhang
NeurIPS 2025 · Spotlight · arXiv:2510.16292 · 1 citation
Robust SuperAlignment: Weak-to-Strong Robustness Generalization for Vision-Language Models
Junhao Dong, Cong Zhang, Xinghua Qu et al.
NeurIPS 2025 · Spotlight
Spatial Understanding from Videos: Structured Prompts Meet Simulation Data
Haoyu Zhang, Meng Liu, Zaijing Li et al.
NeurIPS 2025 · Spotlight · arXiv:2506.03642 · 7 citations
Vision-centric Token Compression in Large Language Model
Ling Xing, Alex Jinpeng Wang, Rui Yan et al.
NeurIPS 2025 · Spotlight · arXiv:2502.00791 · 11 citations
Vision Transformers Don't Need Trained Registers
Nicholas Jiang, Amil Dravid, Alexei Efros et al.
NeurIPS 2025 · Spotlight · arXiv:2506.08010 · 15 citations
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning
Haozhe Wang, Chao Qu, Zuming Huang et al.
NeurIPS 2025 · Spotlight · arXiv:2504.08837 · 183 citations
Code as Reward: Empowering Reinforcement Learning with VLMs
David Venuto, Mohammad Sami Nur Islam, Martin Klissarov et al.
ICML 2024 · Spotlight · arXiv:2402.04764 · 27 citations
Realistic Unsupervised CLIP Fine-tuning with Universal Entropy Optimization
Jian Liang, Lijun Sheng, Zhengbo Wang et al.
ICML 2024 · Spotlight · arXiv:2308.12919 · 13 citations