2025 Poster "video large language models" Papers
14 papers found
ARGUS: Hallucination and Omission Evaluation in Video-LLMs
Ruchit Rawal, Reza Shirkavand, Heng Huang et al.
ICCV 2025 | poster | arXiv:2506.07371 | 3 citations
Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models
Eunseop Yoon, Hee Suk Yoon, Mark Hasegawa-Johnson et al.
ICLR 2025 | poster | arXiv:2507.04976 | 4 citations
DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO
Jinyoung Park, Jeehye Na, Jinyoung Kim et al.
NeurIPS 2025 | poster | arXiv:2506.07464 | 23 citations
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
Rui Qian, Shuangrui Ding, Xiaoyi Dong et al.
CVPR 2025 | poster | arXiv:2501.03218 | 31 citations
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale
Joya Chen, Yiqi Lin, Ziyun Zeng et al.
CVPR 2025 | poster | arXiv:2504.16030 | 4 citations
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos
Tiantian Geng, Jinrui Zhang, Qingni Wang et al.
CVPR 2025 | poster | arXiv:2411.19772 | 32 citations
MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models
Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.
NeurIPS 2025 | poster | arXiv:2502.16671 | 7 citations
Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs
Jeongseok Hyun, Sukjun Hwang, Su Ho Han et al.
ICCV 2025 | poster | arXiv:2507.07990 | 12 citations
PAVE: Patching and Adapting Video Large Language Models
Zhuoming Liu, Yiquan Li, Khoi D Nguyen et al.
CVPR 2025 | poster | arXiv:2503.19794 | 1 citation
PVChat: Personalized Video Chat with One-Shot Learning
Yufei Shi, Weilong Yan, Gang Xu et al.
ICCV 2025 | poster | arXiv:2503.17069 | 3 citations
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
Andong Deng, Zhongpai Gao, Anwesa Choudhuri et al.
CVPR 2025 | poster | arXiv:2411.16932 | 6 citations
VidEmo: Affective-Tree Reasoning for Emotion-Centric Video Foundation Models
Zhicheng Zhang, Weicheng Wang, Yongjie Zhu et al.
NeurIPS 2025 | poster | arXiv:2511.02712
VideoOrion: Tokenizing Object Dynamics in Videos
Yicheng Feng, Yijiang Li, Wanpeng Zhang et al.
ICCV 2025 | poster | arXiv:2411.16156 | 8 citations
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
Yuqian Yuan, Hang Zhang, Wentong Li et al.
CVPR 2025 | poster | arXiv:2501.00599 | 41 citations