"video understanding benchmarks" Papers
4 papers found
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding
Guo Chen, Yicheng Liu, Yifei Huang et al.
ICLR 2025posterarXiv:2412.12075
41
citations
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
Jiabo Ye, Haiyang Xu, Haowei Liu et al.
ICLR 2025posterarXiv:2408.04840
237
citations
RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video
ShuHang Xun, Sicheng Tao, Jungang Li et al.
NeurIPS 2025posterarXiv:2505.02064
5
citations
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Yang Jin, Zhicheng Sun, Kun Xu et al.
ICML 2024oral