2024 "video question answering" Papers

11 papers found

A Unified Image Compression Method for Human Perception and Multiple Vision Tasks

Sha Guo, Sui Lin, Chen-Lin Zhang et al.

ECCV 2024, poster

BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind

Yuanyuan Mao, Xin Lin, Qin Ni et al.

AAAI 2024, paper, arXiv:2402.07402

LingoQA: Video Question Answering for Autonomous Driving

Ana-Maria Marcu, Long Chen, Jan Hünermann et al.

ECCV 2024, poster
34 citations

LongVLM: Efficient Long Video Understanding via Large Language Models

Yuetian Weng, Mingfei Han, Haoyu He et al.

ECCV 2024, poster, arXiv:2404.03384
128 citations

MuLTI: Efficient Video-and-Language Understanding with Text-Guided MultiWay-Sampler and Multiple Choice Modeling

Jiaqi Xu, Bo Liu, Yunkuo Chen et al.

AAAI 2024, paper, arXiv:2303.05707
2 citations

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024, poster, arXiv:2311.13627
36 citations

Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition

Hao Fei, Shengqiong Wu, Wei Ji et al.

ICML 2024, oral, arXiv:2501.03230

VideoPrism: A Foundational Visual Encoder for Video Understanding

Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.

ICML 2024, poster, arXiv:2402.13217

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024, poster, arXiv:2312.00937
37 citations

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models

Guangzhi Sun, Wenyi Yu, Changli Tang et al.

ICML 2024, oral, arXiv:2406.15704

YTCommentQA: Video Question Answerability in Instructional Videos

Saelyne Yang, Sunghyun Park, Yunseok Jang et al.

AAAI 2024, paper, arXiv:2401.17343