2024 "video question answering" Papers
11 papers found
A Unified Image Compression Method for Human Perception and Multiple Vision Tasks
Sha Guo, Sui Lin, Chen-Lin Zhang et al.
ECCV 2024, poster
BDIQA: A New Dataset for Video Question Answering to Explore Cognitive Reasoning through Theory of Mind
Yuanyuan Mao, Xin Lin, Qin Ni et al.
AAAI 2024, paper, arXiv:2402.07402
LingoQA: Video Question Answering for Autonomous Driving
Ana-Maria Marcu, Long Chen, Jan Hünermann et al.
ECCV 2024, poster
34 citations
LongVLM: Efficient Long Video Understanding via Large Language Models
Yuetian Weng, Mingfei Han, Haoyu He et al.
ECCV 2024, poster, arXiv:2404.03384
128 citations
MuLTI: Efficient Video-and-Language Understanding with Text-Guided MultiWay-Sampler and Multiple Choice Modeling
Jiaqi Xu, Bo Liu, Yunkuo Chen et al.
AAAI 2024, paper, arXiv:2303.05707
2 citations
Vamos: Versatile Action Models for Video Understanding
Shijie Wang, Qi Zhao, Minh Quan et al.
ECCV 2024, poster, arXiv:2311.13627
36 citations
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Hao Fei, Shengqiong Wu, Wei Ji et al.
ICML 2024, oral, arXiv:2501.03230
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.
ICML 2024, poster, arXiv:2402.13217
Video Question Answering with Procedural Programs
Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.
ECCV 2024, poster, arXiv:2312.00937
37 citations
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Guangzhi Sun, Wenyi Yu, Changli Tang et al.
ICML 2024, oral, arXiv:2406.15704
YTCommentQA: Video Question Answerability in Instructional Videos
Saelyne Yang, Sunghyun Park, Yunseok Jang et al.
AAAI 2024, paper, arXiv:2401.17343