ICCV Poster "video question answering" Papers
9 papers found
Dynamic-VLM: Simple Dynamic Visual Token Compression for VideoLLM
Han Wang, Yuxiang Nie, Yongjie Ye et al.
ICCV 2025 poster arXiv:2412.09530
15 citations
How Can Objects Help Video-Language Understanding?
Zitian Tang, Shijie Wang, Junho Cho et al.
ICCV 2025 poster arXiv:2504.07454
3 citations
Learning Streaming Video Representation via Multitask Training
Yibin Yan, Jilan Xu, Shangzhe Di et al.
ICCV 2025 poster arXiv:2504.20041
3 citations
Multi-Granular Spatio-Temporal Token Merging for Training-Free Acceleration of Video LLMs
Jeongseok Hyun, Sukjun Hwang, Su Ho Han et al.
ICCV 2025 poster arXiv:2507.07990
12 citations
TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision
Ayush Gupta, Anirban Roy, Rama Chellappa et al.
ICCV 2025 poster arXiv:2506.09445
VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges
Yuxuan Wang, Yiqi Song, Cihang Xie et al.
ICCV 2025 poster arXiv:2409.01071
3 citations
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
Shijie Zhou, Alexander Vilesov, Xuehai He et al.
ICCV 2025 poster arXiv:2508.02095
15 citations
VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos
Jiashuo Yu, Yue Wu, Meng Chu et al.
ICCV 2025 poster arXiv:2506.10857
9 citations
VTimeCoT: Thinking by Drawing for Video Temporal Grounding and Reasoning
Jinglei Zhang, Yuanfan Guo, Rolandos Alexandros Potamias et al.
ICCV 2025 poster arXiv:2510.14672
2 citations