2025 "temporal grounding" Papers
9 papers found
EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering
Sheng Zhou, Junbin Xiao, Qingyun Li et al.
CVPR 2025 (poster), arXiv:2502.07411
29 citations
Factorized Learning for Temporally Grounded Video-Language Models
Wenzheng Zeng, Difei Gao, Mike Zheng Shou et al.
ICCV 2025 (poster), arXiv:2512.24097
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.
NEURIPS 2025 (oral), arXiv:2506.07016
5 citations
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
Tanveer Hannan, Md Mohaiminul Islam, Jindong Gu et al.
CVPR 2025 (poster), arXiv:2411.14901
9 citations
SEAL: Semantic Attention Learning for Long Video Representation
Lan Wang, Yujia Chen, Wen-Sheng Chu et al.
CVPR 2025 (poster), arXiv:2412.01798
7 citations
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
Andong Deng, Zhongpai Gao, Anwesa Choudhuri et al.
CVPR 2025 (poster), arXiv:2411.16932
6 citations
TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision
Ayush Gupta, Anirban Roy, Rama Chellappa et al.
ICCV 2025 (poster), arXiv:2506.09445
Tracking and Understanding Object Transformations
Yihong Sun, Xinyu Yang, Jennifer Sun et al.
NEURIPS 2025 (oral), arXiv:2511.04678
Youku Dense Caption: A Large-scale Chinese Video Dense Caption Dataset and Benchmarks
Zixuan Xiong, Guangwei Xu, Wenkai Zhang et al.
ICLR 2025 (poster)