Oral "temporal modeling" Papers
3 papers found
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Wenhao Chai, Enxin Song, Yilun Du et al.
ICLR 2025, oral, arXiv:2410.03051
102 citations
Kronecker Mask and Interpretive Prompts are Language-Action Video Learners
Jingyi Yang, Zitong Yu, Nixiuming et al.
ICLR 2025, oral, arXiv:2502.03549
3 citations
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng, Kaixiong Gong, Bohao Li et al.
NeurIPS 2025, oral, arXiv:2503.21776
236 citations