"multimodal video understanding" Papers

4 papers found

Filters:multimodal video understanding Clear all

Conference

AAAI 2025 (3,028)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NeurIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,140)oral (1,594)spotlight (1,421)highlight (975)

ConViS-Bench: Estimating Video Similarity Through Semantic Concepts

Benedetta Liberatori, Alessandro Conti, Lorenzo Vaquero et al.

NeurIPS 2025posterarXiv:2509.19245

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

Xuehai He, Weixi Feng, Kaizhi Zheng et al.

ICLR 2025posterarXiv:2406.08407

PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling

Xiao Yu, Yan Fang, Yao Zhao et al.

NeurIPS 2025oralarXiv:2505.23155

Exploiting Auxiliary Caption for Video Grounding

Hongxiang Li, Meng Cao, Xuxin Cheng et al.

AAAI 2024paperarXiv:2301.05997