2025 Oral "large multimodal models" Papers
2 papers found
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.
NeurIPS 2025, Oral, arXiv:2506.07016
5 citations
Seeing the Arrow of Time in Large Multimodal Models
Zihui (Sherry) Xue, Romy Luo, Kristen Grauman
NeurIPS 2025, Oral, arXiv:2506.03340
5 citations