Oral "dynamic scene understanding" Papers
2 papers found
SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing
Mingfei Chen, Zijun Cui, Xiulong Liu et al.
NeurIPS 2025, oral, arXiv:2506.05414
5 citations
DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent)
Zongxin Yang, Guikun Chen, Xiaodi Li et al.
ICML 2024, oral