2025 Oral "audio-visual large language models" Papers
2 papers found
Aligned Better, Listen Better for Audio-Visual Large Language Models
Yuxin Guo, Shuailei Ma, Shijie Ma et al.
ICLR 2025 (oral), arXiv:2504.02061
8 citations
SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing
Mingfei Chen, Zijun Cui, Xiulong Liu et al.
NeurIPS 2025 (oral), arXiv:2506.05414
5 citations