NeurIPS 2025 Papers by Zuwei Long
2 papers found
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Chaoyou Fu, Haojia Lin, Xiong Wang et al.
NeurIPS 2025 (spotlight), arXiv:2501.01957
130 citations
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
Zuwei Long, Yunhang Shen, Chaoyou Fu et al.
NeurIPS 2025 (poster)