2025 Papers: "vision-language integration"
2 papers found
Crafting Dynamic Virtual Activities with Advanced Multimodal Models
Changyang Li, Qingan Yan, Minyoung Kim et al.
ISMAR 2025 paper, arXiv:2406.17582
Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information
Yi Chen, Jian Xu, Xu-Yao Zhang et al.
AAAI 2025 paper, arXiv:2409.01179
15 citations