"vision-language modeling" Papers
2 papers found
FOCUS: Unified Vision-Language Modeling for Interactive Editing Driven by Referential Segmentation
Fan Yang, Yousong Zhu, Xin Li et al.
NeurIPS 2025, poster, arXiv:2506.16806
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale
Joya Chen, Yiqi Lin, Ziyun Zeng et al.
CVPR 2025, poster, arXiv:2504.16030
4 citations