Lin Song
5 Papers
24 Total Citations
Papers (5)
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
NeurIPS 2025, arXiv
18 citations
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding
ICML 2025
6 citations
YOLO-World: Real-Time Open-Vocabulary Object Detection
CVPR 2024
0 citations
Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs
CVPR 2024
0 citations
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio Video Point Cloud Time-Series and Image Recognition
CVPR 2024
0 citations