Yale Song
5
Papers
41
Total Citations
Papers (5)
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
NeurIPS 2025, arXiv
40
citations
Streaming VideoLLMs for Real-Time Procedural Video Understanding
ICCV 2025
1
citation
VITED: Video Temporal Evidence Distillation
CVPR 2025
0
citations
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
ICCV 2025
0
citations
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives
CVPR 2024
0
citations