Sangho Lee
6 Papers
136 Total Citations
Papers (6)
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
96 citations
One Diffusion to Generate Them All
CVPR 2025
34 citations
Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
ECCV 2024
6 citations
ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams
CVPR 2025
0 citations
MAMS: Model-Agnostic Module Selection Framework for Video Captioning
AAAI 2025
0 citations
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
CVPR 2024
0 citations