Serena Yeung

20
Papers
72
Total Citations

Papers (20)

Describing Differences in Image Sets with Natural Language

CVPR 2024
51
citations

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation

CVPR 2025
21
citations

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

CVPR 2025
0
citations

Apollo: An Exploration of Video Understanding in Large Multimodal Models

CVPR 2025
0
citations

End-To-End Learning of Action Detection From Frame Glimpses in Videos

CVPR 2016
0
citations

Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals

CVPR 2017
0
citations

Learning to Learn From Noisy Web Videos

CVPR 2017 (arXiv)
0
citations

Holistic 3D Human and Scene Mesh Estimation From Single View Images

CVPR 2021 (arXiv)
0
citations

Unsupervised Discovery of the Long-Tail in Instance Segmentation Using Hierarchical Self-Supervision

CVPR 2021 (arXiv)
0
citations

PROB: Probabilistic Objectness for Open World Object Detection

CVPR 2023 (arXiv)
0
citations

NeMo: Learning 3D Neural Motion Fields From Multiple Video Instances of the Same Action

CVPR 2023
0
citations

GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-Efficient Medical Image Recognition

ICCV 2021
0
citations

Generalizable Neural Fields as Partially Observed Neural Processes

ICCV 2023 (arXiv)
0
citations

DARCNN: Domain Adaptive Region-Based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images

CVPR 2021 (arXiv)
0
citations

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

CVPR 2025
0
citations

Capturing implicit hierarchical structure in 3D biomedical images with self-supervised hyperbolic representations

NeurIPS 2021
0
citations

Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning

NeurIPS 2022
0
citations

DataPerf: Benchmarks for Data-Centric AI Development

NeurIPS 2023
0
citations

INSPECT: A Multimodal Dataset for Patient Outcome Prediction of Pulmonary Embolisms

NeurIPS 2023
0
citations

LOVM: Language-Only Vision Model Selection

NeurIPS 2023
0
citations