Size Wu
7 Papers
288 Total Citations
Papers (7)
OMG-Seg: Is One Model Good Enough For All Segmentation?
CVPR 2024
106
citations
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
ICLR 2024
104
citations
Harmonizing Visual Representations for Unified Multimodal Understanding and Generation
ICCV 2025
33
citations
CLIM: Contrastive Language-Image Mosaic for Region Representation
AAAI 2024 (arXiv)
24
citations
F-LMM: Grounding Frozen Large Multimodal Models
CVPR 2025
21
citations
Aligning Bag of Regions for Open-Vocabulary Object Detection
CVPR 2023 (arXiv)
0
citations
Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
ICCV 2021 (arXiv)
0
citations