Shusheng Yang
7
Papers
352
Total Citations
Papers (7)
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
CVPR 2025
342
citations
MobileInst: Video Instance Segmentation on the Mobile
AAAI 2024 (arXiv)
10
citations
Temporally Efficient Vision Transformer for Video Instance Segmentation
CVPR 2022 (arXiv)
0
citations
RILS: Masked Visual Reconstruction in Language Semantic Space
CVPR 2023 (arXiv)
0
citations
Instances As Queries
ICCV 2021
0
citations
Crossover Learning for Fast Online Video Instance Segmentation
ICCV 2021 (arXiv)
0
citations
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
ICCV 2023 (arXiv)
0
citations