Shengbang Tong
8
Papers
628
Total Citations
Papers (8)
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
CVPR 2024
570
citations
Scaling Language-Free Visual Representation Learning
ICCV 2025, arXiv
39
citations
Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models
ICLR 2024
19
citations
Unsupervised Manifold Linearizing and Clustering
ICCV 2023, arXiv
0
citations
MetaMorph: Multimodal Understanding and Generation via Instruction Tuning
ICCV 2025
0
citations
Revisiting Sparse Convolutional Model for Visual Recognition
NeurIPS 2022
0
citations
White-Box Transformers via Sparse Rate Reduction
NeurIPS 2023
0
citations
Mass-Producing Failures of Multimodal Systems with Language Models
NeurIPS 2023
0
citations