Shuailei Ma
7 Papers
118 Total Citations
Papers (7)
Language-Image Pre-training with Long Captions
ECCV 2024
63 citations
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences
CVPR 2025
25 citations
Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning
CVPR 2025
18 citations
Aligned Better, Listen Better for Audio-Visual Large Language Models
ICLR 2025
8 citations
Learning Visual Generative Priors without Text
CVPR 2025
4 citations
Chains of Diffusion Models
ECCV 2024
0 citations
CrossMAE: Cross-Modality Masked Autoencoders for Region-Aware Audio-Visual Pre-Training
CVPR 2024
0 citations