Zesen Cheng
8 Papers
80 Total Citations
Papers (8)
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
CVPR 2025
40
citations
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
NeurIPS 2025
26
citations
Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
ECCV 2024
13
citations
Tune-Your-Style: Intensity-tunable 3D Style Transfer with Gaussian Splatting
ICCV 2025
1
citation
GraCo: Granularity-Controllable Interactive Segmentation
CVPR 2024
0
citations
Temporal-aware Query Routing for Real-time Video Instance Segmentation
ICCV 2025
0
citations
Aligning Instance Brownian Bridge with Texts for Open-Vocabulary Video Instance Segmentation
AAAI 2025
0
citations
Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy
CVPR 2025
0
citations