Longteng Guo

Papers: 6

Total Citations: 37

Papers (6)

EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE

Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs

Breaking the Encoder Barrier for Seamless Video-Language Understanding

Efficient Motion-Aware Video MLLM

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation