Zhengyang Liang

5 Papers

268 Total Citations

Papers (5)

Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding

MLVU: Benchmarking Multi-task Long Video Understanding

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly

Self-Supervised Multi-Modal Knowledge Graph Contrastive Hashing for Cross-Modal Search

MomentSeeker: A Task-Oriented Benchmark For Long-Video Moment Retrieval

NeurIPS 2025, arXiv