Zhihang Liu
Papers: 3
Total Citations: 57
Papers (3)
Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval
AAAI 2024, arXiv; 40 citations
Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models
CVPR 2025, arXiv; 14 citations
CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
NeurIPS 2025, arXiv; 3 citations