Xinhao Li
5
Papers
440
Total Citations
Papers (5)
VideoMamba: State Space Model for Efficient Video Understanding
ECCV 2024
396
citations
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
CVPR 2025, arXiv
19
citations
VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception
NeurIPS 2025, arXiv
13
citations
Online Video Understanding: OVBench and VideoChat-Online
CVPR 2025, arXiv
9
citations
StreamForest: Efficient Online Video Understanding with Persistent Event Memory
NeurIPS 2025
3
citations