Xinhao Li

Papers: 5

Total Citations: 440

Papers (5)

VideoMamba: State Space Model for Efficient Video Understanding

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

NeurIPS 2025, arXiv

Online Video Understanding: OVBench and VideoChat-Online

StreamForest: Efficient Online Video Understanding with Persistent Event Memory