Zihao Yue
4 papers
57 total citations
Papers (4)
Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding
NeurIPS 2025, arXiv
42 citations
VideoOrion: Tokenizing Object Dynamics in Videos
ICCV 2025, arXiv
8 citations
Unified Multimodal Understanding via Byte-Pair Visual Encoding
ICCV 2025
7 citations
Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation
NeurIPS 2023, arXiv
0 citations