Rui Shao
8 Papers
47 Total Citations
Papers (8)
LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant
CVPR 2025
33 citations
FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers
ICCV 2025
11 citations
Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation
ICCV 2025
3 citations
LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge
CVPR 2024
0 citations
Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy
CVPR 2025
0 citations
RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models
ICML 2024
0 citations
Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation
CVPR 2025
0 citations
Less is More: Empowering GUI Agent with Context-Aware Simplification
ICCV 2025
0 citations