Kevin Qinghong Lin
14
Papers
239
Total Citations
Papers (14)
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
CVPR 2025
123
citations
VideoLLM-online: Online Video Large Language Model for Streaming Video
CVPR 2024
109
citations
ROICtrl: Boosting Instance Control for Visual Generation
CVPR 2025
7
citations
Bootstrapping SparseFormers from Vision Foundation Models
CVPR 2024
0
citations
All in One: Exploring Unified Video-Language Pre-Training
CVPR 2023, arXiv
0
citations
Affordance Grounding From Demonstration Video To Target Image
CVPR 2023, arXiv
0
citations
Too Large; Data Reduction for Vision-Language Pre-Training
ICCV 2023, arXiv
0
citations
UniVTG: Towards Unified Video-Language Temporal Grounding
ICCV 2023, arXiv
0
citations
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
ICCV 2023, arXiv
0
citations
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary
CVPR 2025
0
citations
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
CVPR 2025
0
citations
VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting
AAAI 2025
0
citations
Egocentric Video-Language Pretraining
NeurIPS 2022
0
citations
Learning Visual Prior via Generative Pre-Training
NeurIPS 2023
0
citations