Zhibo Yang

Papers: 5

Total Citations: 70

Papers (5)

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers

DocThinker: Explainable Multimodal Large Language Models with Rule-based Reinforcement Learning for Document Understanding

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition