Papers by Haozhe Zhao
4 papers found
A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation
Liang Chen, Sinan Tan, Zefan Cai et al.
ICLR 2025 poster, arXiv:2410.01912
7 citations
NEP: Autoregressive Image Editing via Next Editing Token Prediction
Huimin Wu, Xiaojian (Shawn) Ma, Haozhe Zhao et al.
NeurIPS 2025 poster
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Liang Chen, Haozhe Zhao, Tianyu Liu et al.
ECCV 2024 poster, arXiv:2403.06764
343 citations
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao, Zefan Cai, Shuzheng Si et al.
ICLR 2024 poster