Xiangru Jian
3 Papers
6 Total Citations
Papers (3)
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
ICLR 2025, arXiv
5 citations
The Underappreciated Power of Vision Models for Graph Structural Understanding
NeurIPS 2025
1 citation
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Document Understanding
NeurIPS 2025, arXiv
0 citations