R. Manmatha
17
Papers
68
Total Citations
Papers (17)
DocFormerv2: Local Features for Document Understanding
AAAI 2024, arXiv
58
citations
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
ECCV 2024
6
citations
No Head Left Behind – Multi-Head Alignment Distillation for Transformers
AAAI 2024
4
citations
Deep Decision Network for Multi-Class Image Classification
CVPR 2016
0
citations
Compressed Video Action Recognition
CVPR 2018, arXiv
0
citations
SCATTER: Selective Context Attentional Scene Text Recognizer
CVPR 2020, arXiv
0
citations
Sequence-to-Sequence Contrastive Learning for Text Recognition
CVPR 2021, arXiv
0
citations
LaTr: Layout-Aware Transformer for Scene-Text VQA
CVPR 2022, arXiv
0
citations
Towards Weakly-Supervised Text Spotting Using a Multi-Task Transformer
CVPR 2022, arXiv
0
citations
PolyFormer: Referring Image Segmentation As Sequential Polygon Generation
CVPR 2023, arXiv
0
citations
Sampling Matters in Deep Embedding Learning
ICCV 2017, arXiv
0
citations
DocFormer: End-to-End Transformer for Document Understanding
ICCV 2021, arXiv
0
citations
DocTr: Document Transformer for Structured Information Extraction in Documents
ICCV 2023, arXiv
0
citations
Scaling up Image Segmentation across Data and Tasks
CVPR 2025
0
citations
GLASS: Global to Local Attention for Scene-Text Spotting
ECCV 2022
0
citations
Scalable Enumeration of Trap Spaces in Boolean Networks via Answer Set Programming
AAAI 2024
0
citations
On the Scalability of Diffusion-based Text-to-Image Generation
CVPR 2024
0
citations