Amanpreet Singh
12 Papers
212 Total Citations
Papers (12)
Generative Representational Instruction Tuning
ICLR 2025 (arXiv)
212 citations
Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA
CVPR 2020 (arXiv)
0 citations
TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text
CVPR 2021 (arXiv)
0 citations
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality
CVPR 2022 (arXiv)
0 citations
Unsupervised Vision-and-Language Pre-Training via Retrieval-Based Multi-Granular Alignment
CVPR 2022 (arXiv)
0 citations
FLAVA: A Foundational Language and Vision Alignment Model
CVPR 2022 (arXiv)
0 citations
UniT: Multimodal Multitask Learning With a Unified Transformer
ICCV 2021 (arXiv)
0 citations
TextCaps: a Dataset for Image Captioning with Reading Comprehension
ECCV 2020
0 citations
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
ECCV 2020
0 citations
The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
NeurIPS 2020 (arXiv)
0 citations
Human-Adversarial Visual Question Answering
NeurIPS 2021 (arXiv)
0 citations
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents
NeurIPS 2023 (arXiv)
0 citations