Tanmay Gupta
Papers: 12
Total Citations: 148
Papers (12)
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
96
citations
SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World
CVPR 2024
52
citations
Completing 3D Object Shape From One Depth Image
CVPR 2015
0
citations
Visual Semantic Role Labeling for Video Understanding
CVPR 2021, arXiv
0
citations
Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture
CVPR 2022
0
citations
Visual Programming: Compositional Visual Reasoning Without Training
CVPR 2023, arXiv
0
citations
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks
ICCV 2017, arXiv
0
citations
ViCo: Word Embeddings From Visual Co-Occurrences
ICCV 2019
0
citations
No-Frills Human-Object Interaction Detection: Factorization, Layout Encodings, and Training Techniques
ICCV 2019
0
citations
Contrastive Learning for Weakly Supervised Phrase Grounding
ECCV 2020
0
citations
Webly Supervised Concept Expansion for General Purpose Vision Models
ECCV 2022
0
citations
OBJECT 3DIT: Language-guided 3D-aware Image Editing
NeurIPS 2023
0
citations