Aniruddha Kembhavi

47
Papers
198
Total Citations

Papers (47)

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

CVPR 2025
96
citations

SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World

CVPR 2024
52
citations

One Diffusion to Generate Them All

CVPR 2025
34
citations

Iterated Learning Improves Compositionality in Large Vision-Language Models

CVPR 2024
16
citations

Holodeck: Language Guided Generation of 3D Embodied AI Environments

CVPR 2024
0
citations

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action

CVPR 2024
0
citations

Seeing the Unseen: Visual Common Sense for Semantic Placement

CVPR 2024
0
citations

Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension

CVPR 2017
0
citations

Structured Set Matching Networks for One-Shot Part Labeling

CVPR 2018
0
citations

IQA: Visual Question Answering in Interactive Environments

CVPR 2018
0
citations

Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering

CVPR 2018
0
citations

ELASTIC: Improving CNNs With Dynamic Scaling Policies

CVPR 2019
0
citations

Two Body Problem: Collaborative Visual Task Completion

CVPR 2019
0
citations

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

CVPR 2020
0
citations

What's Hidden in a Randomly Weighted Neural Network?

CVPR 2020
0
citations

ManipulaTHOR: A Framework for Visual Object Manipulation

CVPR 2021
0
citations

Visual Room Rearrangement

CVPR 2021
0
citations

Visual Semantic Role Labeling for Video Understanding

CVPR 2021
0
citations

Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture

CVPR 2022
0
citations

What Do Navigation Agents Learn About Their Environment?

CVPR 2022
0
citations

Simple but Effective: CLIP Embeddings for Embodied AI

CVPR 2022
0
citations

Visual Programming: Compositional Visual Reasoning Without Training

CVPR 2023
0
citations

EXCALIBUR: Encouraging and Evaluating Embodied Exploration

CVPR 2023
0
citations

Objaverse: A Universe of Annotated 3D Objects

CVPR 2023
0
citations

Phone2Proc: Bringing Robust Robots Into Our Chaotic World

CVPR 2023
0
citations

RobustNav: Towards Benchmarking Robustness in Embodied Navigation

ICCV 2021
0
citations

Scene Graph Contrastive Learning for Embodied Navigation

ICCV 2023
0
citations

I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision

ICCV 2023
0
citations

SatlasPretrain: A Large-Scale Dataset for Remote Sensing Image Understanding

ICCV 2023
0
citations

Grounded Situation Recognition

ECCV 2020
0
citations

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

ECCV 2020
0
citations

Webly Supervised Concept Expansion for General Purpose Vision Models

ECCV 2022
0
citations

Object Manipulation via Visual Target Localization

ECCV 2022
0
citations

GridToPix: Training Embodied Agents With Minimal Supervision

ICCV 2021
0
citations

ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams

CVPR 2025
0
citations

Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation

CVPR 2025
0
citations

Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences

CVPR 2024
0
citations

Learning About Objects by Learning to Interact with Them

NeurIPS 2020
0
citations

Supermasks in Superposition

NeurIPS 2020
0
citations

Bridging the Imitation Gap by Adaptive Insubordination

NeurIPS 2021
0
citations

Container: Context Aggregation Networks

NeurIPS 2021
0
citations

🏘️ ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

NeurIPS 2022
0
citations

Ask4Help: Learning to Leverage an Expert for Embodied Tasks

NeurIPS 2022
0
citations

OBJECT 3DIT: Language-guided 3D-aware Image Editing

NeurIPS 2023
0
citations

SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality

NeurIPS 2023
0
citations

Objaverse-XL: A Universe of 10M+ 3D Objects

NeurIPS 2023
0
citations

Neural Priming for Sample-Efficient Adaptation

NeurIPS 2023
0
citations