Ali Farhadi

68
Papers
119
Total Citations

Papers (68)

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

CVPR 2025
96
citations

DRAWER: Digital Reconstruction and Articulation With Environment Realism

CVPR 2025
13
citations

Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos

ICCV 2025
7
citations

Convergent Functions, Divergent Forms

NeurIPS 2025
3
citations

Discriminative and Consistent Similarities in Instance-Level Multiple Instance Learning

CVPR 2015
0
citations

VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases

CVPR 2015
0
citations

You Only Look Once: Unified, Real-Time Object Detection

CVPR 2016
0
citations

A Task-Oriented Approach for Cost-Sensitive Recognition

CVPR 2016
0
citations

Actions ~ Transformations

CVPR 2016
0
citations

Newtonian Scene Understanding: Unfolding the Dynamics of Objects in Static Images

CVPR 2016
0
citations

Situation Recognition: Visual Semantic Role Labeling for Image Understanding

CVPR 2016
0
citations

Asynchronous Temporal Fields for Action Recognition

CVPR 2017
0
citations

Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension

CVPR 2017
0
citations

Commonly Uncommon: Semantic Sparsity in Situation Recognition

CVPR 2017
0
citations

YOLO9000: Better, Faster, Stronger

CVPR 2017
0
citations

Structured Set Matching Networks for One-Shot Part Labeling

CVPR 2018
0
citations

Who Let the Dogs Out? Modeling Dog Behavior From Visual Data

CVPR 2018
0
citations

IQA: Visual Question Answering in Interactive Environments

CVPR 2018
0
citations

SeGAN: Segmenting and Generating the Invisible

CVPR 2018
0
citations

Actor and Observer: Joint Modeling of First and Third-Person Videos

CVPR 2018
0
citations

ELASTIC: Improving CNNs With Dynamic Scaling Policies

CVPR 2019
0
citations

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

CVPR 2019
0
citations

Two Body Problem: Collaborative Visual Task Completion

CVPR 2019
0
citations

From Recognition to Cognition: Visual Commonsense Reasoning

CVPR 2019
0
citations

Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning

CVPR 2019
0
citations

Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph

CVPR 2019
0
citations

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

CVPR 2020
0
citations

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

CVPR 2020
0
citations

Butterfly Transform: An Efficient FFT Based Neural Architecture Design

CVPR 2020
0
citations

What's Hidden in a Randomly Weighted Neural Network?

CVPR 2020
0
citations

Visual Reaction: Learning to Play Catch With Your Drone

CVPR 2020
0
citations

Pushing It Out of the Way: Interactive Visual Navigation

CVPR 2021
0
citations

Forward Compatible Training for Large-Scale Embedding Retrieval Systems

CVPR 2022
0
citations

MERLOT Reserve: Neural Script Knowledge Through Vision and Language and Sound

CVPR 2022
0
citations

Robust Fine-Tuning of Zero-Shot Models

CVPR 2022
0
citations

Objaverse: A Universe of Annotated 3D Objects

CVPR 2023
0
citations

Phone2Proc: Bringing Robust Robots Into Our Chaotic World

CVPR 2023
0
citations

Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing

ICCV 2015
0
citations

Generating Notifications for Missing Actions: Don't Forget to Turn the Lights Off!

ICCV 2015
0
citations

Visual Semantic Planning Using Deep Successor Representations

ICCV 2017
0
citations

See the Glass Half Full: Reasoning About Liquid Containers, Their Volume and Content

ICCV 2017
0
citations

What Does a Platypus Look Like? Generating Customized Prompts for Zero-Shot Image Classification

ICCV 2023
0
citations

Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement

ICCV 2023
0
citations

Grounded Situation Recognition

ECCV 2020
0
citations

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

ECCV 2020
0
citations

VisualCOMET: Reasoning about the Dynamic Context of a Still Image

ECCV 2020
0
citations

Break and Make: Interactive Structural Understanding Using LEGO Bricks

ECCV 2022
0
citations

Object Manipulation via Visual Target Localization

ECCV 2022
0
citations

Visalogy: Answering Visual Analogy Questions

NeurIPS 2015
0
citations

LCNN: Lookup-Based Convolutional Neural Network

CVPR 2017
0
citations

Synthetic Visual Genome

CVPR 2025
0
citations

Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation

CVPR 2025
0
citations

Contrastive Flow Matching

ICCV 2025
0
citations

Defending Against Neural Fake News

NeurIPS 2019
0
citations

Discovering Neural Wirings

NeurIPS 2019
0
citations

Supermasks in Superposition

NeurIPS 2020
0
citations

MERLOT: Multimodal Neural Script Knowledge Models

NeurIPS 2021
0
citations

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

NeurIPS 2021
0
citations

Patching open-vocabulary models by interpolating weights

NeurIPS 2022
0
citations

Matryoshka Representation Learning

NeurIPS 2022
0
citations

Stable and low-precision training for large-scale vision-language models

NeurIPS 2023
0
citations

Localized Symbolic Knowledge Distillation for Visual Commonsense Models

NeurIPS 2023
0
citations

DataComp: In search of the next generation of multimodal datasets

NeurIPS 2023
0
citations

Objaverse-XL: A Universe of 10M+ 3D Objects

NeurIPS 2023
0
citations

Neural Priming for Sample-Efficient Adaptation

NeurIPS 2023
0
citations

On the Connection between Pre-training Data Diversity and Fine-tuning Robustness

NeurIPS 2023
0
citations

AdANNS: A Framework for Adaptive Semantic Search

NeurIPS 2023
0
citations

Unsupervised Deep Embedding for Clustering Analysis

ICML 2016
0
citations