Ali Farhadi
68
Papers
119
Total Citations
Papers (68)
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
CVPR 2025
96
citations
DRAWER: Digital Reconstruction and Articulation With Environment Realism
CVPR 2025
13
citations
Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
ICCV 2025
7
citations
Convergent Functions, Divergent Forms
NeurIPS 2025 (arXiv)
3
citations
Discriminative and Consistent Similarities in Instance-Level Multiple Instance Learning
CVPR 2015
0
citations
VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases
CVPR 2015
0
citations
You Only Look Once: Unified, Real-Time Object Detection
CVPR 2016
0
citations
A Task-Oriented Approach for Cost-Sensitive Recognition
CVPR 2016
0
citations
Actions ~ Transformations
CVPR 2016
0
citations
Newtonian Scene Understanding: Unfolding the Dynamics of Objects in Static Images
CVPR 2016
0
citations
Situation Recognition: Visual Semantic Role Labeling for Image Understanding
CVPR 2016
0
citations
Asynchronous Temporal Fields for Action Recognition
CVPR 2017 (arXiv)
0
citations
Are You Smarter Than a Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension
CVPR 2017
0
citations
Commonly Uncommon: Semantic Sparsity in Situation Recognition
CVPR 2017 (arXiv)
0
citations
YOLO9000: Better, Faster, Stronger
CVPR 2017 (arXiv)
0
citations
Structured Set Matching Networks for One-Shot Part Labeling
CVPR 2018 (arXiv)
0
citations
Who Let the Dogs Out? Modeling Dog Behavior From Visual Data
CVPR 2018 (arXiv)
0
citations
IQA: Visual Question Answering in Interactive Environments
CVPR 2018 (arXiv)
0
citations
SeGAN: Segmenting and Generating the Invisible
CVPR 2018 (arXiv)
0
citations
Actor and Observer: Joint Modeling of First and Third-Person Videos
CVPR 2018 (arXiv)
0
citations
ELASTIC: Improving CNNs With Dynamic Scaling Policies
CVPR 2019
0
citations
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
CVPR 2019
0
citations
Two Body Problem: Collaborative Visual Task Completion
CVPR 2019
0
citations
From Recognition to Cognition: Visual Commonsense Reasoning
CVPR 2019
0
citations
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
CVPR 2019
0
citations
Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph
CVPR 2019
0
citations
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
CVPR 2020 (arXiv)
0
citations
RoboTHOR: An Open Simulation-to-Real Embodied AI Platform
CVPR 2020 (arXiv)
0
citations
Butterfly Transform: An Efficient FFT Based Neural Architecture Design
CVPR 2020 (arXiv)
0
citations
What's Hidden in a Randomly Weighted Neural Network?
CVPR 2020
0
citations
Visual Reaction: Learning to Play Catch With Your Drone
CVPR 2020 (arXiv)
0
citations
Pushing It Out of the Way: Interactive Visual Navigation
CVPR 2021 (arXiv)
0
citations
Forward Compatible Training for Large-Scale Embedding Retrieval Systems
CVPR 2022 (arXiv)
0
citations
MERLOT Reserve: Neural Script Knowledge Through Vision and Language and Sound
CVPR 2022 (arXiv)
0
citations
Robust Fine-Tuning of Zero-Shot Models
CVPR 2022 (arXiv)
0
citations
Objaverse: A Universe of Annotated 3D Objects
CVPR 2023 (arXiv)
0
citations
Phone2Proc: Bringing Robust Robots Into Our Chaotic World
CVPR 2023 (arXiv)
0
citations
Segment-Phrase Table for Semantic Segmentation, Visual Entailment and Paraphrasing
ICCV 2015
0
citations
Generating Notifications for Missing Actions: Don't Forget to Turn the Lights Off!
ICCV 2015
0
citations
Visual Semantic Planning Using Deep Successor Representations
ICCV 2017 (arXiv)
0
citations
See the Glass Half Full: Reasoning About Liquid Containers, Their Volume and Content
ICCV 2017 (arXiv)
0
citations
What Does a Platypus Look Like? Generating Customized Prompts for Zero-Shot Image Classification
ICCV 2023 (arXiv)
0
citations
Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Reinforcement
ICCV 2023 (arXiv)
0
citations
Grounded Situation Recognition
ECCV 2020
0
citations
A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks
ECCV 2020
0
citations
VisualCOMET: Reasoning about the Dynamic Context of a Still Image
ECCV 2020
0
citations
Break and Make: Interactive Structural Understanding Using LEGO Bricks
ECCV 2022
0
citations
Object Manipulation via Visual Target Localization
ECCV 2022
0
citations
Visalogy: Answering Visual Analogy Questions
NeurIPS 2015
0
citations
LCNN: Lookup-Based Convolutional Neural Network
CVPR 2017 (arXiv)
0
citations
Synthetic Visual Genome
CVPR 2025
0
citations
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
CVPR 2025
0
citations
Contrastive Flow Matching
ICCV 2025
0
citations
Defending Against Neural Fake News
NeurIPS 2019
0
citations
Discovering Neural Wirings
NeurIPS 2019
0
citations
Supermasks in Superposition
NeurIPS 2020
0
citations
MERLOT: Multimodal Neural Script Knowledge Models
NeurIPS 2021
0
citations
LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes
NeurIPS 2021
0
citations
Patching open-vocabulary models by interpolating weights
NeurIPS 2022
0
citations
Matryoshka Representation Learning
NeurIPS 2022
0
citations
Stable and low-precision training for large-scale vision-language models
NeurIPS 2023
0
citations
Localized Symbolic Knowledge Distillation for Visual Commonsense Models
NeurIPS 2023
0
citations
DataComp: In search of the next generation of multimodal datasets
NeurIPS 2023
0
citations
Objaverse-XL: A Universe of 10M+ 3D Objects
NeurIPS 2023
0
citations
Neural Priming for Sample-Efficient Adaptation
NeurIPS 2023
0
citations
On the Connection between Pre-training Data Diversity and Fine-tuning Robustness
NeurIPS 2023
0
citations
AdANNS: A Framework for Adaptive Semantic Search
NeurIPS 2023
0
citations
Unsupervised Deep Embedding for Clustering Analysis
ICML 2016
0
citations