Li Fei-Fei

68 Papers
595 Total Citations

Papers (68)

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

CVPR 2025
342
citations

Learning Semantic Relationships for Better Action Retrieval in Images

CVPR 2015
114
citations

ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image

CVPR 2024
85
citations

Re-thinking Temporal Search for Long-Form Video Understanding

CVPR 2025
36
citations

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

CVPR 2024
14
citations

Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation

ICCV 2025
4
citations

Image Retrieval Using Scene Graphs

CVPR 2015
0
citations

Fine-Grained Recognition Without Part Annotations

CVPR 2015
0
citations

Social LSTM: Human Trajectory Prediction in Crowded Spaces

CVPR 2016
0
citations

Recurrent Attention Models for Depth-Based Person Identification

CVPR 2016
0
citations

End-To-End Learning of Action Detection From Frame Glimpses in Videos

CVPR 2016
0
citations

Detecting Events and Key Actors in Multi-Person Videos

CVPR 2016
0
citations

DenseCap: Fully Convolutional Localization Networks for Dense Captioning

CVPR 2016
0
citations

Visual7W: Grounded Question Answering in Images

CVPR 2016
0
citations

A Hierarchical Approach for Generating Descriptive Image Paragraphs

CVPR 2017
0
citations

Knowledge Acquisition for Visual Question Answering via Iterative Querying

CVPR 2017
0
citations

Jointly Learning Energy Expenditures and Activities Using Egocentric Multimodal Signals

CVPR 2017
0
citations

Unsupervised Visual-Linguistic Reference Resolution in Instructional Videos

CVPR 2017
0
citations

Unsupervised Learning of Long-Term Motion Dynamics for Videos

CVPR 2017
0
citations

CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning

CVPR 2017
0
citations

Learning to Learn From Noisy Web Videos

CVPR 2017
0
citations

Scene Graph Generation by Iterative Message Passing

CVPR 2017
0
citations

Image Generation From Scene Graphs

CVPR 2018
0
citations

Social GAN: Socially Acceptable Trajectories With Generative Adversarial Networks

CVPR 2018
0
citations

Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

CVPR 2018
0
citations

Referring Relationships

CVPR 2018
0
citations

Iterative Visual Reasoning Beyond Convolutions

CVPR 2018
0
citations

What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets

CVPR 2018
0
citations

Thoracic Disease Identification and Localization With Limited Supervision

CVPR 2018
0
citations

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

CVPR 2019
0
citations

Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks

CVPR 2019
0
citations

Information Maximizing Visual Question Generation

CVPR 2019
0
citations

DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion

CVPR 2019
0
citations

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation

CVPR 2019
0
citations

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos

CVPR 2019
0
citations

Composing Text and Image for Image Retrieval - an Empirical Odyssey

CVPR 2019
0
citations

Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration

CVPR 2019
0
citations

Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs

CVPR 2020
0
citations

Greedy Hierarchical Variational Autoencoders for Large-Scale Video Prediction

CVPR 2021
0
citations

Metadata Normalization

CVPR 2021
0
citations

Scalable Differential Privacy With Sparse Network Finetuning

CVPR 2021
0
citations

ObjectFolder 2.0: A Multisensory Object Dataset for Sim2Real Transfer

CVPR 2022
0
citations

Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning

CVPR 2022
0
citations

Revisiting the "Video" in Video-Language Understanding

CVPR 2022
0
citations

The ObjectFolder Benchmark: Multisensory Learning With Neural and Real Objects

CVPR 2023
0
citations

The Language of Motion: Unifying Verbal and Non-verbal Language of 3D Human Motion

CVPR 2025
0
citations

RGB-W: When Vision Meets Wireless

ICCV 2015
0
citations

Learning Temporal Embeddings for Complex Video Analysis

ICCV 2015
0
citations

Love Thy Neighbors: Image Annotation by Exploiting Image Metadata

ICCV 2015
0
citations

Visual Semantic Planning Using Deep Successor Representations

ICCV 2017
0
citations

Dense-Captioning Events in Videos

ICCV 2017
0
citations

Fine-Grained Recognition in the Wild: A Multi-Task Domain Adaptation Approach

ICCV 2017
0
citations

Inferring and Executing Programs for Visual Reasoning

ICCV 2017
0
citations

Characterizing and Improving Stability in Neural Style Transfer

ICCV 2017
0
citations

Scene Graph Prediction With Limited Labels

ICCV 2019
0
citations

Situational Fusion of Visual Representation for Visual Navigation

ICCV 2019
0
citations

Rendering Humans from Object-Occluded Monocular Videos

ICCV 2023
0
citations

Procedure Planning in Instructional Videos

ECCV 2020
0
citations

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition

ECCV 2020
0
citations

PrivHAR: Recognizing Human Actions from Privacy-Preserving Lens

ECCV 2022
0
citations

Improving Image Classification With Location Context

ICCV 2015
0
citations

Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization

ICCV 2025
0
citations

WorldScore: Unified Evaluation Benchmark for World Generation

ICCV 2025
0
citations

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

ICML 2024
0
citations

Best of Both Worlds: Human-Machine Collaboration for Object Annotation

CVPR 2015
0
citations

Deep Visual-Semantic Alignments for Generating Image Descriptions

CVPR 2015
0
citations

MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels

ICML 2018
0
citations

Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?

ICML 2018
0
citations