Kristen Grauman

86
Papers
106
Total Citations

Papers (86)

Learning Object State Changes in Videos: An Open-World Perspective

CVPR 2024
33
citations

Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos

ECCV 2024
19
citations

ExpertAF: Expert Actionable Feedback from Video

CVPR 2025
11
citations

SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos

CVPR 2024
11
citations

Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos

CVPR 2024
8
citations

Detours for Navigating Instructional Videos

CVPR 2024
7
citations

Progress-Aware Video Frame Captioning

CVPR 2025
7
citations

When Thinking Drifts: Evidential Grounding for Robust Video Reasoning

NeurIPS 2025
4
citations

Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos

CVPR 2025
3
citations

FIction: 4D Future Interaction Prediction from Video

CVPR 2025
3
citations

Seeing Invisible Poses: Estimating 3D Body Pose From Egocentric Video

CVPR 2017
0
citations

Detangling People: Individuating Multiple Close People and Their Body Parts via Region Assembly

CVPR 2017
0
citations

Making 360° Video Watchable in 2D: Learning Videography for Click Free Viewing

CVPR 2017
0
citations

Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks

CVPR 2018
0
citations

Compare and Contrast: Learning Prominent Visual Differences

CVPR 2018
0
citations

VizWiz Grand Challenge: Answering Visual Questions From Blind People

CVPR 2018
0
citations

Im2Flow: Motion Hallucination From Static Images for Action Recognition

CVPR 2018
0
citations

Creating Capsule Wardrobes From Fashion Images

CVPR 2018
0
citations

Learning Compressible 360° Video Isomers

CVPR 2018
0
citations

BlockDrop: Dynamic Inference Paths in Residual Networks

CVPR 2018
0
citations

2.5D Visual Sound

CVPR 2019
0
citations

Thinking Outside the Pool: Active Training Image Creation for Relative Attributes

CVPR 2019
0
citations

Less Is More: Learning Highlight Detection From Video Duration

CVPR 2019
0
citations

Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion

CVPR 2019
0
citations

SpotTune: Transfer Learning Through Adaptive Fine-Tuning

CVPR 2019
0
citations

Kernel Transformer Networks for Compact Spherical Convolution

CVPR 2019
0
citations

Ego-Topo: Environment Affordances From Egocentric Video

CVPR 2020
0
citations

ViBE: Dressing for Diverse Body Shapes

CVPR 2020
0
citations

You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions

CVPR 2020
0
citations

Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias

CVPR 2020
0
citations

Listen to Look: Action Recognition by Previewing Audio

CVPR 2020
0
citations

From Paris to Berlin: Discovering Fashion Style Influences Around the World

CVPR 2020
0
citations

Ego-Exo: Transferring Visual Representations From Third-Person to First-Person Videos

CVPR 2021
0
citations

Semantic Audio-Visual Navigation

CVPR 2021
0
citations

Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback

CVPR 2021
0
citations

VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency

CVPR 2021
0
citations

PONI: Potential Functions for ObjectGoal Navigation With Interaction-Free Learning

CVPR 2022
0
citations

Zero Experience Required: Plug & Play Modular Transfer Learning for Semantic Visual Navigation

CVPR 2022
0
citations

Visual Acoustic Matching

CVPR 2022
0
citations

Ego4D: Around the World in 3,000 Hours of Egocentric Video

CVPR 2022
0
citations

Chat2Map: Efficient Scene Mapping From Multi-Ego Conversations

CVPR 2023
0
citations

Novel-View Acoustic Synthesis

CVPR 2023
0
citations

NaQ: Leveraging Narrations As Queries To Supervise Episodic Memory

CVPR 2023
0
citations

HierVL: Learning Hierarchical Video-Language Embeddings

CVPR 2023
0
citations

Learning Image Representations Tied to Ego-Motion

ICCV 2015
0
citations

Just Noticeable Differences in Visual Attributes

ICCV 2015
0
citations

Fashion Forward: Forecasting Visual Style in Fashion

ICCV 2017
0
citations

On-Demand Learning for Deep Image Restoration

ICCV 2017
0
citations

Learning the Latent "Look": Unsupervised Discovery of a Style-Coherent Embedding From Fashion Images

ICCV 2017
0
citations

Semantic Jitter: Dense Supervision for Visual Comparisons via Synthetic Images

ICCV 2017
0
citations

Co-Separating Sounds of Visual Objects

ICCV 2019
0
citations

Fashion++: Minimal Edits for Outfit Improvement

ICCV 2019
0
citations

Grounded Human-Object Interaction Hotspots From Video

ICCV 2019
0
citations

From Culture to Clothing: Discovering the World Events Behind a Century of Fashion Images

ICCV 2021
0
citations

Move2Hear: Active Audio-Visual Source Separation

ICCV 2021
0
citations

Multiview Pseudo-Labeling for Semi-Supervised Learning From Video

ICCV 2021
0
citations

Audio-Visual Floorplan Reconstruction

ICCV 2021
0
citations

Anticipative Video Transformer

ICCV 2021
0
citations

Occupancy Anticipation for Efficient Exploration and Navigation

ECCV 2020
0
citations

SoundSpaces: Audio-Visual Navigation in 3D Environments

ECCV 2020
0
citations

VisualEchoes: Spatial Image Representation Learning through Echolocation

ECCV 2020
0
citations

Proposal-based Video Completion

ECCV 2020
0
citations

Egocentric Activity Recognition and Localization on a 3D Map

ECCV 2022
0
citations

Active Audio-Visual Separation of Dynamic Sound Sources

ECCV 2022
0
citations

Learning Spherical Convolution for Fast Features from 360° Imagery

NeurIPS 2017
0
citations

Egocentric Video Task Translation

CVPR 2023
0
citations

Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning

CVPR 2025
0
citations

Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos

ICCV 2025
0
citations

Learning Skill-Attributes for Transferable Assessment in Video

NeurIPS 2025
0
citations

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

CVPR 2024
0
citations

Pull the Plug? Predicting If Computers or Humans Should Segment Images

CVPR 2016
0
citations

Summary Transfer: Exemplar-Based Subset Selection for Video Summarization

CVPR 2016
0
citations

Active Image Segmentation Propagation

CVPR 2016
0
citations

Slow and Steady Feature Analysis: Higher Order Temporal Coherence in Video

CVPR 2016
0
citations

FusionSeg: Learning to Combine Motion and Appearance for Fully Automatic Segmentation of Generic Objects in Videos

CVPR 2017
0
citations

Learning Affordance Landscapes for Interaction Exploration in 3D Environments

NeurIPS 2020
0
citations

Shaping embodied agent behavior with activity-context priors from egocentric video

NeurIPS 2021
0
citations

Few-Shot Audio-Visual Learning of Environment Acoustics

NeurIPS 2022
0
citations

SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning

NeurIPS 2022
0
citations

Single-Stage Visual Query Localization in Egocentric Videos

NeurIPS 2023
0
citations

Self-Supervised Visual Acoustic Matching

NeurIPS 2023
0
citations

EgoDistill: Egocentric Head Motion Distillation for Efficient Video Understanding

NeurIPS 2023
0
citations

Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment

NeurIPS 2023
0
citations

EgoEnv: Human-centric environment representations from egocentric video

NeurIPS 2023
0
citations

Video-Mined Task Graphs for Keystep Recognition in Instructional Videos

NeurIPS 2023
0
citations

EgoTracks: A Long-term Egocentric Visual Object Tracking Dataset

NeurIPS 2023
0
citations