Kristen Grauman
15 Papers · 146 Total Citations

Papers (15)
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding. NeurIPS 2025. 40 citations.
Learning Object State Changes in Videos: An Open-World Perspective. CVPR 2024. 33 citations.
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos. ECCV 2024. 19 citations.
SoundingActions: Learning How Actions Sound from Narrated Egocentric Videos. CVPR 2024. 11 citations.
ExpertAF: Expert Actionable Feedback from Video. CVPR 2025. 11 citations.
Learning Spatial Features from Audio-Visual Correspondence in Egocentric Videos. CVPR 2024. 8 citations.
Progress-Aware Video Frame Captioning. CVPR 2025. 7 citations.
Detours for Navigating Instructional Videos. CVPR 2024. 7 citations.
When Thinking Drifts: Evidential Grounding for Robust Video Reasoning. NeurIPS 2025. 4 citations.
FIction: 4D Future Interaction Prediction from Video. CVPR 2025. 3 citations.
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos. CVPR 2025. 3 citations.
Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos. ICCV 2025. 0 citations.
Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning. CVPR 2025. 0 citations.
Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives. CVPR 2024. 0 citations.
Learning Skill-Attributes for Transferable Assessment in Video. NeurIPS 2025. 0 citations.