Trevor Darrell

33 Papers

1,069 Total Citations

Papers (33)

Sequential Modeling Enables Scalable Learning for Large Vision Models

Compositional Chain-of-Thought Prompting for Large Multimodal Models

Navigation World Models

Self-correcting LLM-controlled Diffusion Models

LLM-grounded Video Diffusion Models

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

When Do We Not Need Larger Vision Models?

Describing Differences in Image Sets with Natural Language

Describe Anything: Detailed Localized Image and Video Captioning

VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation

Pre-training Auto-regressive Robotic Models with 4D Representations

PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor

VisionArena: 230k Real World User-VLM Conversations with Preference Labels

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

NeurIPS 2025 (arXiv)

Recursive Visual Programming

Vision-Language Models Create Cross-Modal Task Representations

Dual-Process Image Generation

LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery

Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features

Stochastic positional embeddings improve masked image modeling

Scaling Vision Pre-Training to 4K Resolution

Visual Lexicon: Rich Image Features in Language Space

Pose Priors from Language Models

AutoPresent: Designing Structured Visuals from Scratch

St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World

Discovering Divergent Representations between Text-to-Image Models

InstanceDiffusion: Instance-level Control for Image Generation

See, Say, and Segment: Teaching LMMs to Overcome False Premises

Unsupervised Universal Image Segmentation

Readout Guidance: Learning Control from Diffusion Features

Hyperbolic Active Learning for Semantic Segmentation under Domain Shift

xT: Nested Tokenization for Larger Context in Large Images

Position: Near to Mid-term Risks and Opportunities of Open-Source Generative AI