Trevor Darrell
33 Papers
1,069 Total Citations
Papers (33)
Sequential Modeling Enables Scalable Learning for Large Vision Models
CVPR 2024
230
citations
Compositional Chain-of-Thought Prompting for Large Multimodal Models
CVPR 2024
167
citations
Navigation World Models
CVPR 2025
136
citations
Self-correcting LLM-controlled Diffusion Models
CVPR 2024
95
citations
LLM-grounded Video Diffusion Models
ICLR 2024
76
citations
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
CVPR 2024
71
citations
When Do We Not Need Larger Vision Models?
ECCV 2024
70
citations
Describing Differences in Image Sets with Natural Language
CVPR 2024
51
citations
Describe Anything: Detailed Localized Image and Video Captioning
ICCV 2025
49
citations
VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation
CVPR 2024
36
citations
Pre-training Auto-regressive Robotic Models with 4D Representations
ICML 2025
19
citations
PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor
CVPR 2024
17
citations
VisionArena: 230k Real World User-VLM Conversations with Preference Labels
CVPR 2025
12
citations
Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling
NeurIPS 2025
10
citations
Recursive Visual Programming
ECCV 2024
10
citations
Vision-Language Models Create Cross-Modal Task Representations
ICML 2025
7
citations
Dual-Process Image Generation
ICCV 2025
6
citations
LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery
NeurIPS 2025
4
citations
Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features
ICCV 2025
3
citations
Stochastic positional embeddings improve masked image modeling
ICML 2024
0
citations
Scaling Vision Pre-Training to 4K Resolution
CVPR 2025
0
citations
Visual Lexicon: Rich Image Features in Language Space
CVPR 2025
0
citations
Pose Priors from Language Models
CVPR 2025
0
citations
AutoPresent: Designing Structured Visuals from Scratch
CVPR 2025
0
citations
St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World
ICCV 2025
0
citations
Discovering Divergent Representations between Text-to-Image Models
ICCV 2025
0
citations
InstanceDiffusion: Instance-level Control for Image Generation
CVPR 2024
0
citations
See Say and Segment: Teaching LMMs to Overcome False Premises
CVPR 2024
0
citations
Unsupervised Universal Image Segmentation
CVPR 2024
0
citations
Readout Guidance: Learning Control from Diffusion Features
CVPR 2024
0
citations
Hyperbolic Active Learning for Semantic Segmentation under Domain Shift
ICML 2024
0
citations
xT: Nested Tokenization for Larger Context in Large Images
ICML 2024
0
citations
Position: Near to Mid-term Risks and Opportunities of Open-Source Generative AI
ICML 2024
0
citations