Sergey Levine

27
Papers
538
Total Citations

Papers (27)

OGBench: Benchmarking Offline Goal-Conditioned RL

ICLR 2025
74
citations

METRA: Scalable Unsupervised RL with Metric-Aware Abstraction

ICLR 2024
68
citations

Scaling Test-Time Compute Without Verification or RL is Suboptimal

ICML 2025
68
citations

LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models

ICML 2025
63
citations

Flow Q-Learning

ICML 2025
52
citations

Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better

NeurIPS 2025
46
citations

Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design

ICLR 2025
40
citations

Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data

ICLR 2025
27
citations

RLIF: Interactive Imitation Learning as Reinforcement Learning

ICLR 2024
25
citations

Language Guided Skill Discovery

ICLR 2025
14
citations

Adding Conditional Control to Diffusion Models with Reinforcement Learning

ICLR 2025
13
citations

Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design

ICML 2025
12
citations

Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data

ICLR 2024
11
citations

Prioritized Generative Replay

ICLR 2025
9
citations

Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning

ICLR 2025
8
citations

Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration

ICML 2025
7
citations

Behavioral Exploration: Learning to Explore via In-Context Adaptation

ICML 2025
1
citation

Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models

ICML 2024
0
citations

Feedback Efficient Online Fine-Tuning of Diffusion Models

ICML 2024
0
citations

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

ICML 2024
0
citations

Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making

ICML 2024
0
citations

Learning to Explore in POMDPs with Informational Rewards

ICML 2024
0
citations

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

ICML 2024
0
citations

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

ICML 2024
0
citations

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

ICML 2024
0
citations

Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings

ICML 2024
0
citations

Foundation Policies with Hilbert Representations

ICML 2024
0
citations