Sergey Levine
27
Papers
538
Total Citations
Papers (27)
OGBench: Benchmarking Offline Goal-Conditioned RL
ICLR 2025 (arXiv)
74
citations
METRA: Scalable Unsupervised RL with Metric-Aware Abstraction
ICLR 2024
68
citations
Scaling Test-Time Compute Without Verification or RL is Suboptimal
ICML 2025
68
citations
LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models
ICML 2025
63
citations
Flow Q-Learning
ICML 2025
52
citations
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better
NeurIPS 2025 (arXiv)
46
citations
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
ICLR 2025
40
citations
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
ICLR 2025 (arXiv)
27
citations
RLIF: Interactive Imitation Learning as Reinforcement Learning
ICLR 2024
25
citations
Language Guided Skill Discovery
ICLR 2025 (arXiv)
14
citations
Adding Conditional Control to Diffusion Models with Reinforcement Learning
ICLR 2025 (arXiv)
13
citations
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design
ICML 2025
12
citations
Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data
ICLR 2024
11
citations
Prioritized Generative Replay
ICLR 2025
9
citations
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
ICLR 2025
8
citations
Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration
ICML 2025
7
citations
Behavioral Exploration: Learning to Explore via In-Context Adaptation
ICML 2025
1
citation
Prompting is a Double-Edged Sword: Improving Worst-Group Robustness of Foundation Models
ICML 2024
0
citations
Feedback Efficient Online Fine-Tuning of Diffusion Models
ICML 2024
0
citations
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
ICML 2024
0
citations
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
ICML 2024
0
citations
Learning to Explore in POMDPs with Informational Rewards
ICML 2024
0
citations
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
ICML 2024
0
citations
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
ICML 2024
0
citations
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
ICML 2024
0
citations
Unsupervised Zero-Shot Reinforcement Learning via Functional Reward Encodings
ICML 2024
0
citations
Foundation Policies with Hilbert Representations
ICML 2024
0
citations