Anca Dragan
25 papers, 84 total citations
Papers (25)
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
ICLR 2025
41
citations
Learning Optimal Advantage from Preferences and Mistaking It for Reward
AAAI 2024
15
citations
Context Steering: Controllable Personalization at Inference Time
ICLR 2025
11
citations
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
ICLR 2025
8
citations
The Effective Horizon Explains Deep RL Performance in Stochastic Environments
ICLR 2024
5
citations
AssistanceZero: Scalably Solving Assistance Games
ICML 2025
4
citations
Learning to Model the World With Language
ICML 2024
0
citations
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
ICML 2024
0
citations
Cooperative Inverse Reinforcement Learning
NeurIPS 2016
0
citations
Inverse Reward Design
NeurIPS 2017
0
citations
AI Alignment with Changing and Influenceable Reward Functions
ICML 2024
0
citations
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation
ICML 2024
0
citations
On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference
ICML 2019
0
citations
Learning a Prior over Intent via Meta-Inverse Reinforcement Learning
ICML 2019
0
citations
Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior
NeurIPS 2018
0
citations
On the Utility of Learning about Humans for Human-AI Coordination
NeurIPS 2019
0
citations
Reward-rational (implicit) choice: A unifying formalism for reward learning
NeurIPS 2020
0
citations
AvE: Assistance via Empowerment
NeurIPS 2020
0
citations
Preference learning along multiple criteria: A game-theoretic perspective
NeurIPS 2020
0
citations
Pragmatic Image Compression for Human-in-the-Loop Decision-Making
NeurIPS 2021
0
citations
First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization
NeurIPS 2022
0
citations
Uni[MASK]: Unified Inference in Sequential Decision Problems
NeurIPS 2022
0
citations
Learning to Influence Human Behavior with Offline Reinforcement Learning
NeurIPS 2023
0
citations
Bridging RL Theory and Practice with the Effective Horizon
NeurIPS 2023
0
citations
An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning
ICML 2018
0
citations