Micah Carroll
Papers: 4
Total Citations: 774
Affiliations: 1
Affiliations
UC Berkeley
Papers (4)
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
ICLR 2025, arXiv
733 citations
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
ICLR 2025, arXiv
41 citations
AI Alignment with Changing and Influenceable Reward Functions
ICML 2024, arXiv
0 citations
Uni[MASK]: Unified Inference in Sequential Decision Problems
NeurIPS 2022
0 citations