Online-to-Offline RL for Agent Alignment

Citations: 0 · Rank: #2184 of 3827 ICLR 2025 papers · Authors: 5 · Data points: 3

Abstract

Reinforcement learning (RL) has shown remarkable success in training agents to achieve high-performing policies, particularly in domains like Game AI where simulation environments enable efficient interactions. However, despite their success in maximizing returns, such online-trained policies often fail to align with human preferences concerning actions, styles, and values. The challenge lies in efficiently adapting these online-trained policies to align with human preferences, given the scarcity and high cost of collecting human behavior data. In this work, we formalize the problem as online-to-offline RL and propose ALIGNment of Game AI to Preferences (ALIGN-GAP), an innovative approach for aligning well-trained game agents to human preferences. Our method features a carefully designed reward model that encodes human preferences from limited offline data and incorporates curriculum-based preference learning to align RL agents with targeted human preferences. Experiments across diverse environments and preference types demonstrate that ALIGN-GAP achieves effective alignment with human preferences.
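
The abstract names two components: a reward model that encodes human preferences from limited offline data, and curriculum-based preference learning. Since the paper body is not included here, the sketch below only illustrates the first component in its most common generic form, a Bradley-Terry pairwise preference loss over trajectory segments; the class and function names, network architecture, and tensor shapes are all assumptions, not ALIGN-GAP's actual design.

```python
# Minimal sketch of learning a reward model from offline pairwise
# preference data with a Bradley-Terry objective. Generic illustration
# only, NOT the ALIGN-GAP implementation; names and shapes are assumed.
import torch
import torch.nn as nn

class PreferenceRewardModel(nn.Module):
    """Maps a (state, action) pair to a scalar reward estimate."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        # obs: (batch, horizon, obs_dim), act: (batch, horizon, act_dim)
        # returns per-step rewards of shape (batch, horizon)
        return self.net(torch.cat([obs, act], dim=-1)).squeeze(-1)

def preference_loss(model, seg_preferred, seg_rejected):
    """Bradley-Terry loss on a pair of trajectory segments.

    Each segment is an (obs, act) tuple; a segment's return is the sum
    of its per-step predicted rewards.
    """
    r_pref = model(*seg_preferred).sum(dim=-1)  # (batch,)
    r_rej = model(*seg_rejected).sum(dim=-1)
    # P(preferred > rejected) = sigmoid(r_pref - r_rej); maximize log-prob.
    return -torch.nn.functional.logsigmoid(r_pref - r_rej).mean()
```

Under this reading, the learned reward would be used to fine-tune the already-trained online policy toward the preferred behavior, while the curriculum component would schedule which preference pairs are presented as training progresses; that schedule is paper-specific and not sketched here.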

Citation History

Jan 26, 2026: 0
Jan 27, 2026: 0
Jan 27, 2026: 0