"policy regularization" Papers
3 papers found
Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking
Cassidy Laidlaw, Shivam Singhal, Anca Dragan
ICLR 2025posterarXiv:2403.03185
24
citations
Mitigating Reward Over-optimization in Direct Alignment Algorithms with Importance Sampling
Nguyen Phuc, Ngoc-Hieu Nguyen, Duy M. H. Nguyen et al.
NEURIPS 2025posterarXiv:2506.08681
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu, Yang Li, Yixing Lan et al.
ICML 2024posterarXiv:2405.19909