Poster "on-policy reinforcement learning" Papers
4 papers found
Absolute Policy Optimization: Enhancing Lower Probability Bound of Performance with High Confidence
Weiye Zhao, Feihan Li, Yifan Sun et al.
ICML 2024 poster
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Fahim Tajwar, Anikait Singh, Archit Sharma et al.
ICML 2024 poster
Reflective Policy Optimization
Yaozhong Gan, Yan Renye, Zhe Wu et al.
ICML 2024 poster
SAPG: Split and Aggregate Policy Gradients
Jayesh Singla, Ananye Agarwal, Deepak Pathak
ICML 2024 poster