Poster "direct policy optimization" Papers
2 papers found
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching
Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother et al.
ICLR 2025 | poster | arXiv:2411.07007
6 citations
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello, Zhaohan Guo, Rémi Munos et al.
ICML 2024 | poster