Bilal Piot
6 Papers · 60 Total Citations
Papers (6)
RRM: Robust Reward Model Training Mitigates Reward Hacking
ICLR 2025 · arXiv · 44 citations

Unlocking the Power of Representations in Long-term Novelty-based Exploration
ICLR 2024 · 9 citations

Learning from negative feedback, or positive feedback or both
ICLR 2025 · arXiv · 7 citations

Nash Learning from Human Feedback
ICML 2024 · 0 citations

Generalized Preference Optimization: A Unified Approach to Offline Alignment
ICML 2024 · 0 citations

Human Alignment of Large Language Models through Online Preference Optimisation
ICML 2024 · 0 citations