Johan Ferret

Papers: 4

Total Citations: 50

Papers (4)

BOND: Aligning LLMs with Best-of-N Distillation

WARM: On the Benefits of Weight Averaged Reward Models

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

There Is No Turning Back: A Self-Supervised Approach for Reversibility-Aware Reinforcement Learning