Nino Vieillard

Papers: 5

Total Citations: 57

Papers (5)

BOND: Aligning LLMs with Best-of-N Distillation

Loss Functions and Operators Generated by f-Divergences

WARM: On the Benefits of Weight Averaged Reward Models

Munchausen Reinforcement Learning

Leverage the Average: an Analysis of KL Regularization in Reinforcement Learning