"best-of-n sampling" Papers
3 papers found
BOND: Aligning LLMs with Best-of-N Distillation
Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot-Desenonges et al.
ICLR 2025 poster, arXiv:2407.14622
50 citations
Inference-Time Reward Hacking in Large Language Models
Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling et al.
NeurIPS 2025 spotlight, arXiv:2506.19248
2 citations
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Shufan Li, Konstantinos Kallidromitis, Akash Gokul et al.
ICCV 2025 poster, arXiv:2503.12271
21 citations