NeurIPS Poster "reward optimization" Papers
2 papers found
Alignment of Large Language Models with Constrained Learning
Botong Zhang, Shuo Li, Ignacio Hounie et al.
NeurIPS 2025 · poster · arXiv:2505.19387
2 citations
Reducing the Probability of Undesirable Outputs in Language Models Using Probabilistic Inference
Stephen Zhao, Aidan Li, Rob Brekelmans et al.
NeurIPS 2025 · poster · arXiv:2510.21184