NEURIPS 2025 "reward optimization" Papers
3 papers found
Alignment of Large Language Models with Constrained Learning
Botong Zhang, Shuo Li, Ignacio Hounie et al.
NEURIPS 2025 · poster · arXiv:2505.19387
2 citations
Reducing the Probability of Undesirable Outputs in Language Models Using Probabilistic Inference
Stephen Zhao, Aidan Li, Rob Brekelmans et al.
NEURIPS 2025 · poster · arXiv:2510.21184
Understanding Data Influence in Reinforcement Finetuning
Haoru Tan, Xiuzhe Wu, Sitong Wu et al.
NEURIPS 2025 · oral