ICML Poster Papers: "language model alignment"
11 papers found
BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback
Gaurav Pandey, Yatin Nandwani, Tahira Naseem et al.
ICML 2024 (poster) · arXiv:2402.02479
Controlled Decoding from Language Models
Sidharth Mudgal, Jong Lee, Harish Ganapathy et al.
ICML 2024 (poster) · arXiv:2310.17022
Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration
Youngsoo Jang, Geon-Hyeong Kim, Byoungjip Kim et al.
ICML 2024 (poster)
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello, Zhaohan Guo, Rémi Munos et al.
ICML 2024 (poster) · arXiv:2403.08635
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
Banghua Zhu, Michael Jordan, Jiantao Jiao
ICML 2024 (poster) · arXiv:2401.16335
Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback
Songyang Gao, Qiming Ge, Wei Shen et al.
ICML 2024 (poster) · arXiv:2401.11458
MaxMin-RLHF: Alignment with Diverse Human Preferences
Souradip Chakraborty, Jiahao Qiu, Hui Yuan et al.
ICML 2024 (poster) · arXiv:2402.08925
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Lichang Chen, Chen Zhu, Jiuhai Chen et al.
ICML 2024 (poster) · arXiv:2402.07319
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan
ICML 2024 (poster) · arXiv:2403.00409
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Zixiang Chen, Yihe Deng, Huizhuo Yuan et al.
ICML 2024 (poster)
Towards Efficient Exact Optimization of Language Model Alignment
Haozhe Ji, Cheng Lu, Yilin Niu et al.
ICML 2024 (poster) · arXiv:2402.00856