"preference optimization methods" Papers
3 papers found
Calibrating Translation Decoding with Quality Estimation on LLMs
Di Wu, Yibin Lei, Christof Monz
NeurIPS 2025 (poster), arXiv:2504.19044
Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective
Ruichen Shao, Bei Li, Gangao Liu et al.
ICLR 2025 (oral), arXiv:2502.14340
7 citations
Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking
Benjamin Feuer, Micah Goldblum, Teresa Datta et al.
ICLR 2025 (poster), arXiv:2409.15268
27 citations