2025 "human preference modeling" Papers
2 papers found
Advantage-Guided Distillation for Preference Alignment in Small Language Models
Shiping Gao, Fanqi Wan, Jiajian Guo et al.
ICLR 2025 (poster) · arXiv:2502.17927
4 citations
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations
Thomas Tian, Kratarth Goel
ICLR 2025 (poster) · arXiv:2503.20105
4 citations