"post-training alignment" Papers
2 papers found
Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations
Thomas Tian, Kratarth Goel
ICLR 2025, poster, arXiv:2503.20105
4 citations
Self Iterative Label Refinement via Robust Unlabeled Learning
Hikaru Asano, Tadashi Kozuno, Yukino Baba
NeurIPS 2025, poster, arXiv:2502.12565
1 citation