"supervised fine-tuning" Papers
2 papers found
PokerBench: Training Large Language Models to Become Professional Poker Players
Richard Zhuang, Akshat Gupta, Richard Yang et al.
AAAI 2025, arXiv:2501.08328, 8 citations
Preference Ranking Optimization for Human Alignment
Feifan Song, Bowen Yu, Minghao Li et al.
AAAI 2024, arXiv:2306.17492, 335 citations