NEURIPS 2025 "multi-turn interaction" Papers
3 papers found
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
Yang Yue, Zhiqi Chen, Rui Lu et al.
NEURIPS 2025, oral, arXiv:2504.13837
483 citations
InterMT: Multi-Turn Interleaved Preference Alignment with Human Feedback
Boyuan Chen, Donghai Hong, Jiaming Ji et al.
NEURIPS 2025, spotlight, arXiv:2505.23950
1 citation
ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning
Ziyu Wan, Yunxiang Li, Xiaoyu Wen et al.
NEURIPS 2025, poster, arXiv:2503.09501
36 citations