Poster Papers by Ruotian Ma
2 papers found
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
Jiaqi Chen, Bang Zhang, Ruotian Ma et al.
NeurIPS 2025, poster, arXiv:2504.19162
21 citations
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
Mengru Wang, Xingyu Chen, Yue Wang et al.
NeurIPS 2025, poster, arXiv:2505.14681
8 citations