Papers by Shuozhe Li
2 papers found
An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning
Haoran Xu, Shuozhe Li, Harshit Sikchi et al.
ICLR 2025, poster, arXiv:2504.13368
2 citations
ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning
Ruiyang Zhou, Shuozhe Li, Amy Zhang et al.
NeurIPS 2025, poster
4 citations