Papers by Jiacai Liu
3 papers found
$\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee
Wenye Li, Jiacai Liu, Ke Wei
ICLR 2025 poster
3 citations
DAPO: Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage-Based Policy Optimization
Jiacai Liu, Chaojie Wang, Chris Liu et al.
NeurIPS 2025 spotlight
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
Jiacai Liu, Wenye Li, Dachao Lin et al.
NeurIPS 2025 poster
arXiv:2311.01104
4 citations