Zhaolin Gao
4 Papers
31 Total Citations
Papers (4)
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
ICLR 2025
14 citations
$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training
NeurIPS 2025
10 citations
Value-Guided Search for Efficient Chain-of-Thought Reasoning
NeurIPS 2025
7 citations
Shoestring: Graph-Based Semi-Supervised Classification With Severely Limited Labeled Data
CVPR 2020
0 citations