Yuanzhao Zhai
Papers: 4
Total Citations: 24
Papers (4)
Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models
AAAI 2025
21 citations
Correcting Large Language Model Behavior via Influence Function
AAAI 2025
3 citations
Optimistic Model Rollouts for Pessimistic Offline Policy Optimization
AAAI 2024 (arXiv)
0 citations
Iterative Regularized Policy Optimization with Imperfect Demonstrations
ICML 2024
0 citations