Shenao Zhang
7 Papers, 13 Total Citations
Papers (7)
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning
ICML 2025, arXiv
8
citations
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
ICML 2025, arXiv
5
citations
Adaptive-Gradient Policy Optimization: Enhancing Policy Learning in Non-Smooth Differentiable Simulations
ICML 2024
0
citations
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
ICML 2024
0
citations
Conservative Dual Policy Optimization for Efficient Model-Based Reinforcement Learning
NeurIPS 2022, arXiv
0
citations
Maximize to Explore: One Objective Function Fusing Estimation, Planning, and Exploration
NeurIPS 2023, arXiv
0
citations
Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms
NeurIPS 2023, arXiv
0
citations