Xuezhou Zhang
10 Papers
19 Total Citations
Papers (10)
Accelerating RL for LLM Reasoning with Optimal Advantage Regression
NeurIPS 2025, arXiv
12
citations
Avoiding exp(R) scaling in RLHF through Preference-based Exploration
NeurIPS 2025
3
citations
Efficient Reinforcement Learning in Probabilistic Reward Machines
AAAI 2025, arXiv
2
citations
Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption
AAAI 2024
2
citations
Task-agnostic Exploration in Reinforcement Learning
NeurIPS 2020, arXiv
0
citations
Neural Additive Models: Interpretable Machine Learning with Neural Nets
NeurIPS 2021, arXiv
0
citations
Decentralized Gossip-Based Stochastic Bilevel Optimization over Communication Networks
NeurIPS 2022, arXiv
0
citations
Provable Defense against Backdoor Policies in Reinforcement Learning
NeurIPS 2022, arXiv
0
citations
Bandit Theory and Thompson Sampling-Guided Directed Evolution for Sequence Optimization
NeurIPS 2022, arXiv
0
citations
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
NeurIPS 2023, arXiv
0
citations