Banghua Zhu
8 papers, 384 total citations
Papers (8)
From Crowdsourced Data to High-quality Benchmarks: Arena-Hard and Benchbuilder Pipeline
ICML 2025, 329 citations
How to Evaluate Reward Models for RLHF
ICLR 2025, 50 citations
The Effective Horizon Explains Deep RL Performance in Stochastic Environments
ICLR 2024, 5 citations
Iterative Data Smoothing: Mitigating Reward Overfitting and Overoptimization in RLHF
ICML 2024, 0 citations
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
ICML 2024, 0 citations
Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism
NeurIPS 2021, 0 citations
Doubly-Robust Self-Training
NeurIPS 2023, 0 citations
Towards Optimal Caching and Model Selection for Large Model Inference
NeurIPS 2023, 0 citations