Lihong Li
4 Papers
0 Total Citations
Papers (4)
Ask a Strong LLM Judge when Your Reward Model is Uncertain
NeurIPS 2025, arXiv
0 citations
Off-Policy Evaluation via the Regularized Lagrangian
NeurIPS 2020, arXiv
0 citations
CoinDICE: Off-Policy Confidence Interval Estimation
NeurIPS 2020, arXiv
0 citations
Escaping the Gravitational Pull of Softmax
NeurIPS 2020
0 citations