Shi Feng
4
Papers
78
Total Citations
Papers (4)
Language Models Learn to Mislead Humans via RLHF
ICLR 2025, arXiv
73
citations
Predicting Empirical AI Research Outcomes with Language Models
NeurIPS 2025
5
citations
Peer Prediction for Learning Agents
NeurIPS 2022
0
citations
Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation
ICML 2019
0
citations