He He
Papers: 3
Total Citations: 78
Papers (3)
Language Models Learn to Mislead Humans via RLHF
ICLR 2025, arXiv, 73 citations

Predicting Empirical AI Research Outcomes with Language Models
NeurIPS 2025, 5 citations

Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
ICML 2024, 0 citations