Ethan Perez
Papers: 4
Total Citations: 335
Papers (4)
Inverse Scaling: When Bigger Isn't Better. ICLR 2025. 180 citations.
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning. ICLR 2024. 133 citations.
Failures to Find Transferable Image Jailbreaks Between Vision-Language Models. ICLR 2025. 22 citations.
Debating with More Persuasive LLMs Leads to More Truthful Answers. ICML 2024. 0 citations.