Stuart Russell
6 Papers
34 Total Citations
Papers (6)
Monitoring Latent World States in Language Models with Propositional Probes
ICLR 2025, 21 citations
Diffusion On Syntax Trees For Program Synthesis
ICLR 2025, 9 citations
AssistanceZero: Scalably Solving Assistance Games
ICML 2025, 4 citations
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
ICML 2024, 0 citations
AI Alignment with Changing and Influenceable Reward Functions
ICML 2024, 0 citations
Position: Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback
ICML 2024, 0 citations