David Krueger
10
Papers
30
Total Citations
Papers (10)
Pitfalls of Evidence-Based AI Policy
ICLR 2025 · arXiv
14
citations
Detecting High-Stakes Interactions with Activation Probes
NeurIPS 2025
13
citations
From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization
NeurIPS 2025
1
citations
Input Space Mode Connectivity in Deep Neural Networks
ICLR 2025
1
citations
PoisonBench: Assessing Language Model Vulnerability to Poisoned Preference Data
ICML 2025
1
citations
Implicit meta-learning may lead language models to trust more reliable sources
ICML 2024
0
citations
Defining and Characterizing Reward Gaming
NeurIPS 2022
0
citations
Thinker: Learning to Plan and Act
NeurIPS 2023
0
citations
A Closer Look at Memorization in Deep Networks
ICML 2017
0
citations
Neural Autoregressive Flows
ICML 2018
0
citations