Atticus Geiger
Papers: 5
Total Citations: 109
Papers (5)
AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders
ICML 2025, arXiv
100 citations
MIB: A Mechanistic Interpretability Benchmark
ICML 2025, arXiv
9 citations
Causal Abstractions of Neural Networks
NeurIPS 2021, arXiv
0 citations
CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior
NeurIPS 2022, arXiv
0 citations
Interpretability at Scale: Identifying Causal Mechanisms in Alpaca
NeurIPS 2023, arXiv
0 citations