Aaron Mueller
Papers: 4
Total Citations: 504
Papers (4)
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
ICLR 2025, 252 citations
Inverse Scaling: When Bigger Isn't Better
ICLR 2025, 180 citations
Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics
ICLR 2025, 63 citations
MIB: A Mechanistic Interpretability Benchmark
ICML 2025, 9 citations