Spotlight "mechanistic interpretability" Papers
2 papers found
A Implies B: Circuit Analysis in LLMs for Propositional Logical Reasoning
Guan Zhe Hong, Nishanth Dikkala, Enming Luo et al.
NeurIPS 2025 (spotlight), arXiv:2411.04105
3 citations
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation
Aaditya Singh, Ted Moskovitz, Felix Hill et al.
ICML 2024 (spotlight)