"induction heads" Papers
5 papers found
From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers
Ryotaro Kawata, Yujin Song, Alberto Bietti et al.
NeurIPS 2025, spotlight, arXiv:2512.18634
1 citation
Selective Induction Heads: How Transformers Select Causal Structures in Context
Francesco D'Angelo, Francesco Croce, Nicolas Flammarion
ICLR 2025, poster, arXiv:2509.08184
4 citations
What One Cannot, Two Can: Two-Layer Transformers Provably Represent Induction Heads on Any-Order Markov Chains
Chanakya Ekbote, Ashok Vardhan Makkuva, Marco Bondaschi et al.
NeurIPS 2025, spotlight, arXiv:2508.07208
Better & Faster Large Language Models via Multi-token Prediction
Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Roziere et al.
ICML 2024, poster
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason Lee
ICML 2024, poster