2025 Spotlight "transformer language models" Papers
2 papers found
Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
Tyler Chang, Benjamin Bergen
NeurIPS 2025, Spotlight, arXiv:2504.15471
1 citation
Extrapolation by Association: Length Generalization Transfer In Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.
NeurIPS 2025, Spotlight, arXiv:2506.09251
7 citations