"transformer language models" Papers
5 papers found
Extrapolation by Association: Length Generalization Transfer In Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.
NeurIPS 2025 (spotlight), arXiv:2506.09251
7 citations
Matrix Product Sketching via Coordinated Sampling
Majid Daliri, Juliana Freire, Danrong Li et al.
ICLR 2025 (poster), arXiv:2501.17836
2 citations
Residual Stream Analysis with Multi-Layer SAEs
Tim Lawson, Lucy Farnik, Conor Houghton et al.
ICLR 2025 (poster), arXiv:2409.04185
11 citations
Codebook Features: Sparse and Discrete Interpretability for Neural Networks
Alex Tamkin, Mohammad Taufeeque, Noah Goodman
ICML 2024 (poster)
Observable Propagation: Uncovering Feature Vectors in Transformers
Jacob Dunefsky, Arman Cohan
ICML 2024 (poster)