2025 "length generalization" Papers
9 papers found
A Formal Framework for Understanding Length Generalization in Transformers
Xinting Huang, Andy Yang, Satwik Bhattamishra et al.
ICLR 2025, poster, arXiv:2410.02140
25 citations
Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Mayank Jobanputra, Yana Veitsman, Yash Sarrof et al.
NeurIPS 2025, poster, arXiv:2505.21785
3 citations
Extrapolation by Association: Length Generalization Transfer In Transformers
Ziyang Cai, Nayoung Lee, Avi Schwarzschild et al.
NeurIPS 2025, spotlight, arXiv:2506.09251
7 citations
Generalizing Reasoning Problems to Longer Lengths
Changnan Xiao, Bing Liu
ICLR 2025, poster
HoPE: Hybrid of Position Embedding for Long Context Vision-Language Models
Haoran Li, Yingjie Qin, Baoyuan Ou et al.
NeurIPS 2025, oral, arXiv:2505.20444
2 citations
Length Generalization via Auxiliary Tasks
Pranjal Awasthi, Anupam Gupta, Ravi Kumar
NeurIPS 2025, poster
Looped Transformers for Length Generalization
Ying Fan, Yilun Du, Kannan Ramchandran et al.
ICLR 2025, poster, arXiv:2409.15647
33 citations
Mamba Modulation: On the Length Generalization of Mamba Models
Peng Lu, Jerry Huang, Qiuhao Zeng et al.
NeurIPS 2025, poster
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone et al.
NeurIPS 2025, spotlight, arXiv:2505.17761
6 citations