2025 Poster "length generalization" Papers
9 papers found
A Formal Framework for Understanding Length Generalization in Transformers
Xinting Huang, Andy Yang, Satwik Bhattamishra et al.
ICLR 2025 poster, arXiv:2410.02140, 25 citations
Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Mayank Jobanputra, Yana Veitsman, Yash Sarrof et al.
NeurIPS 2025 poster, arXiv:2505.21785, 3 citations
Generalizing Reasoning Problems to Longer Lengths
Changnan Xiao, Bing Liu
ICLR 2025 poster
Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
Xiang Hu, Jiaqi Leng, Jun Zhao et al.
NeurIPS 2025 poster, arXiv:2504.16795, 2 citations
Language Models Need Inductive Biases to Count Inductively
Yingshan Chang, Yonatan Bisk
ICLR 2025 poster, arXiv:2405.20131, 19 citations
Length Generalization via Auxiliary Tasks
Pranjal Awasthi, Anupam Gupta, Ravi Kumar
NeurIPS 2025 poster
Looped Transformers for Length Generalization
Ying Fan, Yilun Du, Kannan Ramchandran et al.
ICLR 2025 poster, arXiv:2409.15647, 33 citations
Mamba Modulation: On the Length Generalization of Mamba Models
Peng Lu, Jerry Huang, Qiuhao Zeng et al.
NeurIPS 2025 poster
Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization
Yu Huang, Zixin Wen, Aarti Singh et al.
NeurIPS 2025 poster, arXiv:2511.07378, 3 citations