NEURIPS Poster "length generalization" Papers
6 papers found
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs
Yi Hu, Shijia Kang, Haotong Yang et al.
NEURIPS 2025 poster, arXiv:2502.11525, 4 citations
Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Mayank Jobanputra, Yana Veitsman, Yash Sarrof et al.
NEURIPS 2025 poster, arXiv:2505.21785, 3 citations
Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
Xiang Hu, Jiaqi Leng, Jun Zhao et al.
NEURIPS 2025 poster, arXiv:2504.16795, 2 citations
Length Generalization via Auxiliary Tasks
Pranjal Awasthi, Anupam Gupta, Ravi Kumar
NEURIPS 2025 poster
Mamba Modulation: On the Length Generalization of Mamba Models
Peng Lu, Jerry Huang, Qiuhao Zeng et al.
NEURIPS 2025 poster
Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization
Yu Huang, Zixin Wen, Aarti Singh et al.
NEURIPS 2025 poster, arXiv:2511.07378, 3 citations