Provable Length Generalization in Sequence Prediction via Spectral Filtering

Citations: 1
Rank: #606 of 3340 papers in ICML 2025
Authors: 6
Data Points: 1

Abstract

We consider the problem of length generalization in sequence prediction. We define a new metric of performance in this setting, the Asymmetric-Regret, which measures regret against a benchmark predictor with longer context length than available to the learner. We continue by studying this concept through the lens of the spectral filtering algorithm. We present a gradient-based learning algorithm that provably achieves length generalization for linear dynamical systems. We conclude with proof-of-concept experiments which are consistent with our theory.
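
For readers unfamiliar with spectral filtering, the following is a minimal sketch of how fixed filters are typically built in earlier spectral filtering work on linear dynamical systems: the filters are the top eigenvectors of a Hankel matrix with entries 2 / ((i + j)^3 - (i + j)). This is an assumption about the general technique, not the algorithm or the exact filter construction of this paper. The function names `spectral_filters` and `filter_features`, the zero-padding choice, and all parameter names are hypothetical.

```python
import numpy as np

def spectral_filters(context_len, num_filters):
    """Return the top eigenvalues and eigenvectors of the Hankel matrix
    Z[i, j] = 2 / ((i + j)^3 - (i + j)), with i, j indexed from 1.
    Assumed construction from prior spectral filtering work, not this paper."""
    idx = np.arange(1, context_len + 1, dtype=float)
    s = idx[:, None] + idx[None, :]          # s[i, j] = i + j
    Z = 2.0 / (s**3 - s)                     # symmetric Hankel matrix
    eigvals, eigvecs = np.linalg.eigh(Z)     # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:num_filters]
    return eigvals[top], eigvecs[:, top]     # largest eigenvalue first

def filter_features(inputs, filters):
    """Project the most recent inputs onto each filter.
    inputs: (T, d) array, oldest row first. filters: (L, k) array.
    Returns a (k, d) array; histories shorter than L are zero-padded."""
    L = filters.shape[0]
    window = inputs[-L:][::-1]               # most recent input first
    if window.shape[0] < L:
        pad = np.zeros((L - window.shape[0], inputs.shape[1]))
        window = np.vstack([window, pad])
    return filters.T @ window

if __name__ == "__main__":
    eigvals, filters = spectral_filters(context_len=16, num_filters=4)
    feats = filter_features(np.ones((5, 2)), filters)
    print(filters.shape, feats.shape)
```

In a sketch like this, a learner would fit a linear map on top of the `(k, d)` features with gradient steps. The context length is the number of rows `L` in the filter matrix, so a benchmark with longer context corresponds to filters built with a larger `context_len`.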

Citation History

Jan 27, 2026: 1