2024 "next token prediction" Papers
3 papers found
Arrows of Time for Large Language Models
Vassilis Papadopoulos, Jérémie Wenger, Clement Hongler
ICML 2024 poster
How do Transformers Perform In-Context Autoregressive Learning?
Michael Sander, Raja Giryes, Taiji Suzuki et al.
ICML 2024 poster
On the Origins of Linear Representations in Large Language Models
Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar et al.
ICML 2024 poster