Kaiyue Wen
Papers: 3
Total Citations: 82
Papers (3)
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
ICLR 2025
48 citations
Overtrained Language Models Are Harder to Fine-Tune
ICML 2025
28 citations
Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?
ICML 2025
6 citations