Matteo Pagliardini
Papers: 4
Total Citations: 35
Papers (4)
The AdEMAMix Optimizer: Better, Faster, Older
ICLR 2025, 23 citations
CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference
ICLR 2025, 12 citations
DOGE: Domain Reweighting with Generalization Estimation
ICML 2024, 0 citations
Fast Attention Over Long Sequences With Dynamic Sparse Flash Attention
NeurIPS 2023, 0 citations