Pierre Ablin
7 Papers
61 Total Citations
Papers (7)
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
ICLR 2025, 33 citations
The AdEMAMix Optimizer: Better, Faster, Older
ICLR 2025, 23 citations
Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency
ICML 2025, 5 citations
Optimization without Retraction on the Random Generalized Stiefel Manifold
ICML 2024, 0 citations
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
ICML 2025, 0 citations
Careful with that Scalpel: Improving Gradient Surgery with an EMA
ICML 2024, 0 citations
How Smooth Is Attention?
ICML 2024, 0 citations