Papers by Diyuan Wu
2 papers found
Attention with Trained Embeddings Provably Selects Important Tokens
Diyuan Wu, Aleksandr Shevchenko, Samet Oymak et al.
NeurIPS 2025, poster, arXiv:2505.17282
Neural Collapse Beyond the Unconstrained Features Model: Landscape, Dynamics, and Generalization in the Mean-Field Regime
Diyuan Wu, Marco Mondelli
ICML 2025, spotlight, arXiv:2501.19104