Yuanzhi Li
38 Papers
995 Total Citations

Papers (38)
Convergence Analysis of Two-layer Neural Networks with ReLU Activation
NeurIPS 2017 (arXiv)
674 citations
LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain
NeurIPS 2016 (arXiv)
133 citations
Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process
ICLR 2025
98 citations
Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls
NeurIPS 2017 (arXiv)
54 citations
Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
NeurIPS 2016 (arXiv)
30 citations
Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods
NeurIPS 2016 (arXiv)
6 citations
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
ICML 2024
0 citations
Algorithms and matching lower bounds for approximately-convex optimization
NeurIPS 2016
0 citations
Revisiting Disentanglement in Downstream Tasks: A Study on Its Necessity for Abstract Visual Reasoning
AAAI 2024 (arXiv)
0 citations
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
NeurIPS 2019
0 citations
When Is Generalizable Reinforcement Learning Tractable?
NeurIPS 2021
0 citations
Local Signal Adaptivity: Provable Feature Learning in Neural Networks Beyond Kernels
NeurIPS 2021
0 citations
Towards Understanding the Mixture-of-Experts Layer in Deep Learning
NeurIPS 2022
0 citations
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
NeurIPS 2022
0 citations
Learning (Very) Simple Generative Models Is Hard
NeurIPS 2022
0 citations
Vision Transformers provably learn spatial structure
NeurIPS 2022
0 citations
Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals
NeurIPS 2023
0 citations
How Does Adaptive Optimization Impact Local Neural Network Geometry?
NeurIPS 2023
0 citations
SPRING: Studying Papers and Reasoning to play Games
NeurIPS 2023
0 citations
The probability flow ODE is provably fast
NeurIPS 2023
0 citations
Recovery guarantee of weighted low-rank approximation via alternating minimization
ICML 2016
0 citations
Doubly Accelerated Methods for Faster CCA and Generalized Eigendecomposition
ICML 2017
0 citations
Faster Principal Component Regression and Stable Matrix Chebyshev Approximation
ICML 2017
0 citations
Follow the Compressed Leader: Faster Online Learning of Eigenvectors and Faster MMWU
ICML 2017
0 citations
Near-Optimal Design of Experiments via Regret Minimization
ICML 2017
0 citations
Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
ICML 2017
0 citations
Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits
ICML 2018
0 citations
An Alternative View: When Does SGD Escape Local Minima?
ICML 2018
0 citations
The Well-Tempered Lasso
ICML 2018
0 citations
A Convergence Theory for Deep Learning via Over-Parameterization
ICML 2019
0 citations
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
NeurIPS 2018
0 citations
Online Improper Learning with an Approximation Oracle
NeurIPS 2018
0 citations
NEON2: Finding Local Minima via First-Order Oracles
NeurIPS 2018
0 citations
On the Convergence Rate of Training Recurrent Neural Networks
NeurIPS 2019
0 citations
Complexity of Highly Parallel Non-Smooth Convex Optimization
NeurIPS 2019
0 citations
What Can ResNet Learn Efficiently, Going Beyond Kernels?
NeurIPS 2019
0 citations
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
NeurIPS 2019
0 citations
Can SGD Learn Recurrent Neural Networks with Provable Generalization?
NeurIPS 2019
0 citations