Difan Zou

Papers: 16

Total Citations: 88

Papers (16)

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?

Faster Sampling via Stochastic Gradient Proximal Sampler

Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data

What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks

Improving Group Robustness on Spurious Correlation Requires Preciser Group Inference

Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks

Parallelized Autoregressive Visual Generation

Kernel Regression in Structured Non-IID Settings: Theory and Implications for Denoising Score Learning

Stochastic Variance-Reduced Hamilton Monte Carlo Methods

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

An Improved Analysis of Training Over-parameterized Deep Neural Networks

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks

Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction

The Benefits of Implicit Regularization from SGD in Least Squares Problems

Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation Regime

The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift