Tengyu Ma
39 Papers · 1,288 Total Citations

Papers (39)
Matrix Completion has No Spurious Local Minimum · NeurIPS 2016 · arXiv · 617 citations
Large Language Models as Tool Makers · ICLR 2024 · 262 citations
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training · ICLR 2024 · 222 citations
On the Optimization Landscape of Tensor Decompositions · NeurIPS 2017 · arXiv · 94 citations
Sum-of-Squares Lower Bounds for Sparse PCA · NeurIPS 2015 · arXiv · 74 citations
A Non-generative Framework and Convex Relaxations for Unsupervised Learning · NeurIPS 2016 · arXiv · 19 citations
Robust and On-the-fly Dataset Denoising for Image Classification · ECCV 2020 · 0 citations
Trash to Treasure: Low-Light Object Detection via Decomposition-and-Aggregation · AAAI 2024 · arXiv · 0 citations
Rethinking Reconstruction and Denoising in the Dark: New Perspective, General Architecture and Beyond · CVPR 2025 · 0 citations
Linguistic Calibration of Long-Form Generations · ICML 2024 · 0 citations
Toward Fast, Flexible, and Robust Low-Light Image Enhancement · CVPR 2022 · arXiv · 0 citations
Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss · NeurIPS 2021 · 0 citations
Safe Reinforcement Learning by Imagining the Near Future · NeurIPS 2021 · 0 citations
Why Do Pretrained Language Models Help in Downstream Tasks? An Analysis of Head and Prompt Tuning · NeurIPS 2021 · 0 citations
Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration · NeurIPS 2021 · 0 citations
Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations · NeurIPS 2021 · 0 citations
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature · NeurIPS 2021 · 0 citations
Label Noise SGD Provably Prefers Flat Global Minimizers · NeurIPS 2021 · 0 citations
Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments · NeurIPS 2022 · 0 citations
Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers · NeurIPS 2022 · 0 citations
Beyond Separability: Analyzing the Linear Transferability of Contrastive Representations to Related Subpopulations · NeurIPS 2022 · 0 citations
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization · NeurIPS 2023 · 0 citations
What is the Inductive Bias of Flatness Regularization? A Study of Deep Matrix Factorization Models · NeurIPS 2023 · 0 citations
Data Selection for Language Models via Importance Resampling · NeurIPS 2023 · 0 citations
Beyond NTK with Vanilla Gradient Descent: A Mean-Field Analysis of Neural Networks with Polynomial Width, Samples, and Time · NeurIPS 2023 · 0 citations
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining · NeurIPS 2023 · 0 citations
Online Learning of Eigenvectors · ICML 2015 · 0 citations
Provable Algorithms for Inference in Topic Models · ICML 2016 · 0 citations
Generalization and Equilibrium in Generative Adversarial Nets (GANs) · ICML 2017 · 0 citations
Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation · NeurIPS 2019 · 0 citations
Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss · NeurIPS 2019 · 0 citations
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel · NeurIPS 2019 · 0 citations
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks · NeurIPS 2019 · 0 citations
Verified Uncertainty Calibration · NeurIPS 2019 · 0 citations
Federated Accelerated Stochastic Gradient Descent · NeurIPS 2020 · 0 citations
Model-based Adversarial Meta-Reinforcement Learning · NeurIPS 2020 · 0 citations
MOPO: Model-based Offline Policy Optimization · NeurIPS 2020 · 0 citations
Self-training Avoids Using Spurious Features Under Domain Shift · NeurIPS 2020 · 0 citations
Beyond Lazy Training for Over-parameterized Tensor Decomposition · NeurIPS 2020 · 0 citations