Kaifeng Lyu
11
Papers
344
Total Citations
1
Affiliations
Affiliations
Tsinghua University
Papers (11)
Safety Alignment Should be Made More Than Just a Few Tokens Deep
ICLR 2025
277
citations
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
ICLR 2025
48
citations
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
ICLR 2025
13
citations
A Quadratic Synchronization Rule for Distributed Deep Learning
ICLR 2024
4
citations
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
NeurIPS 2025
2
citations
Adam Reduces a Unique Form of Sharpness: Theoretical Insights Near the Minimizer Manifold
NeurIPS 2025
0
citations
Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate
NeurIPS 2020
0
citations
Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias
NeurIPS 2021
0
citations
On the SDEs and Scaling Rules for Adaptive Gradient Algorithms
NeurIPS 2022
0
citations
New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound
NeurIPS 2022
0
citations
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
NeurIPS 2022
0
citations