Yudong Chen

Papers: 8

Total Citations: 21

Papers (8)

LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently

Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration

Effectiveness of Constant Stepsize in Markovian LSA and Statistical Inference

Stable Offline Value Function Learning with Bisimulation-based Representations

Minimally Modifying a Markov Game to Achieve Any Nash Equilibrium and Value

Medium-Difficulty Samples Constitute Smoothed Decision Boundary for Knowledge Distillation on Pruned Datasets

Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces

The $\varphi$ Curve: The Shape of Generalization through the Lens of Norm-based Capacity Control