Hanlin Zhang

Papers: 4

Total Citations: 37

Papers (4)

How Does Critical Batch Size Scale in Pre-training?

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

Eliminating Position Bias of Language Models: A Mechanistic Approach

EvoLM: In Search of Lost Language Model Training Dynamics

NeurIPS 2025, arXiv