"pretraining data analysis" Papers
2 papers found
Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data
Xinyi Wang, Antonis Antoniades, Yanai Elazar et al.
ICLR 2025 poster, arXiv:2407.14985
77 citations
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models
Laura Ruis, Maximilian Mozes, Juhan Bae et al.
ICLR 2025 poster, arXiv:2411.12580
24 citations