2025 "scaling laws" Papers
15 papers found
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
Xinran Gu, Kaifeng Lyu, Jiazheng Li et al.
NeurIPS 2025 | spotlight | arXiv:2505.18091
2 citations
Diffusion Beats Autoregressive in Data-Constrained Settings
Mihir Prabhudesai, Mengning Wu, Amir Zadeh et al.
NeurIPS 2025 | poster | arXiv:2507.15857
24 citations
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
NeurIPS 2025 | poster | arXiv:2504.19983
13 citations
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish, John Kirchenbauer, David Miller et al.
NeurIPS 2025 | poster | arXiv:2502.06857
10 citations
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot, Seok Hoan Choi, Yuxiao Wen
ICLR 2025 | poster | arXiv:2407.05664
6 citations
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang, Depen Morwani, Nikhil Vyas et al.
ICLR 2025 | poster | arXiv:2410.21676
37 citations
Learning in Compact Spaces with Approximately Normalized Transformer
Jörg Franke, Urs Spiegelhalter, Marianna Nezhurina et al.
NeurIPS 2025 | poster | arXiv:2505.22014
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Yuda Song, Hanlin Zhang, Carson Eisenach et al.
ICLR 2025 | poster | arXiv:2412.02674
One Filters All: A Generalist Filter For State Estimation
Shiqi Liu, Wenhan Cao, Chang Liu et al.
NeurIPS 2025 | poster | arXiv:2509.20051
2 citations
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.
NeurIPS 2025 | poster | arXiv:2505.13738
15 citations
Quantifying Elicitation of Latent Capabilities in Language Models
Elizabeth Donoway, Hailey Joren, Arushi Somani et al.
NeurIPS 2025 | poster
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu, Xiaosen Zheng, Niklas Muennighoff et al.
ICLR 2025 | poster | arXiv:2407.01492
99 citations
Scaling and evaluating sparse autoencoders
Leo Gao, Tom Dupre la Tour, Henk Tillman et al.
ICLR 2025 | poster | arXiv:2406.04093
298 citations
Scaling Laws For Scalable Oversight
Joshua Engels, David Baek, Subhash Kantamneni et al.
NeurIPS 2025 | spotlight | arXiv:2504.18530
4 citations
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.
NeurIPS 2025 | spotlight | arXiv:2504.09597
5 citations