"scaling laws" Papers
23 papers found
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
Xinran Gu, Kaifeng Lyu, Jiazheng Li et al.
Emergence and scaling laws in SGD learning of shallow neural networks
Yunwei Ren, Eshaan Nichani, Denny Wu et al.
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Sean McLeish, John Kirchenbauer, David Miller et al.
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot, Seok Hoan Choi, Yuxiao Wen
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang, Depen Morwani, Nikhil Vyas et al.
Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models
Yuda Song, Hanlin Zhang, Carson Eisenach et al.
One Filters All: A Generalist Filter For State Estimation
Shiqi Liu, Wenhan Cao, Chang Liu et al.
Power Lines: Scaling laws for weight decay and batch size in LLM pre-training
Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.
Quantifying Elicitation of Latent Capabilities in Language Models
Elizabeth Donoway, Hailey Joren, Arushi Somani et al.
Scaling and evaluating sparse autoencoders
Leo Gao, Tom Dupré la Tour, Henk Tillman et al.
Scaling Laws For Scalable Oversight
Joshua Engels, David Baek, Subhash Kantamneni et al.
Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws
Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.
Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies
Brian Bartoldson, James Diffenderfer, Konstantinos Parasyris et al.
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Elvis Dohmatob, Yunzhen Feng, Pu Yang et al.
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Shikai Qiu, Andres Potapczynski, Marc Finzi et al.
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando-Ceron, Ghada Sokar, Timon Willi et al.
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag et al.
NeRF-XL: NeRF at Any Scale with Multi-GPU
Ruilong Li, Sanja Fidler, Angjoo Kanazawa et al.
Scaling Laws for Fine-Grained Mixture of Experts
Jan Ludziejewski, Jakub Krajewski, Kamil Adamczewski et al.
Scaling Laws for the Value of Individual Data Points in Machine Learning
Ian Covert, Wenlong Ji, Tatsunori Hashimoto et al.
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Haowei Lin, Baizhou Huang, Haotian Ye et al.
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie, Guy Gur-Ari, Zohar Ringel
Wukong: Towards a Scaling Law for Large-Scale Recommendation
Buyun Zhang, Liang Luo, Yuxin Chen et al.