"scaling laws" Papers

19 篇论文

Data Mixing Can Induce Phase Transitions in Knowledge Acquisition

Xinran Gu, Kaifeng Lyu, Jiazheng Li et al.

NeurIPS 2025spotlightarXiv:2505.18091
2
citations

Gemstones: A Model Suite for Multi-Faceted Scaling Laws

Sean McLeish, John Kirchenbauer, David Miller et al.

NeurIPS 2025posterarXiv:2502.06857
10
citations

How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning

Arthur Jacot, Seok Hoan Choi, Yuxiao Wen

ICLR 2025posterarXiv:2407.05664
6
citations

How Does Critical Batch Size Scale in Pre-training?

Hanlin Zhang, Depen Morwani, Nikhil Vyas et al.

ICLR 2025posterarXiv:2410.21676
37
citations

Power Lines: Scaling laws for weight decay and batch size in LLM pre-training

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

NeurIPS 2025posterarXiv:2505.13738
15
citations

Quantifying Elicitation of Latent Capabilities in Language Models

Elizabeth Donoway, Hailey Joren, Arushi Somani et al.

NeurIPS 2025poster

Scaling and evaluating sparse autoencoders

Leo Gao, Tom Dupre la Tour, Henk Tillman et al.

ICLR 2025posterarXiv:2406.04093
298
citations

Scaling Laws For Scalable Oversight

Joshua Engels, David Baek, Subhash Kantamneni et al.

NeurIPS 2025spotlightarXiv:2504.18530
4
citations

Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies

Brian Bartoldson, James Diffenderfer, Konstantinos Parasyris et al.

ICML 2024poster

A Tale of Tails: Model Collapse as a Change of Scaling Laws

Elvis Dohmatob, Yunzhen Feng, Pu Yang et al.

ICML 2024poster

Compute Better Spent: Replacing Dense Layers with Structured Matrices

Shikai Qiu, Andres Potapczynski, Marc Finzi et al.

ICML 2024poster

Mixtures of Experts Unlock Parameter Scaling for Deep RL

Johan Obando Ceron, Ghada Sokar, Timon Willi et al.

ICML 2024spotlight

Navigating Scaling Laws: Compute Optimality in Adaptive Model Training

Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag et al.

ICML 2024spotlight

NeRF-XL: NeRF at Any Scale with Multi-GPU

Ruilong Li, Sanja Fidler, Angjoo Kanazawa et al.

ECCV 2024poster

Scaling Laws for Fine-Grained Mixture of Experts

Jan Ludziejewski, Jakub Krajewski, Kamil Adamczewski et al.

ICML 2024poster

Scaling Laws for the Value of Individual Data Points in Machine Learning

Ian Covert, Wenlong Ji, Tatsunori Hashimoto et al.

ICML 2024poster

Selecting Large Language Model to Fine-tune via Rectified Scaling Law

Haowei Lin, Baizhou Huang, Haotian Ye et al.

ICML 2024poster

Towards Understanding Inductive Bias in Transformers: A View From Infinity

Itay Lavie, Guy Gur-Ari, Zohar Ringel

ICML 2024poster

Wukong: Towards a Scaling Law for Large-Scale Recommendation

Buyun Zhang, Liang Luo, Yuxin Chen et al.

ICML 2024poster