2025 Poster Papers Matching "scaling laws"

25 papers found

Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws

Yiding Jiang, Allan Zhou, Zhili Feng et al.

ICLR 2025 · poster · arXiv:2410.11820
35 citations

Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework

Thomson Yen, Andrew Siah, Haozhe Chen et al.

NeurIPS 2025 · poster · arXiv:2503.21023
1 citation

Diffusion Beats Autoregressive in Data-Constrained Settings

Mihir Prabhudesai, Mengning Wu, Amir Zadeh et al.

NeurIPS 2025 · poster · arXiv:2507.15857
24 citations

Emergence and scaling laws in SGD learning of shallow neural networks

Yunwei Ren, Eshaan Nichani, Denny Wu et al.

NeurIPS 2025 · poster · arXiv:2504.19983
13 citations

Gemstones: A Model Suite for Multi-Faceted Scaling Laws

Sean McLeish, John Kirchenbauer, David Miller et al.

NeurIPS 2025 · poster · arXiv:2502.06857
10 citations

High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws

Muhammed Ildiz, Halil Gozeten, Ege Taga et al.

ICLR 2025 · poster · arXiv:2410.18837
13 citations

How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning

Arthur Jacot, Seok Hoan Choi, Yuxiao Wen

ICLR 2025 · poster · arXiv:2407.05664
6 citations

How Does Critical Batch Size Scale in Pre-training?

Hanlin Zhang, Depen Morwani, Nikhil Vyas et al.

ICLR 2025 · poster · arXiv:2410.21676
37 citations

Inverse Scaling: When Bigger Isn't Better

Joe Cavanagh, Andrew Gritsevskiy, Najoung Kim et al.

ICLR 2025 · poster · arXiv:2306.09479
183 citations

Language models scale reliably with over-training and on downstream tasks

Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar et al.

ICLR 2025 · poster · arXiv:2403.08540
77 citations

Learning in Compact Spaces with Approximately Normalized Transformer

Jörg Franke, Urs Spiegelhalter, Marianna Nezhurina et al.

NeurIPS 2025 · poster · arXiv:2505.22014

Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models

Yuda Song, Hanlin Zhang, Carson Eisenach et al.

ICLR 2025 · poster · arXiv:2412.02674

(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning

Margaret Li, Sneha Kudugunta, Luke Zettlemoyer

ICLR 2025 · poster
9 citations

One Filters All: A Generalist Filter For State Estimation

Shiqi Liu, Wenhan Cao, Chang Liu et al.

NeurIPS 2025 · poster · arXiv:2509.20051
2 citations

Power Lines: Scaling laws for weight decay and batch size in LLM pre-training

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

NeurIPS 2025 · poster · arXiv:2505.13738
15 citations

Quantifying Elicitation of Latent Capabilities in Language Models

Elizabeth Donoway, Hailey Joren, Arushi Somani et al.

NeurIPS 2025 · poster

Reasoning with Latent Thoughts: On the Power of Looped Transformers

Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li et al.

ICLR 2025 · poster · arXiv:2502.17416
65 citations

RegMix: Data Mixture as Regression for Language Model Pre-training

Qian Liu, Xiaosen Zheng, Niklas Muennighoff et al.

ICLR 2025 · poster · arXiv:2407.01492
99 citations

Scaling and evaluating sparse autoencoders

Leo Gao, Tom Dupre la Tour, Henk Tillman et al.

ICLR 2025 · poster · arXiv:2406.04093
298 citations

Scaling Wearable Foundation Models

Girish Narayanswamy, Xin Liu, Kumar Ayush et al.

ICLR 2025 · poster · arXiv:2410.13638
33 citations

TabDPT: Scaling Tabular Foundation Models on Real Data

Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh et al.

NeurIPS 2025 · poster · arXiv:2410.18164
15 citations

The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws

Tian Jin, Ahmed Imtiaz Humayun, Utku Evci et al.

ICLR 2025 · poster · arXiv:2501.12486
1 citation

Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts

Xiaoming Shi, Shiyu Wang, Yuqi Nie et al.

ICLR 2025 · poster · arXiv:2409.16040
178 citations

U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models

Tung-Yu Wu, Melody Lo

ICLR 2025 · poster · arXiv:2410.01692
5 citations

Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators

Wentao Zhang, Junliang Guo, Tianyu He et al.

ICLR 2025 · poster · arXiv:2407.07356
7 citations