Implicit meta-learning may lead language models to trust more reliable sources

0citations

PDF

Citations

#10

in ICML 2024

of 2635 papers

Authors

Data Points

Authors

Dmitrii Krasheninnikov Egor Krasheninnikov Bruno Mlodozeniec Tegan Maharaj David Krueger

Topics

implicit meta-learning large language models document usefulness indicators fine-tuning datasets knowledge storage probing model controllability vision task adaptation

Abstract

We demonstrate that large language models (LLMs) may learn indicators of document usefulness and modulate their updates accordingly. We introduce random strings ("tags") as indicators of usefulness in a synthetic fine-tuning dataset. Fine-tuning on this dataset leads toimplicit meta-learning (IML): in further fine-tuning, the model updates to make more use of text that is tagged as useful. We perform a thorough empirical investigation of this phenomenon, finding (among other things) that (i) it occurs in both pretrained LLMs and those trained from scratch, as well as on a vision task, and (ii) larger models and smaller batch sizes tend to give more IML. We also use probing to examine how IML changes the way models store knowledge in their parameters. Finally, we reflect on what our results might imply about the capabilities, risks, and controllability of future AI systems.

Citation History

Jan 28, 2026