Laplace Sample Information: Data Informativeness Through a Bayesian Lens

ICLR 2025 (#2011 of 3827 papers)
Citations: 0
Authors: 4
Data Points: 3

Abstract

Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose Laplace Sample Information ($\mathsf{LSI}$), a measure of sample informativeness that is grounded in information theory and widely applicable across model architectures and learning settings. $\mathsf{LSI}$ leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that $\mathsf{LSI}$ is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of $\mathsf{LSI}$ on image and text data in supervised and unsupervised settings. Moreover, we show that $\mathsf{LSI}$ can be computed efficiently through probes and transfers well to the training of large models.
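
To make the core idea concrete, the following is a minimal sketch of a KL divergence between two Gaussian approximations of the weight posterior, one fitted with the sample of interest and one without it. The abstract does not state the exact formula, so several things here are assumptions for illustration only: the diagonal covariance, the leave-one-out comparison, the direction of the KL divergence, the function name `diag_gaussian_kl`, and the toy numbers. It should not be read as the authors' implementation.

```python
import math


def diag_gaussian_kl(mean_p, var_p, mean_q, var_q):
    """KL( N(mean_p, diag(var_p)) || N(mean_q, diag(var_q)) ) in nats.

    Assumption: both posteriors are Gaussians with diagonal covariance,
    as a simple Laplace-style approximation might produce.
    """
    if not (len(mean_p) == len(var_p) == len(mean_q) == len(var_q)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        # Per-dimension closed form for the KL between two 1-D Gaussians.
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


# Hypothetical toy posteriors over two weights (values are made up).
full_mean, full_var = [0.5, -1.0], [0.1, 0.2]   # trained on all samples
loo_mean, loo_var = [0.4, -1.0], [0.1, 0.2]     # trained without one sample

# A larger value means removing the sample shifted the posterior more.
info = diag_gaussian_kl(full_mean, full_var, loo_mean, loo_var)
print(info)
```

Under these assumptions, a sample whose removal leaves the posterior unchanged scores zero, while a sample that moves the posterior mean or changes its variance scores higher.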

Citation History

Jan 26, 2026: 0
Jan 26, 2026: 0
Jan 27, 2026: 0