Laplace Sample Information: Data Informativeness Through a Bayesian Lens

0citations

Citations

#2011

in ICLR 2025

of 3827 papers

Authors

Data Points

Authors

Johannes Kaiser Kristian Schwethelm Daniel Rueckert Georgios Kaissis

Abstract

Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose $\text{\textit{Laplace Sample Information}}$ ($\mathsf{LSI}$) measure of sample informativeness grounded in information theory widely applicable across model architectures and learning settings.$\mathsf{LSI}$ leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset.We experimentally show that $\mathsf{LSI}$ is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty.We demonstrate these capabilities of $\mathsf{LSI}$ on image and text data in supervised and unsupervised settings.Moreover, we show that $\mathsf{LSI}$ can be computed efficiently through probes and transfers well to the training of large models.

Citation History

Jan 26, 2026

Jan 27, 2026