Towards Realistic Model Selection for Semi-supervised Learning

ICML 2024

Authors

Muyang Li, Xiaobo Xia, Runze Wu, Fengming Huang, Jun Yu, Bo Han, Tongliang Liu

Topics

semi-supervised learning, model selection, generalization error, margin distribution, spectral complexity, validation-free evaluation

Abstract

Semi-supervised learning (SSL) has shown remarkable success in applications with limited supervision. However, because labels are scarce during training, SSL algorithms are known to suffer from a lack of proper model selection: splitting off a validation set further reduces the limited labeled data, and the resulting validation set may be too small to give a reliable indication of the generalization error. We therefore seek alternatives that do not rely on validation data to probe the generalization performance of SSL models. Specifically, we find that the distinct margin distribution in SSL can be used in conjunction with the model's spectral complexity to provide a non-vacuous indication of the generalization error. Building on this, we propose a novel model selection method specifically tailored for SSL, Spectral-normalized Labeled-margin Minimization (SLAM). We prove that the model selected by SLAM has an upper-bounded difference w.r.t. the best model within the search space. In addition, comprehensive experiments show that SLAM achieves significant improvements over its counterparts, verifying its efficacy from both theoretical and empirical standpoints.
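To make the idea of combining labeled margins with spectral complexity concrete, here is a minimal Python sketch of a generic spectrally normalized margin score. It is not the SLAM criterion itself, which the abstract does not specify. The function names, the use of a low quantile of the margin distribution, and the plain product of layer spectral norms are all assumptions made for illustration.

```python
import numpy as np

def spectral_complexity(weights):
    """Product of the layers' spectral norms (largest singular values).

    Assumption: a simplified stand-in for a model's spectral complexity.
    """
    return float(np.prod([np.linalg.norm(W, ord=2) for W in weights]))

def labeled_margins(logits, labels):
    """Per-example margin: true-class logit minus the best other logit."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    idx = np.arange(len(labels))
    true_scores = logits[idx, labels]
    others = logits.copy()
    others[idx, labels] = -np.inf  # exclude the true class from the max
    return true_scores - others.max(axis=1)

def normalized_margin_score(logits, labels, weights, quantile=0.1):
    """Low quantile of the labeled margins divided by spectral complexity.

    Hypothetical score for illustration only: in this sketch a larger
    value is read as a sign of better expected generalization.
    """
    margins = labeled_margins(logits, labels)
    return float(np.quantile(margins, quantile) / spectral_complexity(weights))

# Toy example: a two-layer linear model and two labeled points.
weights = [np.diag([2.0, 1.0]), np.diag([3.0, 0.5])]
logits = [[2.0, 0.0], [0.0, 1.0]]
labels = [0, 1]
score = normalized_margin_score(logits, labels, weights, quantile=0.0)
```

Under these assumptions, candidate models (for example, different hyperparameter settings) could be ranked by this score computed on the labeled training data alone, with no held-out validation split.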
