Do Topological Characteristics Help in Knowledge Distillation?

#10 in ICML 2024 (of 2635 papers) · 5 authors

Abstract

Knowledge distillation (KD) aims to transfer knowledge from larger (teacher) to smaller (student) networks. Previous studies treat point-to-point or pairwise relationships in embedding features as knowledge, and struggle to efficiently transfer the relational structure of complex latent spaces. To tackle this issue, we propose a novel KD method called TopKD, which considers the global topology of the latent spaces. We define global topology knowledge using the persistence diagram (PD), which captures comprehensive geometric structures such as the shape of the distribution, multi-scale structure, and connectivity, and the topology distillation loss for teaching this knowledge. To make the PD transferable within a reasonable computation time, we employ approximated persistence images of PDs. Through experiments, we support the benefits of using global topology as knowledge and demonstrate the potential of TopKD. Code is available at https://github.com/jekim5418/TopKD.
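To make the pipeline concrete, below is a minimal, hypothetical sketch of the persistence-image idea the abstract describes: rasterizing a persistence diagram onto a fixed grid and comparing teacher and student images with a simple loss. It assumes the (birth, death) pairs have already been extracted from the latent features (e.g., via a Vietoris-Rips filtration); the grid resolution, kernel width, linear weighting, and MSE comparison are illustrative choices, not the paper's exact approximation.

```python
import torch

def persistence_image(pairs: torch.Tensor, resolution: int = 16,
                      sigma: float = 0.1, grid_max: float = 1.0) -> torch.Tensor:
    """Rasterize a persistence diagram into a persistence image.

    pairs: (N, 2) tensor of (birth, death) points with death > birth.
    Returns a (resolution, resolution) image over the
    (birth, persistence) plane in [0, grid_max]^2.
    """
    birth = pairs[:, 0]
    pers = pairs[:, 1] - pairs[:, 0]          # lifetime of each feature
    # Linear weighting: long-lived (more significant) features count more.
    weight = pers / (pers.max() + 1e-8)

    axis = torch.linspace(0.0, grid_max, resolution)
    gx, gy = torch.meshgrid(axis, axis, indexing="xy")  # (R, R) grids

    # Sum of weighted Gaussian bumps centered at each (birth, persistence) point.
    dx = gx[None] - birth[:, None, None]      # (N, R, R)
    dy = gy[None] - pers[:, None, None]
    bumps = torch.exp(-(dx ** 2 + dy ** 2) / (2 * sigma ** 2))
    return (weight[:, None, None] * bumps).sum(dim=0)

def topology_distillation_loss(pd_teacher: torch.Tensor,
                               pd_student: torch.Tensor) -> torch.Tensor:
    """MSE between teacher and student persistence images (illustrative)."""
    pi_t = persistence_image(pd_teacher)
    pi_s = persistence_image(pd_student)
    return torch.nn.functional.mse_loss(pi_s, pi_t)

# Toy usage with two small hypothetical persistence diagrams.
pd_t = torch.tensor([[0.05, 0.80], [0.10, 0.35], [0.20, 0.25]])
pd_s = torch.tensor([[0.05, 0.60], [0.15, 0.30]])
print(topology_distillation_loss(pd_t, pd_s))
```

Working on a fixed-size image grid rather than the variable-size diagram is what makes the teacher-student comparison cheap and differentiable, which is the role the approximated persistence images play in the abstract's description.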
