LoSplit: Loss-Guided Dynamic Split for Training-Time Defense Against Graph Backdoor Attacks

0citations

citations

#3347

in NEURIPS 2025

of 5858 papers

Top Authors

Data Points

Top Authors

Di Jin Yuxiang Zhang Bingdao Feng Xiaobao Wang Dongxiao He Zhen Wang

Abstract

Graph Neural Networks (GNNs) are vulnerable to backdoor attacks. Existing defenses primarily rely on detecting structural anomalies, distributional outliers, or perturbation-induced prediction instability, which struggle to handle the more subtle, feature-based attacks that do not introduce obvious topological changes. Our empirical analysis reveals that both structure-based and feature-based attacks not only cause early loss convergence of target nodes but also induce a class-coherent loss drift, where this early convergence gradually spreads to nearby clean nodes, leading to significant distribution overlap. To address this issue, we propose LoSplit, the first training-time defense framework in graph that leverages this early-stage loss drift to accurately split target nodes. Our method dynamically selects epochs with maximal loss divergence, clusters target nodes via Gaussian Mixture Models (GMM), and applies a Decoupling-Forgetting strategy to break the association between target nodes and malicious label. Extensive experiments on multiple real-world datasets demonstrate the effectiveness of our approach, significantly reducing attack success rates while maintaining high clean accuracy across diverse backdoor attack strategies.

Citation History

Jan 25, 2026

Jan 26, 2026

Jan 28, 2026