Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

22 citations

arXiv:2406.02997

Citations

#497 of 3827 papers in ICLR 2025

Authors

Michael Scholkemper, Xinyi Wu, Ali Jadbabaie, Michael Schaub

Topics

graph neural networks, oversmoothing problem, residual connections, normalization layers, batch normalization, message-passing operator, node representations, graph signal processing

Abstract

Residual connections and normalization layers have become standard design choices for graph neural networks (GNNs), and were proposed as solutions to mitigate the oversmoothing problem in GNNs. However, how exactly these methods help alleviate the oversmoothing problem from a theoretical perspective is not well understood. In this work, we provide a formal and precise characterization of (linearized) GNNs with residual connections and normalization layers. We establish that (a) for residual connections, the incorporation of the initial features at each layer can prevent the signal from becoming too smooth, and determines the subspace of possible node representations; (b) batch normalization prevents a complete collapse of the output embedding space to a one-dimensional subspace through the individual rescaling of each column of the feature matrix. This results in the convergence of node representations to the top-$k$ eigenspace of the message-passing operator; (c) moreover, we show that the centering step of a normalization layer -- which can be understood as a projection -- alters the graph signal in message-passing in such a way that relevant information can become harder to extract. We therefore introduce a novel, principled normalization layer called GraphNormv2 in which the centering step is learned such that it does not distort the original graph signal in an undesirable way. Experimental results confirm the effectiveness of our method.
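Claim (a) can be illustrated with a small numerical sketch. The code below is not the authors' implementation. It assumes a toy ring graph, identity weight matrices, and a hypothetical mixing parameter `alpha` for the initial residual connection, none of which come from the abstract. It compares plain linear message passing with a variant that re-injects the initial features at every layer, and measures collapse as the ratio of the second-largest to the largest singular value of the feature matrix.

```python
import numpy as np


def ring_adjacency(n):
    """Symmetric normalized adjacency (with self-loops) of a ring graph on n nodes."""
    A = np.eye(n)
    for i in range(n):
        A[i, (i + 1) % n] = 1.0
        A[i, (i - 1) % n] = 1.0
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]


def propagate(A, X0, num_layers, alpha=0.0):
    """Iterate X <- (1 - alpha) * A @ X + alpha * X0 for num_layers steps.

    alpha = 0 is plain message passing; alpha > 0 adds an initial residual
    connection. Weight matrices are assumed to be the identity (a simplification).
    """
    X = X0.copy()
    for _ in range(num_layers):
        X = (1.0 - alpha) * (A @ X) + alpha * X0
    return X


def collapse_ratio(X):
    """Second-largest over largest singular value; near 0 means rank-one collapse."""
    s = np.linalg.svd(X, compute_uv=False)
    return s[1] / s[0]


rng = np.random.default_rng(0)
n, d, alpha = 10, 4, 0.1                 # toy sizes and alpha are assumptions
A = ring_adjacency(n)
X0 = rng.standard_normal((n, d))         # random initial node features

X_plain = propagate(A, X0, num_layers=200)             # no residual connection
X_res = propagate(A, X0, num_layers=200, alpha=alpha)  # initial residual connection

# Fixed point of the residual recursion: alpha * (I - (1 - alpha) * A)^{-1} X0
X_limit = alpha * np.linalg.solve(np.eye(n) - (1.0 - alpha) * A, X0)

print("collapse ratio, plain:   ", collapse_ratio(X_plain))
print("collapse ratio, residual:", collapse_ratio(X_res))
```

In this simplified setting, the plain iteration drives all columns toward the dominant eigenvector of the message-passing operator. The residual variant instead converges to a fixed point determined by the initial features, which is consistent with the statement that the initial features determine the subspace of possible node representations.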

Citation History

[Chart: citation count over time, with data points on Jan 26, 2026, Jan 27, 2026, and Feb 1, 2026]