Glossary

Message Passing Neural Network

The message passing neural network (MPNN), introduced by Gilmer et al. in 2017 in the context of quantum chemistry, is not a specific architecture but a unifying framework showing that most graph neural networks share the same three-phase structure. By naming and separating the phases, the MPNN paper made it possible to compare GCNs, GATs, GraphSAGE, gated graph networks, and interaction networks as instances of the same template, and the message-passing language has since become the standard way to describe and implement GNNs.

Phase 1: Message. For each edge $(v, u)$ at layer $l$, compute a message:

$$m_{vu}^{(l)} = M_l\!\left(h_v^{(l)},\, h_u^{(l)},\, e_{vu}\right)$$

where $e_{vu}$ is an optional edge feature (bond type, distance, weight) and $M_l$ is a learned function, typically an MLP. Each node aggregates incoming messages with a permutation-invariant operator:

$$m_v^{(l+1)} = \sum_{u \in \mathcal{N}(v)} m_{vu}^{(l)}$$

Sum is the canonical choice because it preserves multiset distinctions (the basis of the Weisfeiler–Lehman analysis of GNN expressivity); mean and max are also common.
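Both steps fit in a few lines of plain PyTorch. The sketch below is illustrative rather than the paper's code: it assumes node states `h` of shape (num_nodes, d), an edge list `edge_index` of shape (2, num_edges) whose rows hold target and source node indices, and edge features `e` of shape (num_edges, d_e).

```python
import torch
import torch.nn as nn

class MessagePhase(nn.Module):
    """Message phase sketch: per-edge messages, then sum aggregation.

    Assumed shapes: h is (num_nodes, d); edge_index is (2, num_edges)
    with rows (target v, source u); e is (num_edges, d_e).
    """
    def __init__(self, d, d_e):
        super().__init__()
        # M_l: an MLP over the concatenated (h_v, h_u, e_vu)
        self.M = nn.Sequential(
            nn.Linear(2 * d + d_e, d), nn.ReLU(), nn.Linear(d, d)
        )

    def forward(self, h, edge_index, e):
        v, u = edge_index                               # target / source node per edge
        m = self.M(torch.cat([h[v], h[u], e], dim=-1))  # m_vu for every edge at once
        # Sum-aggregate incoming messages per target node (permutation invariant)
        return torch.zeros_like(h).index_add_(0, v, m)
```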

Phase 2: Update. Each node updates its hidden state from the aggregated message:

$$h_v^{(l+1)} = U_l\!\left(h_v^{(l)},\, m_v^{(l+1)}\right)$$

The update $U_l$ is usually an MLP, GRU, or LSTM cell; recurrent updates allow long message-passing chains without exploding parameter counts.
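As a sketch of the recurrent variant, `torch.nn.GRUCell` can be reused with the aggregated message as the input and the node state as the hidden state, the pairing used by gated graph networks; the wrapper class here is illustrative.

```python
import torch.nn as nn

class GRUUpdate(nn.Module):
    """U_l as a GRU cell; the same cell is reused at every layer,
    so adding message-passing rounds adds no parameters."""
    def __init__(self, d):
        super().__init__()
        self.gru = nn.GRUCell(input_size=d, hidden_size=d)

    def forward(self, h, agg):
        # Aggregated message = GRU input; node state = GRU hidden state
        return self.gru(agg, h)
```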

Phase 3: Readout. After $L$ rounds, a graph-level prediction is produced from all final node states:

$$\hat y = R\!\left(\{h_v^{(L)} : v \in V\}\right)$$

The readout $R$ must be permutation invariant. Sum, mean, max, attention-weighted pooling (Set2Set), and sort pooling all appear in practice.
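When a batch of graphs is stacked into one node tensor, a sum readout is a single scatter-add; `graph_id` (the graph index of each node) is an assumption of this sketch, and any of the pooling operators above can be substituted.

```python
import torch

def sum_readout(h, graph_id, num_graphs):
    """Sum-pool final node states into one vector per graph.
    graph_id[i] gives the index of the graph that node i belongs to."""
    out = torch.zeros(num_graphs, h.size(1), dtype=h.dtype)
    return out.index_add_(0, graph_id, h)
```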

The framework recovers familiar architectures by choice of $M$, $U$, and $R$. Setting $M(h_v, h_u, e) = \tfrac{1}{\sqrt{\tilde d_v \tilde d_u}} W h_u$ over a graph with self-loops, with the update reduced to a nonlinearity, gives the GCN; setting $M(h_v, h_u, e) = \alpha_{vu} W h_u$ with learned attention coefficients $\alpha_{vu}$ gives the GAT; using a GRU update gives the gated graph neural network.
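To make the correspondence concrete, here is the GCN rule written as one message-passing step; this sketch assumes `edge_index` already contains self-loops and `deg` holds the matching (float) degrees.

```python
import torch

def gcn_layer(h, edge_index, deg, W):
    """GCN as an instance of the MPNN template:
    message   = W h_u / sqrt(d~_v d~_u)  (degree-normalised linear map)
    aggregate = sum over neighbours      (self-loops included in edge_index)
    update    = a nonlinearity only."""
    v, u = edge_index
    norm = (deg[v] * deg[u]).rsqrt().unsqueeze(-1)  # 1/sqrt(d~_v d~_u) per edge
    m = norm * (h[u] @ W)                           # message along each edge
    agg = torch.zeros(h.size(0), W.size(1), dtype=h.dtype).index_add_(0, v, m)
    return torch.relu(agg)                          # "update" is just the nonlinearity
```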

The MPNN paper applied this template to molecular property prediction on the QM9 dataset (roughly 130k small organic molecules) and reached chemical accuracy on targets such as atomisation energy, outperforming methods built on hand-crafted descriptors. It established the benchmark that subsequent GNN papers competed against and demonstrated that learned representations on molecular graphs can outperform decades of cheminformatics feature engineering. Today, MPNNs underlie most production AI systems in chemistry, materials science, and drug discovery, and "message passing" is the standard verb for what GNNs do.

Related terms: Graph Neural Network, Graph Convolutional Network, Graph Attention Network, Convolution
