The forward pass produces a value; backpropagation sends gradients back through the same graph.
From Chapter 9: Neural Networks
Glossary: backpropagation, chain rule, sigmoid, relu, gradient
People: Paul Werbos, Geoffrey Hinton
Transcript
A neural network is built from neurons. Each neuron performs two operations: a weighted sum, then a nonlinear activation.
Forward pass. Each input is multiplied by its weight, the products are summed, the bias is added, giving z. The activation function, here a sigmoid, converts z into the output a.
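The forward pass above can be sketched for a single neuron. The input, weight, and bias values here are illustrative, not taken from the source.

```python
import math

def forward(inputs, weights, bias):
    # Weighted sum of inputs plus bias gives z.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Sigmoid activation converts z into the output a.
    a = 1.0 / (1.0 + math.exp(-z))
    return z, a

# Example values (illustrative):
z, a = forward([0.5, -1.0], [0.8, 0.2], 0.1)
```

Because the sigmoid squashes any z into (0, 1), the output a is always a valid activation regardless of how large the weighted sum grows.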
The loss compares the output to the target. The goal is to drive this loss to zero by adjusting the weights and the bias.
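The source does not name a specific loss; squared error is a common choice and is assumed here as a sketch.

```python
# Squared-error loss (an assumption; the source only says the loss
# compares the output to the target).
def loss(a, target):
    return (a - target) ** 2
```

When the output matches the target exactly, this loss is zero, which is the state the weight and bias adjustments aim for.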
Backward pass. The chain rule sends the error back through the same graph, computing how much each parameter should change.
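The chain-rule step can be made concrete for the single sigmoid neuron, assuming a squared-error loss (the loss is an assumption; the activation is the sigmoid from the forward pass).

```python
import math

def backward(inputs, weights, bias, target):
    # Recompute the forward pass.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    a = 1.0 / (1.0 + math.exp(-z))
    # Chain rule, factor by factor:
    dL_da = 2.0 * (a - target)            # derivative of (a - target)^2
    da_dz = a * (1.0 - a)                 # sigmoid derivative
    dL_dz = dL_da * da_dz                 # error sent back through the activation
    grad_w = [dL_dz * x for x in inputs]  # dz/dw_i = x_i
    grad_b = dL_dz                        # dz/db = 1
    return grad_w, grad_b

# Example values (illustrative):
grad_w, grad_b = backward([0.5, -1.0], [0.8, 0.2], 0.1, target=1.0)
```

Each gradient says how much the loss would change if that parameter were nudged upward, which is exactly what the red/blue edge coloring visualizes.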
Red edges show parameters where increasing the value would raise the loss. Blue edges show the opposite. The thicker the edge, the larger the gradient.
When a weight changes, the forward output shifts and every gradient downstream updates with it.
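One gradient-descent step shows this coupling: nudging the parameters against their gradients shifts the forward output and shrinks the loss. The learning rate of 0.5 is an illustrative choice, and the sigmoid-plus-squared-error setup is the same assumption as above.

```python
import math

def forward_loss(inputs, weights, bias, target):
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    a = 1.0 / (1.0 + math.exp(-z))
    return a, (a - target) ** 2  # squared-error loss (assumption)

inputs, weights, bias, target = [0.5, -1.0], [0.8, 0.2], 0.1, 1.0
a, loss_before = forward_loss(inputs, weights, bias, target)

# Gradient of the loss with respect to z, via the chain rule.
dL_dz = 2.0 * (a - target) * a * (1.0 - a)

# Move each parameter a small step against its gradient (lr = 0.5, illustrative).
weights = [w - 0.5 * dL_dz * x for w, x in zip(weights, inputs)]
bias = bias - 0.5 * dL_dz

a, loss_after = forward_loss(inputs, weights, bias, target)
```

After the step, the forward output has moved toward the target, and every downstream gradient would be recomputed from the new output on the next pass.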
That is one neuron. A modern neural network is millions of these wired together, with the same forward and backward rule applied at scale.