[ 2024.03.30 / 9 min read ]
AI Fundamentals

Forward vs. Backward Propagation: How AI Learns

The Two-Stroke Engine of AI

Training a neural network is a continuous cycle of making guesses and being told how wrong those guesses were. This cycle is driven by two critical processes: Forward Propagation and Backward Propagation.

1. Forward Propagation: Making the Guess

During forward propagation, data flows from the input layer through the hidden layers to the output layer. Each neuron computes a weighted sum of its inputs, adds a bias, and passes the result through an activation function.

  • Input: Raw data (pixels, text tokens, etc.).
  • Transformation: Weights and Biases are applied.
  • Output: A prediction (e.g., "This image is a car with 92% confidence").

At the end of this stage, we calculate the Loss: a single number measuring how far the prediction is from the ground-truth label (for example, cross-entropy for classification or mean squared error for regression).
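
To make the mechanics concrete, here is a minimal sketch of a single-layer forward pass in NumPy. The layer sizes, random weights, and the softmax/cross-entropy pairing are illustrative choices for this sketch, not a prescription.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Illustrative shapes: 4 input features, 3 output classes.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 4))  # weights
b = np.zeros(3)                          # biases

x = np.array([0.5, -1.2, 3.3, 0.7])      # one input example
y_true = np.array([0.0, 1.0, 0.0])       # one-hot ground-truth label

# Forward propagation: weighted sum + bias, then activation.
z = W @ x + b
y_pred = softmax(z)

# Loss: cross-entropy between the prediction and the ground truth.
loss = -np.sum(y_true * np.log(y_pred))
print("prediction:", y_pred, "loss:", loss)
```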

2. Backward Propagation: Learning from Mistakes

This is where the actual "learning" happens. Working backward from the output layer toward the input, backpropagation uses the Chain Rule from calculus to calculate how much each weight in the network contributed to the final error.

THE ANALOGY: Imagine you're shooting an arrow. Forward prop is the shot. Backprop is seeing where the arrow landed and working out which parts of your posture and grip were at fault; the adjustment itself comes next, in gradient descent.
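
Continuing the NumPy sketch from the forward-pass section, the chain rule can be applied by hand for this one-layer case. The fact that the softmax-plus-cross-entropy gradient collapses to y_pred - y_true is standard calculus; the variable names carry over from the earlier sketch.

```python
# Backward propagation for the one-layer sketch above.
# For softmax followed by cross-entropy, the chain rule collapses
# the gradient of the loss w.r.t. the pre-activation z to:
dz = y_pred - y_true

# Chain rule again: since z = W @ x + b,
#   dL/dW[i, j] = dz[i] * x[j]   and   dL/db = dz
dW = np.outer(dz, x)
db = dz
# Each entry of dW measures how much that weight contributed to the error.
```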

3. Gradient Descent

Once we know the error contribution (the gradient) for every weight, we use an optimizer (like SGD or Adam) to nudge each weight a small step in the direction opposite its gradient, scaled by a learning rate. This is called Gradient Descent.
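
A single vanilla-SGD update on the gradients from the sketch above looks like this; the learning rate is an illustrative hyperparameter, and optimizers like Adam layer momentum and per-weight scaling on top of the same idea.

```python
learning_rate = 0.1  # illustrative value; tuning it matters in practice

# Step each parameter opposite its gradient to reduce the loss.
W -= learning_rate * dW
b -= learning_rate * db
```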

The Loop

By repeating this loop many thousands of times, across multiple epochs (full passes over the training data), the network gradually minimizes its loss and becomes more accurate.
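
Stitched together, the whole cycle for the running one-example sketch is only a few lines; a real training loop would iterate over mini-batches of a dataset rather than a single hard-coded example.

```python
for step in range(100):                      # illustrative step count
    # Forward propagation: make the guess.
    z = W @ x + b
    y_pred = softmax(z)
    loss = -np.sum(y_true * np.log(y_pred))

    # Backward propagation: assign blame via the chain rule.
    dz = y_pred - y_true
    dW, db = np.outer(dz, x), dz

    # Gradient descent: nudge the weights.
    W -= learning_rate * dW
    b -= learning_rate * db
```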

Summary Comparison

Feature    | Forward Prop            | Backward Prop
Direction  | Input → Output          | Output → Input
Goal       | Prediction              | Correction
Key math   | Matrix multiplications  | Partial derivatives