DAY 79

Training Networks 🚲📉🏁

Learn how neural networks train epoch by epoch — just like learning to ride a bicycle — and understand why validation data is the only honest way to know if your model is truly getting better.

⏱ 15 mins
⚡ +50 XP
Training Networks 🚲📉🏁

Day 79: Training Neural Networks — Each Epoch, Less Wrong

Why Should I Care?

Every AI model you have ever used — Spotify recommendations, Google autocomplete, Amazon product rankings — was trained using the exact process you are about to learn. Training is not magic. It is repetition. Each attempt makes the model slightly less wrong than before. Once you understand how training works, you understand how all of modern AI works.

Core Concept

Training a neural network is exactly like learning to ride a bicycle. The first attempt is a disaster — the model has random weights and makes terrible predictions, just like falling off the bike on day one. Each attempt builds on the last. By attempt 10 things are improving but still wobbly. By attempt 50 the model just knows. That muscle memory is the trained weights. You cannot explain it — the model learned it through repetition.

How It Works

Each epoch is one full practice attempt. Inside every epoch there are four steps that repeat automatically. Feed data forward and get a prediction. Measure the loss — how much did it wobble? Backpropagate and adjust the balance. Move to the next epoch with a little less wobble. model.fit() runs this entire loop as many times as you tell it. The key is to always watch two loss lines — training loss and validation loss — to know if your model is actually learning or just memorising.


Each Epoch — 4 steps:

1. Feed data forward   -->  prediction made
2. Measure loss        -->  how much wobble?
3. Backpropagate       -->  adjust balance
4. Next epoch          -->  less wobble than before

Val Loss across epochs:
  Epoch  1  -->  3842  (no idea what to do)
  Epoch 10  -->   892  (improving but unstable)
  Epoch 25  -->   124  (pattern emerging)
  Epoch 50  -->  18.34 (model just knows — ready to predict)

Reading the loss curves:
  Train loss drops + Val loss drops  -->  healthy training
  Train loss drops + Val loss rises  -->  overfitting warning

Real World Connection

Think about how PUBG matchmaking works. The model predicts your skill level and matches you with similar players. Early in training the predictions are all over the place — you get matched with beginners and pros randomly. After thousands of training epochs on millions of game results, the model gets precise. The validation data is like testing matchmaking on brand new players it has never seen before. If it works on them too, the training was real learning — not just memorisation.

Examples


import numpy as np
import tensorflow as tf

X = np.array([[1],[2],[3],[4],[5],[6],[7],[8]], dtype=float)
y = np.array([45,55,62,71,80,85,91,97],        dtype=float)

# Train/test split — separate road for validation
X_train, X_test = X[:6], X[6:]
y_train, y_test = y[:6], y[6:]

# Deeper network — more capacity to learn complex patterns
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(1,)),
    tf.keras.layers.Dense(8,  activation="relu"),
    tf.keras.layers.Dense(1)
])

model.compile(optimizer="adam", loss="mse")
print("--- Training Progress ---")

# Manual epoch tracking — see loss at each stage
for epoch in [1, 10, 25, 50]:
    model.fit(X_train, y_train, epochs=1, verbose=0)
    loss = model.evaluate(X_test, y_test, verbose=0)
    print(f"Epoch {epoch:>2} | Val Loss: {loss:.2f}")

# OUTPUT:
# --- Training Progress ---
# Epoch  1 | Val Loss: 3842.17
# Epoch 10 | Val Loss:  892.43
# Epoch 25 | Val Loss:  124.81
# Epoch 50 | Val Loss:   18.34
#
# Final Prediction (9h): 99.2

Common Mistakes

Two mistakes that beginners make constantly. First, stopping training too early because training loss looks low. Second, training without any validation data — which means training completely blind.


-- MISTAKE 1: Stopping too early --

WRONG:
  epochs=3  -->  training loss still high
  Thinking: "Good enough — let''s predict"

CORRECT:
  Train until validation loss plateaus —
  not just until training loss looks low.
  Two bicycle attempts is not learning —
  undertrained models guess randomly.

NOTE: Always watch val loss, not just train loss.


-- MISTAKE 2: No validation data --

WRONG:
  model.fit(X_train, y_train, epochs=50)
  Result: no validation — training completely blind

CORRECT:
  model.fit(X_train, y_train, epochs=50,
            validation_data=(X_test, y_test))

NOTE: "Without validation_data you can''t
       detect overfitting as it happens."
RULE: Always pass validation_data.
      Always watch val loss, not just train loss.

Mini Challenge

Mini Challenge

Run the training code from the Examples section in Google Colab. After training for 50 epochs, change epochs to 3 and run it again. Compare the final validation loss from both runs side by side. Then add one more line at the end that uses model.predict to predict the output for 10 hours of study. Print the result. You just trained, compared, and used a real neural network from scratch.

Quick Quiz

Q1: What happens during each epoch of neural network training? A1: The model feeds data forward, measures the loss, backpropagates to adjust weights, and repeats — getting a little less wrong each time. Q2: If training loss is dropping but validation loss is rising, what is happening? A2: The model is overfitting — it is memorising the training data instead of learning patterns that generalise. Q3: Why must you always pass validation_data to model.fit()? A3: Without it you are training blind — you cannot detect overfitting as it happens and have no way to know if the model is truly learning.

Bonus Knowledge

There is a powerful trick called Early Stopping. Instead of manually picking how many epochs to train, you tell Keras to watch the validation loss automatically and stop training the moment it stops improving. This prevents overfitting and saves training time. It is like having a coach tap you on the shoulder and say — you have learned enough, any more practice is just building bad habits. In production AI systems at companies like Netflix and Flipkart, early stopping is used in almost every training pipeline to keep models sharp without wasting compute.

Key Takeaways

Key Takeaways

  • Training a neural network is like learning to ride a bicycle — each epoch makes the model a little less wrong than before.
  • Every epoch has four steps: forward pass, measure loss, backpropagate, and repeat.
  • Always split data into training and validation sets — the validation set is the unfamiliar road that proves real learning.
  • Watch both loss curves: training loss and validation loss dropping together means healthy training.
  • If validation loss rises while training loss drops, the model is overfitting — memorising, not learning.
  • Always pass validation_data to model.fit() — training without it is training completely blind.

← Previous Lesson