The Learning Curve

Module 5 — Machine Learning for Time Series Forecasting
Created by Dr. Pedram Jahangiry | Enhanced with Claude

Do We Need to Collect More Data?

A learning curve plots model performance (error) against training set size. As we add more training data, the training error typically increases (harder to fit more points perfectly) while the validation error typically decreases (more data = better generalization).

The gap between the two curves tells you everything: both curves converging at high error with a small gap = high bias (underfitting); low training error with a large gap to validation error = high variance (overfitting); both curves converging at low error = just right.
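The idea above can be sketched in a few lines. This is an illustrative example (not from the slides), assuming a synthetic cubic series and a plain numpy polynomial fit; in practice scikit-learn's `sklearn.model_selection.learning_curve` computes the same quantities with cross-validation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy cubic, y = x^3 - x + noise
x = rng.uniform(-2, 2, 200)
y = x**3 - x + rng.normal(0, 0.5, 200)

# Hold out the last 50 points for validation
x_tr_all, y_tr_all = x[:150], y[:150]
x_val, y_val = x[150:], y[150:]

def mse(coefs, xs, ys):
    return np.mean((np.polyval(coefs, xs) - ys) ** 2)

# Refit the same (cubic) model on growing training subsets
sizes = [10, 30, 60, 100, 150]
train_err, val_err = [], []
for n in sizes:
    coefs = np.polyfit(x_tr_all[:n], y_tr_all[:n], deg=3)
    train_err.append(mse(coefs, x_tr_all[:n], y_tr_all[:n]))
    val_err.append(mse(coefs, x_val, y_val))

for n, tr, va in zip(sizes, train_err, val_err):
    print(f"n={n:3d}  train={tr:.3f}  val={va:.3f}")
```

Plotting `train_err` and `val_err` against `sizes` gives exactly the learning curve described above: the training error creeps up toward the noise floor while the validation error comes down to meet it.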

[Interactive figure: polynomial degree selector, Degree 0 (Constant) through Degree 5 (Quintic), with Degree 3 (Cubic) selected. Panels: "Learning Curve — Error vs Training Set Size" and "Current Model Fit (One CV Fold)".]


High Bias (Underfitting)

Both curves converge to a high error. The gap between them is small. Adding more data won't help — the model is too simple. Fix: increase model complexity, add features.
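A quick way to see the high-bias pattern numerically, as a sketch under the same hypothetical noisy-cubic setup as before: a degree-0 model (predict the training mean) has high error on both sets, and the gap barely moves as the training set grows.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, 200)
y = x**3 - x + rng.normal(0, 0.5, 200)  # hypothetical noisy cubic
x_tr, y_tr, x_va, y_va = x[:150], y[:150], x[150:], y[150:]

# Degree-0 "model": predict the training mean, at a small and a large training size
for n in (20, 150):
    pred = np.mean(y_tr[:n])
    tr = np.mean((y_tr[:n] - pred) ** 2)
    va = np.mean((y_va - pred) ** 2)
    print(f"n={n:3d}  train={tr:.2f}  val={va:.2f}  gap={va - tr:.2f}")
```

Both errors sit near the variance of y itself no matter how much data you add, which is the signature of underfitting.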

High Variance (Overfitting)

Training error is low but validation error is high. The gap between them is large. Adding more data can help — it gives the model less room to memorize. Fix: more data, reduce complexity, regularize.
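The high-variance pattern can be sketched with the same hypothetical noisy cubic: a degree-9 polynomial fit to only 20 points drives the training error far below the noise floor while the validation error stays well above it.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(-2, 2, 40))
y = x**3 - x + rng.normal(0, 0.5, 40)  # hypothetical noisy cubic
x_tr, y_tr = x[::2], y[::2]    # 20 training points
x_va, y_va = x[1::2], y[1::2]  # 20 validation points

def errors(deg):
    coefs = np.polyfit(x_tr, y_tr, deg)
    tr = np.mean((np.polyval(coefs, x_tr) - y_tr) ** 2)
    va = np.mean((np.polyval(coefs, x_va) - y_va) ** 2)
    return tr, va

tr3, va3 = errors(3)  # cubic: about right for this data
tr9, va9 = errors(9)  # degree 9 on 20 points: memorizes the noise
print(f"cubic:  train={tr3:.3f}  val={va3:.3f}")
print(f"deg 9:  train={tr9:.3f}  val={va9:.3f}")
```

The degree-9 model's near-zero training error with a much larger validation error is the gap the slide describes; doubling the training data (or dropping back to degree 3, or regularizing) shrinks it.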