Skip to content
The loss curve

Validation set

Held-out portion of the data the model never sees during training. Used to estimate metrics like perplexity on data the model can't have memorized.

A typical split is 80% train / 10% validation / 10% test. Validation drives decisions during development (which hyperparameters? when to stop?); test is touched only once at the end to report a final number.