The loss curve

Hyperparameter

A number that controls training but isn't learned by the optimizer: learning rate, batch size, hidden size, dropout probability, etc. You set hyperparameters yourself; you don't gradient-descent them.
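To make the split concrete, here is a minimal sketch in plain Python. The function `train`, the toy data, and the specific values are all made up for illustration. The weight `w` is a learned parameter that gradient descent updates; `learning_rate` and `num_steps` are hyperparameters, which are passed in and never touched by the update rule.

```python
def train(learning_rate, num_steps):
    """Fit y = w * x to toy data by gradient descent on mean squared error."""
    # Toy data, invented for this sketch; the best-fitting w is 2.
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0, 4.0, 6.0, 8.0]

    w = 0.0  # learned parameter: the optimizer changes this
    for _ in range(num_steps):  # num_steps is a hyperparameter
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # learning_rate is a hyperparameter: it scales the update
        # but is never updated itself.
        w -= learning_rate * grad
    return w


if __name__ == "__main__":
    print(train(learning_rate=0.05, num_steps=50))  # settles near 2
    print(train(learning_rate=0.2, num_steps=50))   # too large: blows up
```

On this toy problem a learning rate of 0.05 converges and 0.2 diverges, which is why hyperparameters are usually chosen by trying several values and comparing the results rather than by the optimizer.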