The loss curve

Probability distribution

A list of non-negative numbers that sum to 1, one per possible outcome. At each step, a language model's output is one such distribution over the vocabulary: a probability for every candidate next token.
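As a rough illustration of the definition, the sketch below turns a list of raw scores into a distribution and checks the two defining properties. It assumes softmax as the normalization step, which is the common choice for language models but is not stated in this entry. The four-word vocabulary, the scores, and the helper names `softmax` and `is_distribution` are made up for the example.

```python
import math

def softmax(logits):
    """Turn arbitrary real scores into a probability distribution."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_distribution(probs, tol=1e-9):
    """Check the definition: every entry is non-negative and they sum to 1."""
    return all(p >= 0 for p in probs) and abs(sum(probs) - 1.0) <= tol

# Hypothetical vocabulary and scores, for illustration only.
vocab = ["the", "cat", "sat", "mat"]
logits = [2.0, 1.0, 0.1, -1.0]

probs = softmax(logits)  # one probability per vocabulary entry

for token, p in zip(vocab, probs):
    print(f"{token}: {p:.3f}")
print("valid distribution:", is_distribution(probs))
```

The list has exactly one entry per outcome, so its length always matches the vocabulary size. A list such as `[0.5, 0.6]` fails the check because it sums to more than 1.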