The loss curve

Perplexity

The geometric mean of the inverse token probabilities over a sequence. It is low when the model assigns high probability to the tokens it sees, and infinite if any single token is assigned probability zero.

Reported as exp(mean negative log-likelihood), it is a common evaluation metric for language models. A perplexity of 50 means the model is, on average, as uncertain as if it were choosing uniformly among 50 equally likely tokens.
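A minimal sketch of the computation, assuming we are given the probability the model assigned to each observed token (the function name and inputs are illustrative, not from any particular library):

```python
import math

def perplexity(token_probs):
    """Perplexity = exp(mean negative log-likelihood) over a sequence.

    token_probs: the probability the model assigned to each observed token.
    Any zero probability makes the mean NLL infinite, so perplexity is inf.
    """
    if any(p == 0.0 for p in token_probs):
        return math.inf
    mean_nll = sum(-math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)

# A model that assigns 1/50 to every token has perplexity 50,
# matching the "uniform choice among 50 tokens" intuition.
print(perplexity([1 / 50] * 10))
```

Working in log space, as above, avoids the numerical underflow you would get from multiplying many small probabilities directly.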