The loss curve

Token

The basic unit a model reads and writes. Often a subword, sometimes a word, sometimes a single character.

Tokenization is the first step in any language model pipeline. Whitespace tokenizers split on spaces; BPE tokenizers find a granularity between words and characters by iteratively merging the most frequent adjacent symbol pairs in a training corpus, so common subwords become single tokens.
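The merge-learning loop can be sketched in a few lines. This is a toy illustration on a made-up corpus, not a production tokenizer: real BPE implementations add byte-level fallback, end-of-word markers, and a pre-tokenization pass.

```python
from collections import Counter

def learn_bpe(words, num_merges):
    """Learn BPE merges from a list of words (toy sketch)."""
    # Start with each word as a sequence of single characters.
    vocab = Counter()
    for w in words:
        vocab[tuple(w)] += 1

    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)

        # Replace every occurrence of the best pair with a merged symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

merges = learn_bpe(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)  # frequent pairs like ('l', 'o') get merged first
```

After two merges on this corpus, "low" has become a single symbol, which is exactly the "granularity between words and characters" the definition above describes.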