Skip to content
The loss curve

Tokenizer

The function that splits raw text into tokens. Ranges from naive whitespace + lowercase to learned schemes like BPE.