Vocabulary
The set of all distinct tokens a model can read or produce. Typical sizes range from roughly 80 tokens for character-level models to 100,000 or more for modern subword tokenizers.
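As a rough illustration of the definition, a character-level vocabulary can be sketched as a mapping from each distinct character to an integer id. The function name `build_char_vocab` and the toy corpus are hypothetical and chosen only for this example. Real subword tokenizers build their vocabularies with a learned procedure and not by enumerating characters.

```python
def build_char_vocab(text):
    """Map each distinct character in `text` to an integer id (illustrative only)."""
    symbols = sorted(set(text))  # distinct characters, in a stable order
    return {ch: i for i, ch in enumerate(symbols)}


corpus = "hello world"  # hypothetical toy corpus
vocab = build_char_vocab(corpus)

# Encoding replaces each character with its id from the vocabulary.
ids = [vocab[ch] for ch in corpus]

print(len(vocab))  # vocabulary size: number of distinct characters
print(ids)
```

The vocabulary size here is simply the number of distinct characters seen, which is why character-level vocabularies stay small while subword vocabularies grow much larger.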