Scaling law
Empirical observation that model quality improves predictably with parameters, training tokens, and compute. The Chinchilla rule: optimal tokens ≈ 20 × parameters.
Continue
Empirical observation that model quality improves predictably with parameters, training tokens, and compute. The Chinchilla rule: optimal tokens ≈ 20 × parameters.
Continue