Shared References

Scaling Laws for Neural Language Models

2020 article kaplan2020scaling
Authors
Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
Venue
arXiv preprint
Abstract
Establishes empirical power-law scaling relationships between language model cross-entropy loss and model size, dataset size, and training compute, with some trends spanning more than seven orders of magnitude.