Data Leverage References


Scaling Laws for Neural Language Models

2020 · article · kaplan2020scaling · Not yet verified
Authors
Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
Venue
arXiv preprint
Abstract
Establishes power-law scaling relationships between language model performance and model size, dataset size, and compute, spanning seven orders of magnitude.
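The power-law relationship summarized in the abstract can be sketched in a few lines. This is an illustrative sketch only: the form L(N) = (N_c / N)^α_N and the fitted constants (α_N ≈ 0.076, N_c ≈ 8.8e13 non-embedding parameters) are quoted from the paper, and the function name is chosen here for clarity rather than taken from any released code.

```python
def loss_from_params(n_params: float,
                     n_c: float = 8.8e13,
                     alpha_n: float = 0.076) -> float:
    """Predicted cross-entropy loss (nats/token) for a model with
    n_params non-embedding parameters, assuming data and compute
    are not the bottleneck: L(N) = (N_c / N)^alpha_N.

    Constants are the fits reported in the paper; this sketch is
    for illustration, not a reproduction of the authors' code."""
    return (n_c / n_params) ** alpha_n


if __name__ == "__main__":
    # Loss falls slowly but steadily as parameter count grows.
    for n in (1e6, 1e8, 1e10, 1e12):
        print(f"N = {n:.0e}: predicted L = {loss_from_params(n):.3f}")
```

Because the exponent α_N is small, each 100× increase in parameters shrinks the predicted loss by only a modest constant factor, which is the sense in which performance scales smoothly across the seven orders of magnitude the abstract mentions.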

BibTeX

Local Entry
@article{kaplan2020scaling,
  title = {Scaling Laws for Neural Language Models},
  author = {Jared Kaplan and Sam McCandlish and Tom Henighan and Tom B. Brown and Benjamin Chess and Rewon Child and Scott Gray and Alec Radford and Jeffrey Wu and Dario Amodei},
  year = {2020},
  journal = {arXiv preprint},
  url = {https://arxiv.org/abs/2001.08361},
  abstract = {Establishes power-law scaling relationships between language model performance and model size, dataset size, and compute, spanning seven orders of magnitude.}
}
From OpenAlex
@article{kaplan2020scaling,
  title = {Scaling Laws for Neural Language Models},
  author = {Jared Kaplan and Sam McCandlish and Tom Henighan and T. B. Brown and Benjamin Chess and Rewon Child and Scott Gray and Alec Radford and Jeffrey Wu and Dario Amodei},
  year = {2020},
  journal = {arXiv (Cornell University)},
  doi = {10.48550/arxiv.2001.08361}
}