Data Leverage References

The Pile: An 800GB Dataset of Diverse Text for Language Modeling

2021 · article · pile_paper · Not found (auto)

This entry is not indexed in the checked database. It may be too new, non-academic, or listed under a different identifier.

Authors
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, Charles Foster, Jason Phang, Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
Venue
CoRR

BibTeX

Local Entry
@article{pile_paper,
  title = {The Pile: An 800GB Dataset of Diverse Text for Language Modeling},
  author = {Leo Gao and Stella Biderman and Sid Black and Laurence Golding and Travis Hoppe and Charles Foster and Jason Phang and Horace He and Anish Thite and Noa Nabeshima and Shawn Presser and Connor Leahy},
  year = {2021},
  journal = {CoRR},
  url = {https://arxiv.org/abs/2101.00027},
  eprint = {2101.00027},
  archiveprefix = {arXiv},
  volume = {abs/2101.00027}
}
External Source

Not found in external databases.