Deep Double Descent: Where Bigger Models and More Data Hurt
Authors
Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever
Venue
ICLR 2020
Abstract
Demonstrates that double descent occurs across model size, training epochs, and dataset size in modern deep networks. Introduces effective model complexity (EMC) as a unifying measure of these phenomena and identifies regimes where more data hurts test performance.
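Note: the paper's unifying quantity, effective model complexity, is (paraphrasing its Definition 1) the largest sample size at which a training procedure still reaches train error at most a small threshold \varepsilon:

    \mathrm{EMC}_{\mathcal{D},\varepsilon}(\mathcal{T}) = \max\{\, n \mid \mathbb{E}_{S \sim \mathcal{D}^n}[\mathrm{Error}_S(\mathcal{T}(S))] \le \varepsilon \,\}

where \mathcal{D} is the data distribution, \mathcal{T}(S) is the model trained on sample S, and \mathrm{Error}_S(M) is the mean train error of model M on S. Test error peaks when EMC is close to the number of training samples (the interpolation threshold), which is why adding data can move a model into this critical regime and hurt.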
Tags
Links
https://arxiv.org/abs/1912.02292
BibTeX
Local Entry
@inproceedings{nakkiran2020doubledescent,
title = {Deep Double Descent: Where Bigger Models and More Data Hurt},
author = {Preetum Nakkiran and Gal Kaplun and Yamini Bansal and Tristan Yang and Boaz Barak and Ilya Sutskever},
year = {2020},
booktitle = {International Conference on Learning Representations (ICLR)},
url = {https://arxiv.org/abs/1912.02292},
abstract = {Demonstrates that double descent occurs across model size, training epochs, and dataset size in modern deep networks. Introduces effective model complexity to unify these phenomena and shows regimes where more data hurts.}
}
From OpenAlex
@article{nakkiran2021doubledescent,
title = {Deep Double Descent: Where Bigger Models and More Data Hurt},
author = {Preetum Nakkiran and Gal Kaplun and Yamini Bansal and Tristan Yang and Boaz Barak and Ilya Sutskever},
year = {2021},
journal = {Journal of Statistical Mechanics: Theory and Experiment},
doi = {10.1088/1742-5468/ac3a74}
}