References
Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, & Dario Amodei (2020)
arXiv.
DOI: https://doi.org/10.48550/arXiv.2001.08361
Abstract. Establishes empirical scaling laws for language model performance: cross-entropy loss falls as a smooth power law in model parameters, dataset size, and training compute. The paper motivated the training of ever-larger models by demonstrating predictable returns to scale.
Tags: scaling language-models
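The power-law form described in the abstract can be sketched as follows. This is a toy illustration of the paper's parameter-count law, L(N) = (N_c / N)^α_N; the constants are the approximate fitted values reported by Kaplan et al. (α_N ≈ 0.076, N_c ≈ 8.8×10¹³), not exact figures.

```python
# Illustrative sketch of the parameter-count scaling law from
# Kaplan et al. (2020): L(N) = (N_c / N) ** alpha_N.
# Constants are approximate values reported in the paper.

def predicted_loss(n_params, n_c=8.8e13, alpha_n=0.076):
    """Predicted cross-entropy loss (nats/token) for a model
    with n_params non-embedding parameters, data and compute
    assumed non-limiting."""
    return (n_c / n_params) ** alpha_n

# Loss shrinks smoothly and predictably as the model grows.
for n in (1e6, 1e8, 1e10, 1e12):
    print(f"N = {n:.0e}  ->  L = {predicted_loss(n):.3f}")
```

Each 100× increase in parameters multiplies the predicted loss by the same constant factor, which is what makes returns to scale predictable.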