Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, & Guillaume Lample (2023)
LLaMA: Open and Efficient Foundation Language Models. arXiv.
DOI: https://doi.org/10.48550/arXiv.2302.13971
Abstract. Introduces LLaMA, a family of open-weight foundation language models (7B to 65B parameters) trained exclusively on publicly available data. LLaMA demonstrated that smaller models trained on more tokens can match or exceed much larger models (e.g., LLaMA-13B outperforms GPT-3 175B on most benchmarks), refining Chinchilla-style scaling conclusions by optimizing for inference cost rather than training compute alone.
Tags: transformer llama language-models open-source