Abstract. Argues that many reported emergent capabilities of LLMs are artefacts of nonlinear or discontinuous evaluation metrics. When the same tasks are scored with continuous metrics, the apparent jumps typically dissolve into smooth, predictable improvements with scale.
Tags: scaling, language-models, emergence