Generative grammar, introduced by Noam Chomsky in his 1957 Syntactic Structures, is the project of describing the syntax of natural languages by formal recursive rules that can generate (in the mathematical sense, produce as their language) the set of all and only the grammatical sentences. The core formalism, the context-free grammar, consists of a set of rewrite rules that derive surface strings from a start symbol through successive substitutions of non-terminals.
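The rewrite process can be sketched as a few lines of Python. This is a toy illustration, not any parser from the literature: the grammar, symbol names, and `generate` function are all invented here. Each non-terminal maps to alternative right-hand sides, and expansion proceeds recursively until only terminals remain.

```python
import random

# A toy context-free grammar: each non-terminal maps to a list of
# alternative right-hand sides; anything not in the dict is a terminal.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N":   [["cat"], ["dog"]],
    "V":   [["sees"], ["chases"]],
}

def generate(symbol="S", rng=random):
    """Derive a surface string by recursively rewriting non-terminals."""
    if symbol not in GRAMMAR:          # terminal: emit as-is
        return [symbol]
    rhs = rng.choice(GRAMMAR[symbol])  # pick one production for this symbol
    out = []
    for sym in rhs:
        out.extend(generate(sym, rng))
    return out

print(" ".join(generate()))  # e.g. "the cat chases a dog"
```

Because the rules are recursive, a slightly larger grammar (say, one letting an `NP` contain a relative clause with another `S`) would generate an infinite set of sentences from a finite rule set, which is the property Chomsky emphasised.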
The Chomsky hierarchy classifies grammars by the form of their rewrite rules: regular (recognised by finite automata), context-free (pushdown automata), context-sensitive (linear-bounded automata), and unrestricted (Turing machines). Each class strictly contains the one before it. The hierarchy gave computer science its formal theory of programming-language syntax and parsing: almost every programming language's syntax is specified by a context-free grammar.
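The strict containment can be made concrete with the textbook language { aⁿbⁿ : n ≥ 0 }: it is context-free but not regular, because a finite automaton has no memory with which to match the counts. The sketch below (an illustrative recognizer written for this entry, not a standard library function) uses a single counter as a stand-in for a pushdown automaton's stack.

```python
def is_anbn(s):
    """Recognise { a^n b^n : n >= 0 }, a context-free language that
    no regular grammar (finite automaton) can describe."""
    depth = 0        # counter playing the role of the PDA's stack
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:
                return False   # an 'a' after a 'b' is out of order
            depth += 1         # "push"
        elif ch == "b":
            seen_b = True
            depth -= 1         # "pop"
            if depth < 0:
                return False   # more b's than a's so far
        else:
            return False       # alphabet is {a, b} only
    return depth == 0          # stack empty: counts matched

print(is_anbn("aaabbb"))  # True
print(is_anbn("aab"))     # False
```

The unbounded counter is exactly the capability that separates pushdown automata from finite automata; a regular expression such as `a*b*` accepts a superset but cannot enforce equal counts.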
In linguistics, Chomsky argued that natural-language syntax exceeds the expressive power of regular grammars and is approximately context-free, with limited extensions. Successive frameworks, from the Standard Theory (1965) through Government and Binding (1981) to Minimalism (1995), refined the theory of the proposed innate human language faculty.
The relationship of generative grammar to modern statistical and neural approaches to NLP is contested. Chomsky has been a vocal critic of large language models, arguing that they describe linguistic behaviour rather than explain linguistic competence. Empirically, statistical and neural methods have decisively outperformed hand-built grammar-based parsers on practical NLP tasks, but the formal hierarchy remains indispensable for compiler construction and the theoretical analysis of computation.
Related terms: noam-chomsky
Discussed in:
- Chapter 1: What Is AI?, A Brief History of AI