Glossary

AlphaGeometry 2

AlphaGeometry (Trinh et al., DeepMind, Nature 2024) and its successor AlphaGeometry 2 are neuro-symbolic Olympiad-geometry provers that solve plane-geometry problems by combining a symbolic deductive engine with a neural network that proposes auxiliary constructions. AlphaGeometry 1 solved 25 of 30 IMO geometry problems from 2000–2022, on par with human gold medallists; AlphaGeometry 2 was used in DeepMind's IMO 2024 silver-medal performance to solve Problem 4, the contest's geometry problem.

The architecture has two complementary halves.

The symbolic engine is a forward-chaining deductive system over a fixed library of geometric primitives: collinearity, concyclicity, perpendicularity, similar triangles, and the rest of the standard Olympiad repertoire. Given the hypotheses of a problem, it derives consequences exhaustively until either the goal is proved or no new facts can be derived. The engine is fast and complete within its primitive vocabulary, but limited: many geometry problems require a clever auxiliary construction (drop a perpendicular here, take the midpoint of that segment, draw a circle through these three points) before the deductive engine can reach the conclusion. Without that construction the symbolic engine is stuck.
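The deduce-to-fixed-point idea can be sketched in a few lines. This is a toy illustration, not the real engine: facts are tuples such as `("para", a, b)`, each rule maps the current fact set to newly derivable facts, and all names here (`forward_chain`, `para_transitive`) are hypothetical.

```python
def forward_chain(facts, rules, max_rounds=1000):
    """Apply every rule repeatedly until no new fact appears (a fixed point)."""
    facts = set(facts)
    for _ in range(max_rounds):
        new = set()
        for rule in rules:
            new |= rule(facts) - facts   # keep only genuinely new facts
        if not new:                      # fixed point: closure is complete
            return facts
        facts |= new
    return facts

# Toy rule: parallelism is transitive, para(a,b) and para(b,c) => para(a,c).
def para_transitive(facts):
    paras = {(a, b) for (p, a, b) in facts if p == "para"}
    return {("para", a, c)
            for (a, b) in paras
            for (b2, c) in paras if b == b2}
```

The real engine's rule library is far richer (angles, ratios, circles), but the control structure is the same: saturate, then stop when nothing new can be derived.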

The neural construction proposer is a transformer trained to suggest auxiliary constructions. When the deductive engine reaches a fixed point without proving the goal, the network is queried for the most promising next construction; the engine resumes with the new point, line, or circle added; and the loop continues until a proof is found or the search budget is exhausted. The network was trained from scratch on 100 million synthetic geometry problems generated by random configuration sampling and reverse-engineered solutions, with no human-written training data at all.
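The propose-and-deduce loop described above can be sketched as follows. This is a minimal sketch under stated assumptions: `forward_chain` (a deductive-closure function) and `propose_construction` (the neural proposer) are hypothetical stand-ins passed in as callables.

```python
def prove(hypotheses, goal, forward_chain, propose_construction, budget=10):
    """Alternate symbolic deduction with neural construction proposals."""
    facts = set(hypotheses)
    for _ in range(budget):
        facts = forward_chain(facts)             # deduce to a fixed point
        if goal in facts:
            return True                          # symbolic engine reached the goal
        aux = propose_construction(facts, goal)  # ask the network for help
        if aux is None:
            return False                         # proposer has nothing to offer
        facts |= aux                             # add the new point/line/circle
    return False                                 # budget exhausted
```

The division of labour is visible in the control flow: the verifier-like symbolic engine does all the trusted deduction, and the network only ever injects new objects for it to reason about.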

The synthetic-data pipeline is the technical core. The team generated random configurations of points, lines and circles; ran the symbolic engine to derive all consequences; identified a non-trivial consequence as the "goal"; and recorded the random construction steps that had enabled the goal as the auxiliary-construction training target. This produced a clean (problem-without-construction → required-construction) supervised dataset at essentially unbounded scale. From this dataset alone, the network learns a strong prior over which constructions tend to be useful.
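The key splitting step can be sketched as follows: of the construction steps the goal's proof actually depends on, those whose objects appear in the goal statement become stated hypotheses, and the rest become the auxiliary-construction training target. All names here are illustrative, and steps are represented as plain strings for simplicity.

```python
def make_training_example(steps, goal, dependencies, mentioned):
    """Split a random construction trace into (premises, auxiliary).

    steps        -- ordered construction steps that built the configuration
    goal         -- a non-trivial derived fact chosen as the problem's goal
    dependencies -- the subset of steps the goal's proof actually uses
    mentioned    -- the steps whose objects appear in the goal statement
    """
    needed = [s for s in steps if s in dependencies]
    premises = [s for s in needed if s in mentioned]       # stated hypotheses
    auxiliary = [s for s in needed if s not in mentioned]  # must be invented
    # training pair: (problem without construction) -> required construction
    return (premises, goal), auxiliary
```

For example, if two points A and B and their midpoint M were constructed, and the chosen goal mentions only A and B, then M is exactly the kind of step the proposer must learn to invent.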

AlphaGeometry 2's improvements over the original include: (1) broader coverage in the symbolic engine, including ratio reasoning and locus arguments; (2) a much larger and more capable neural proposer based on Gemini; (3) a better synthetic-data generator producing problems closer to IMO difficulty; (4) integration with AlphaProof for problems that span geometry and algebra.

The system is the cleanest architectural demonstration of neuro-symbolic AI working at the frontier. The symbolic engine handles the deductive work, which it does perfectly; the neural network handles the heuristic work, where deduction is intractable; and they meet at the construction-proposal interface. Neither half could solve Olympiad geometry alone: the deductive engine is too narrow without constructions, and the neural network is too imprecise without verification.

AlphaGeometry's training paradigm (pure synthetic data, a deterministic verifier supplying verifiable rewards, and a neuro-symbolic division of labour) has become a reference design for domain-specific reasoning systems. Variants are now appearing in formal verification, symbolic regression, and SAT solving. The headline lesson: for narrow but well-formalised domains, a hybrid system can match human experts using no human training data at all.

Related terms: AlphaProof Internals, Synthetic Data for Reasoning, Verifiable Rewards, o1 / Reasoning Models, Self-Play on Verifiable Rewards
