Glossary

AlphaGo

AlphaGo, developed by David Silver and colleagues at DeepMind, is the computer Go program that in March 2016 defeated Lee Sedol, holder of 18 international Go titles, by 4 games to 1 in a five-game match in Seoul. Computer Go had long been considered a benchmark of progress in AI; the game's enormous search space (far more legal positions than there are atoms in the observable universe) had resisted decades of attempts. Most experts in 2014 had predicted that human-level Go play was still a decade away.

AlphaGo combines three components: a policy network (a deep convolutional neural network, or CNN) trained on 30 million human expert moves to predict the next move; a value network trained to predict the eventual game outcome from a board position; and Monte Carlo Tree Search (MCTS), which uses the policy network to focus search on promising moves and the value network to evaluate non-terminal positions. The networks were further refined through reinforcement learning by self-play.
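The way the three components fit together can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not DeepMind's implementation: the game is a toy stone-taking game rather than Go, `policy_net` and `value_net` are untrained placeholders standing in for the real networks, and every name here (`Node`, `select_child`, `simulate`, `best_move`, `c_puct`) is hypothetical. The selection rule is a PUCT-style formula of the kind commonly used with policy-guided MCTS.

```python
import math

# Toy game (an assumption for illustration): a pile of stones, each player
# removes 1 to 3 stones, and whoever takes the last stone wins.
def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def policy_net(stones):
    """Placeholder for the policy network: a uniform prior over legal moves."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}

def value_net(stones):
    """Placeholder for the value network: expected outcome for the player to move."""
    return 0.0  # untrained stand-in that expresses no opinion

class Node:
    def __init__(self, prior):
        self.prior = prior     # prior probability from the policy network
        self.visits = 0        # number of simulations through this node
        self.value_sum = 0.0   # total value, seen by the player who moved into this node
        self.children = {}     # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    """PUCT-style rule: favour high mean value, high prior, and few visits."""
    total = sum(child.visits for child in node.children.values())
    def score(item):
        _, child = item
        bonus = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + bonus
    return max(node.children.items(), key=score)

def simulate(node, stones):
    """Run one simulation and return the value for the player to move at `stones`."""
    if stones == 0:
        value = -1.0  # the opponent took the last stone, so the player to move has lost
    elif not node.children:
        # Leaf: expand using policy priors, evaluate using the value network.
        for move, prior in policy_net(stones).items():
            node.children[move] = Node(prior)
        value = value_net(stones)
    else:
        move, child = select_child(node)
        value = -simulate(child, stones - move)  # the opponent's value, negated
    node.visits += 1
    node.value_sum -= value  # stored from the viewpoint of the player who moved here
    return value

def best_move(stones, n_simulations=2000):
    """Search from the root and return the most-visited move."""
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        simulate(root, stones)
    return max(root.children, key=lambda m: root.children[m].visits)

if __name__ == "__main__":
    print(best_move(5))
```

Even with placeholder networks, the search finds strong moves in this tiny game because it reaches terminal positions directly. In Go that is infeasible, which is why the trained policy network (to narrow the moves considered) and value network (to score positions without playing to the end) carry most of the weight.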

The match was watched by over 200 million viewers worldwide. Move 37 of game 2, a play on the fifth line that no human commentator had considered and that experts initially took for a mistake before its strategic value emerged, immediately became famous as a moment of unexpected machine creativity. Lee Sedol retired from professional Go in 2019, citing as one factor the realisation that AlphaGo's successors would only widen the gap.

AlphaGo Zero (2017) achieved superhuman performance from self-play alone, with no human game data. AlphaZero (2018) extended the same algorithm to chess and shogi: it surpassed the previous state of the art in chess after 4 hours of self-play, in shogi after 2 hours, and in Go after 30 hours (Silver et al., 2017). The lineage continues through MuZero, the AlphaCode programming-competition system, AlphaStar (StarCraft II) and AlphaFold.

Related terms: David Silver, Demis Hassabis, Reinforcement Learning

This site is currently in Beta. Contact: Chris Paton

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).