The alternating assign-and-update loop converges three centres into clusters.
From the chapter: Chapter 8: Unsupervised Learning
Glossary: k means clustering, lloyd algorithm, kmeans plusplus
People: stuart lloyd
Transcript
k-means is the simplest unsupervised clustering algorithm. It partitions points into k groups by alternating two steps.
Sixty data points appear on the canvas, along with three random cluster centres. Initially, no point belongs to any cluster.
Step one. Each point is assigned to the nearest centre by Euclidean distance, and inherits the colour of that centre.
Step two. Each centre slides to the mean position of the points assigned to it. The faint lines show how the points pull their centre into place.
Repeat. Reassign every point, recompute every centre. The centres drift into the dense regions of data.
After a handful of iterations no assignments change and the centres come to rest. The algorithm has converged.
k-means always converges, but only to a local optimum. The result depends on the initial centres, so the algorithm is typically run several times and the best result kept.