Hierarchical Navigable Small World (HNSW) is an approximate-nearest-neighbour search algorithm introduced by Malkov and Yashunin (2018). It is the dominant graph-based ANN method in production retrieval systems, available natively in FAISS, pgvector, Milvus, Qdrant, Weaviate, and Elasticsearch.
Idea. Build a multi-layer graph that resembles a probabilistic skip list. The bottom layer (layer $0$) contains all $N$ database points connected to their close neighbours; each higher layer contains an exponentially decreasing fraction of points. Search begins at a single entry point in the top layer, greedily descends towards the query, and refines at the bottom layer. Long links in the upper layers cover large distances quickly, giving empirically near-logarithmic search complexity.
Construction. For each new point $\mathbf{x}$:
Sample the maximum layer for $\mathbf{x}$ from a geometric-like distribution with normalisation constant $m_L = 1/\ln M$: $$\ell = \lfloor -\ln(U(0,1)) \cdot m_L \rfloor.$$ Most points get $\ell = 0$; a fraction $M^{-k}$ of points reaches layer $k$ or above.
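A minimal sketch of the level draw (the helper name `sample_level` is illustrative, not from the paper):

```python
import math
import random

def sample_level(M: int, rng: random.Random) -> int:
    """Draw a node's maximum layer: floor(-ln(U) * mL) with mL = 1/ln(M)."""
    m_L = 1.0 / math.log(M)
    return int(-math.log(rng.random()) * m_L)

rng = random.Random(0)
M = 16
levels = [sample_level(M, rng) for _ in range(100_000)]
# P(level >= k) = M^{-k}: about 1/16 of points reach layer 1 or higher
print(sum(lv >= 1 for lv in levels) / len(levels))
```

With $M = 16$ the hierarchy decimates by a factor of 16 per layer, so even billion-scale indices have only a handful of layers.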
From the current entry point at the top, greedily traverse each layer until reaching layer $\ell + 1$, recording the closest node found.
From layer $\ell$ down to layer $0$, find the $\mathrm{efConstruction}$ closest points already in the graph, then connect $\mathbf{x}$ to up to $M$ of them per layer (with a heuristic that prefers diverse neighbours to avoid clustering artefacts).
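The diversity heuristic can be sketched as follows; `select_neighbours` is a hypothetical name, and this simplified version drops the paper's optional "keep pruned connections" extension:

```python
import numpy as np

def select_neighbours(q, candidates, vecs, M):
    """Shrink a candidate list to at most M diverse neighbours: keep a
    candidate only if it is closer to the new point q than to every
    neighbour already selected (simplified form of the paper's heuristic)."""
    order = sorted(candidates, key=lambda c: np.linalg.norm(vecs[c] - q))
    selected = []
    for c in order:
        d_cq = np.linalg.norm(vecs[c] - q)
        if all(d_cq < np.linalg.norm(vecs[c] - vecs[s]) for s in selected):
            selected.append(c)
        if len(selected) == M:
            break
    return selected

# Two clustered points and one in a different direction: the heuristic keeps
# one point per direction rather than both members of the cluster.
vecs = np.array([[1.0, 0.0], [1.05, 0.05], [0.0, 1.2]])
print(select_neighbours(np.zeros(2), [0, 1, 2], vecs, M=2))  # → [0, 2]
```

Pruning the near-duplicate candidate (index 1) is what keeps the graph navigable in different directions instead of piling all edges into one cluster.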
Search. Given query $\mathbf{q}$:
- Start at the entry point on the top layer.
- At each layer, perform a greedy walk: examine the neighbours of the current node, move to the one closest to $\mathbf{q}$, repeat until no neighbour is closer.
- Drop down a layer and continue. At layer $0$, run a beam search of size $\mathrm{efSearch}$ to gather a candidate pool, then return the top $k$.
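The layer-$0$ step can be sketched as a best-first beam search over a proximity graph. For brevity the toy graph below is built by brute-force $k$-NN rather than real HNSW insertion, and all names are illustrative:

```python
import heapq
import numpy as np

def beam_search(graph, vecs, entry, q, ef):
    """Best-first search with a bounded result pool of size ef (layer-0 step)."""
    d = lambda i: float(np.linalg.norm(vecs[i] - q))
    visited = {entry}
    candidates = [(d(entry), entry)]   # min-heap: closest candidate first
    results = [(-d(entry), entry)]     # max-heap of the ef best found so far
    while candidates:
        dist, node = heapq.heappop(candidates)
        if dist > -results[0][0] and len(results) >= ef:
            break  # nearest remaining candidate cannot improve the pool
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = d(nb)
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # evict the current worst result
    return sorted((-nd, i) for nd, i in results)

# Toy proximity graph: each point linked to its 8 nearest neighbours.
rng = np.random.default_rng(0)
vecs = rng.standard_normal((200, 2))
graph = {i: list(np.argsort(np.linalg.norm(vecs - vecs[i], axis=1))[1:9])
         for i in range(len(vecs))}
q = rng.standard_normal(2)
found = beam_search(graph, vecs, entry=0, q=q, ef=32)
print(found[0])  # (distance, index) of the best point the beam reached
```

Setting $\mathrm{ef} = 1$ recovers the pure greedy walk used on the upper layers; larger values widen the frontier and raise recall at the cost of more distance computations.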
Hyperparameters.
- $M$: maximum out-degree per node (typical 16–48). Larger $M$ costs more memory and improves recall.
- $\mathrm{efConstruction}$: candidate pool size during build (typical 100–500). Higher improves graph quality.
- $\mathrm{efSearch}$: candidate pool size at query time. Tunes the recall–latency trade-off.
Complexity. Empirical search time scales as $O(\log N)$ for moderate dimensions, with the constant growing with $d$. Memory is roughly $O(N(d + M))$: $Nd$ floats for the vectors plus $O(NM)$ edge identifiers for the graph, all resident in RAM. This is HNSW's main weakness for very large indices, which is why IVF-PQ is preferred above $\sim 10^9$ items.
Theoretical underpinnings. The "navigable small world" property ensures short paths exist between any two nodes, while the probabilistic layering produces a decimation hierarchy reminiscent of skip lists. Together they give the algorithm its near-logarithmic behaviour in practice, despite no tight worst-case bound.
HNSW is a strong default for million-to-hundred-million-scale dense retrieval where memory is plentiful and high recall matters.
Discussed in:
- Chapter 14: Generative Models, Retrieval