17.8 Search and recommendation

Web search has been a major AI application since well before "AI" was a marketing term. Google's 2015 deployment of RankBrain, a deep-learning system for interpreting queries (particularly long-tail queries), was an early production use of neural retrieval. BERT, introduced in 2018 and fine-tuned for query–document understanding, was rolled into both Google's and Bing's ranking pipelines in 2019. Multistage retrieval remains the standard architecture: a fast candidate generator produces a few hundred candidates, which a more expensive cross-encoder model then re-ranks. The candidate generator is often a sparse retrieval method such as BM25 or a dense two-tower model, in which separate query and document encoders project into a shared embedding space.
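
The two-stage pattern can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any search engine's implementation: the first stage is a small hand-written BM25 scorer (with a Lucene-style non-negative IDF), and the second stage, `overlap_reranker`, is a hypothetical stand-in for a cross-encoder, which in practice would be a neural model reading the query and document together. The names `BM25`, `search`, `DOCS` and the toy corpus are all invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenisation; real systems use far richer analysis.
    return text.lower().split()


class BM25:
    """Stage 1: cheap sparse scorer used to generate candidates."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.doc_tokens = [tokenize(d) for d in docs]
        self.doc_len = [len(t) for t in self.doc_tokens]
        self.avg_len = sum(self.doc_len) / len(docs)
        self.tf = [Counter(t) for t in self.doc_tokens]
        df = Counter()
        for tokens in self.doc_tokens:
            df.update(set(tokens))
        n = len(docs)
        # Lucene-style IDF variant, which never goes negative.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, i):
        total = 0.0
        for term in tokenize(query):
            f = self.tf[i].get(term, 0)
            if f == 0:
                continue
            norm = f + self.k1 * (1 - self.b + self.b * self.doc_len[i] / self.avg_len)
            total += self.idf[term] * f * (self.k1 + 1) / norm
        return total

    def top_k(self, query, k):
        scored = [(self.score(query, i), i) for i in range(len(self.doc_tokens))]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda p: (-p[0], p[1]))
        return [i for _, i in scored[:k]]


def overlap_reranker(query, doc):
    # Placeholder for a cross-encoder: fraction of query terms found in the doc.
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q) if q else 0.0


def search(query, docs, index, n_candidates=100, n_results=10, reranker=overlap_reranker):
    # Stage 1 narrows the corpus; stage 2 spends more compute per candidate.
    candidates = index.top_k(query, n_candidates)
    reranked = sorted(candidates, key=lambda i: (-reranker(query, docs[i]), i))
    return reranked[:n_results]


DOCS = [
    "bm25 is a sparse retrieval method based on term statistics",
    "cross encoder models read the query and document together",
    "recipes for sourdough bread",
    "dense retrieval uses a two tower model with query and document encoders",
]
INDEX = BM25(DOCS)
RESULTS = search("query document encoders", DOCS, INDEX, n_candidates=3, n_results=2)
```

The point of the split is cost: the first stage touches every document but does very little work per document, while the second stage is only ever run on the short candidate list.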

The 2022–2023 LLM wave reshaped search. Bing Chat (announced February 2023, branded Copilot from late 2023) integrated GPT-4 with web retrieval into a conversational interface. Perplexity AI launched in 2022 as a research-focused retrieval-augmented LLM and grew to over 100 million queries per week by 2025. Google's Search Generative Experience (SGE), announced May 2023 and launched as AI Overviews in May 2024, places an LLM-generated summary above the traditional ten blue links for many queries. The technical pattern across all these systems is retrieval-augmented generation: retrieve relevant documents using classical or learned IR methods, condition the LLM on the retrieved passages, and generate a grounded answer.
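
The retrieve, condition, generate loop can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `retrieve` is a toy term-overlap retriever, `build_prompt` shows one possible way to place passages in the prompt, and `echo_llm` is a hypothetical placeholder where a real system would call a language model. None of it reflects how Bing, Perplexity or Google actually implement their products.

```python
def retrieve(query, corpus, k=2):
    # Toy retriever: rank passages by the number of terms shared with the query.
    q_terms = set(query.lower().split())
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(q_terms & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    scored.sort(key=lambda p: (-p[0], p[1]))
    return [doc_id for _, doc_id in scored[:k]]


def build_prompt(query, corpus, doc_ids):
    # Condition the model on the retrieved passages, numbered so it can cite them.
    lines = ["Answer using only the numbered sources below and cite them as [n].", ""]
    for n, doc_id in enumerate(doc_ids, start=1):
        lines.append(f"[{n}] ({doc_id}) {corpus[doc_id]}")
    lines += ["", f"Question: {query}", "Answer:"]
    return "\n".join(lines)


def answer(query, corpus, generate, k=2):
    # `generate` is any callable mapping a prompt string to a completion string.
    doc_ids = retrieve(query, corpus, k)
    prompt = build_prompt(query, corpus, doc_ids)
    return {"answer": generate(prompt), "sources": doc_ids, "prompt": prompt}


def echo_llm(prompt):
    # Hypothetical placeholder for an LLM call; returns a canned string.
    return "stub answer citing [1]"


CORPUS = {
    "doc-a": "retrieval augmented generation conditions a language model on retrieved passages",
    "doc-b": "ten blue links is the traditional search results page",
    "doc-c": "sourdough needs a long fermentation",
}
RESULT = answer("what is retrieval augmented generation", CORPUS, echo_llm)
```

Returning the source identifiers alongside the answer is what lets an interface show citations next to the generated text.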

The economic impact on the open web is substantial and contested. Publishers report click-through-rate decreases for queries that produce AI summaries; some estimates put the reduction at 25–50% for affected query categories. The implications for the business model that has supported the open web, content funded by search-driven advertising, are unresolved at the time of writing.

Recommendation systems power most consumer software products. The matrix-factorisation and content-based-filtering frameworks of the 2000s and 2010s have largely been supplanted in industry by deep learning. The standard pattern in 2026, used by YouTube, Netflix, Spotify, TikTok and Amazon, is a multistage architecture:

  1. Retrieval (candidate generation): a two-tower model encodes user and item representations into a shared space; approximate nearest-neighbour search (libraries such as FAISS and ScaNN, built on index structures such as HNSW and IVF-PQ) returns a few thousand candidates from a corpus of hundreds of millions in single-digit milliseconds.
  2. Ranking: a richer model (gradient-boosted trees in some places, deep cross networks like DCN-V2 in others, transformers in the most modern stacks) scores the candidates using hundreds of features, including user profile, item features, recent interaction history, time of day and device.
  3. Re-ranking: business rules apply diversity, freshness, exploration, and policy constraints (no extremist content, no rapid-repeat exposure to the same creator, etc.).
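
The three stages above can be sketched as a toy Python pipeline. This is a simplified illustration under several assumptions, not any platform's production design: retrieval uses an exact dot-product scan where a real system would use approximate nearest-neighbour search, the ranker is a two-feature linear scorer standing in for a learned model, and the re-ranker applies just two example rules (a policy blocklist and a per-creator cap). All function names, vectors, feature names and weights are invented for the example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def retrieve_candidates(user_vec, item_vecs, k):
    # Stage 1: top-k items by user-item dot product in the shared space.
    # Exact scan here; production systems use an ANN index instead.
    ranked = sorted(item_vecs, key=lambda item: (-dot(user_vec, item_vecs[item]), item))
    return ranked[:k]


def rank_candidates(candidates, user_vec, item_vecs, features, weights):
    # Stage 2: toy linear scorer combining similarity with one side feature.
    def score(item):
        similarity = dot(user_vec, item_vecs[item])
        return (weights["similarity"] * similarity
                + weights["freshness"] * features[item]["freshness"])
    return sorted(candidates, key=lambda item: (-score(item), item))


def apply_rules(ranked, features, blocked, max_per_creator, n):
    # Stage 3: drop policy-blocked items and cap exposure per creator.
    out, per_creator = [], {}
    for item in ranked:
        creator = features[item]["creator"]
        if item in blocked or per_creator.get(creator, 0) >= max_per_creator:
            continue
        out.append(item)
        per_creator[creator] = per_creator.get(creator, 0) + 1
        if len(out) == n:
            break
    return out


USER = [1.0, 0.0]
ITEM_VECS = {
    "a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.7, 0.0],
    "d": [0.1, 0.9], "e": [0.6, 0.3],
}
FEATURES = {
    "a": {"creator": "x", "freshness": 0.0},
    "b": {"creator": "x", "freshness": 0.5},
    "c": {"creator": "y", "freshness": 0.1},
    "d": {"creator": "z", "freshness": 1.0},
    "e": {"creator": "y", "freshness": 0.8},
}
WEIGHTS = {"similarity": 1.0, "freshness": 0.5}

CANDIDATES = retrieve_candidates(USER, ITEM_VECS, k=4)
RANKED = rank_candidates(CANDIDATES, USER, ITEM_VECS, FEATURES, WEIGHTS)
RECS = apply_rules(RANKED, FEATURES, blocked={"e"}, max_per_creator=1, n=3)
```

In this toy run the final list is shorter than requested because the rules remove items faster than the small candidate pool can replace them, which is one reason real retrieval stages return thousands of candidates rather than a handful.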

YouTube's recommendation system, described publicly in 2016 (Covington, Adams and Sargin, RecSys 2016) and refined since, reportedly drives over 70% of watch time. Netflix's Cinematch and its successors save the company an estimated $1 billion per year in reduced churn. TikTok's For You algorithm, often described as a state-of-the-art transformer-based sequential recommender, has been the platform's defining differentiator. The Pixie graph-based recommender at Pinterest, the SASRec-style transformer at Spotify, and the various conversational recommender efforts at Amazon all sit in the same family.

The societal effects of recommendation systems are extensively debated. Engagement-optimised feeds tend to over-promote sensational, polarising and adversarial content because such content drives clicks. The EU's Digital Services Act, fully applicable since February 2024 (with obligations for the largest platforms applying from August 2023), requires very large online platforms to explain their recommender logic, offer at least one non-profiling option, and conduct annual systemic-risk assessments. In the US, the Section 230 framework has been challenged in cases including Gonzalez v. Google (2023), which the Supreme Court declined to use to revise platform liability. Both the technical and the regulatory frontiers here are moving fast.

This site is currently in Beta. Contact: Chris Paton

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).