Deep Learning, Glossary, Textbook of AI

Also known as: DL

Deep Learning is the branch of machine learning that uses artificial neural networks with many layers (hence "deep") to learn hierarchical representations of data. Each layer transforms its input into a progressively more abstract representation: early layers might detect edges in an image, middle layers combine edges into parts, and deeper layers recognise whole objects. The defining advantage of deep learning is that these features are learned automatically from raw data, largely removing the need for the hand-engineered features that dominated earlier machine learning.

The modern era of deep learning began around 2006 with advances in training algorithms for deep networks and accelerated dramatically after 2012, when a deep convolutional network (AlexNet) won the ImageNet competition by a startling margin. Three converging forces drove this progress: vast quantities of digital data, GPUs capable of the massive parallel matrix multiplications neural networks require, and algorithmic innovations, first ReLU activations and dropout, then batch normalisation and residual connections.

Deep learning now dominates computer vision, natural language processing, speech recognition, game playing, and protein structure prediction. Its flexibility has made it the most general-purpose learning paradigm yet discovered: the same basic building blocks (matrix multiplication plus a nonlinearity) can be composed into architectures for almost any modality. Deep learning is a strict subset of machine learning (ML), which is in turn a subset of artificial intelligence (AI): the three are best visualised as concentric circles.
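
To make the "matrix multiplication plus nonlinearity" building block concrete, here is a minimal sketch of a forward pass through a small stack of layers, written in plain Python with no external libraries. The helper names (relu, dense, init_layer, forward), the layer sizes and the random initialisation are all illustrative assumptions rather than anything prescribed by the text. The weights are untrained, so the output is meaningless; the point is only to show how layers compose.

```python
import random

def relu(v):
    """Element-wise rectified linear unit: max(0, x)."""
    return [max(0.0, x) for x in v]

def dense(x, weights, bias):
    """Affine map W @ x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    """Hypothetical initialiser: small random weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias

def forward(x, layers):
    """Apply each layer in turn, with ReLU after all but the last."""
    for i, (weights, bias) in enumerate(layers):
        x = dense(x, weights, bias)
        if i < len(layers) - 1:
            x = relu(x)
    return x

rng = random.Random(0)          # fixed seed so runs are repeatable
sizes = [4, 8, 8, 3]            # assumed sizes: 4 inputs, two hidden layers, 3 outputs
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

output = forward([0.2, -1.0, 0.5, 0.3], layers)
print(len(output))              # prints 3
```

Making the network "deeper" here just means adding entries to sizes. Real systems would use an array library and GPU kernels for the matrix products, and would learn the weights by gradient descent rather than leaving them random.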

Discussed in:

Chapter 9: Neural Networks, 9.5 Network Architectures
