6.S191 · MIT · 2024

Introduction to Deep Learning

with Alexander Amini, Ava Amini

Official course page →

Your progress in this browser

Lectures · 0 / 8 watched

Quiz · 0 / 6 correct

Progress is stored in this browser only — there is no account, no login, and no database. Clearing your browser data will reset it.

About the course

MIT 6.S191 is a one-week intensive that has been refreshed every year since 2017 by Alexander and Ava Amini. The 2024 edition is the version we link to here. It is the fastest path from "what is a neural network" to running a transformer or a diffusion model: six core lectures of roughly an hour each, plus three to four guest lectures from industry researchers, along with PyTorch / TensorFlow lab sessions that you can run yourself in Colab.

The pacing is unforgiving: six lectures cover what CS229 covers in fifteen. The trade-off is that you reach modern architectures (transformers, diffusion, RLHF) by the end of the week. The course is best treated as a second pass over the material, taken after you have read our neural-networks, training-optimisation, CNNs, and sequence-models chapters. Do that, and the lectures fit together as a sequence of "here is how this is done in 2024".

Watch the lectures

Open the full playlist on YouTube →

Syllabus

Tick lectures as you finish them. Your ticks live in this browser only. Short code sketches of each core topic follow the syllabus.

  1. Alexander Amini

    What deep learning actually is — universal approximation, the perceptron, backpropagation, gradient descent, why scale matters.

  2. Ava Amini

    RNNs, LSTMs, the vanishing-gradient problem, and the transition to attention-based models.

  3. Alexander Amini

    Convolutional networks, feature maps, classical architectures, transfer learning, and modern hybrids with vision transformers.

  4. Ava Amini

    Autoencoders, variational autoencoders, GANs, and diffusion models. Likelihood-based vs implicit generative families.

  5. Alexander Amini

    Markov decision processes, Q-learning, policy gradients, deep RL. Application to game playing and robotics.

  6. Ava Amini

    From n-grams to transformers to GPT-4. Pretraining, fine-tuning, RLHF, in-context learning, hallucination.

  7. Ava Amini

    Adversarial examples, calibration, uncertainty quantification, bias, the practical limits of current models.

  8. Ava Amini

    AlphaFold and ESM. Sequence and structure models for proteins. Why biology was a natural fit for transformers.
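
Code sketches

To make the syllabus concrete, each core topic gets a minimal, runnable sketch below. These are illustrative toys written for this page, not the course labs: every model, shape, and constant is made up for the example. They use PyTorch, one of the two frameworks the labs are built on.

For lecture 1, a single sigmoid neuron (a perceptron, up to the choice of activation) trained on the OR function with plain gradient descent via backpropagation:

    import torch

    X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = torch.tensor([[0.], [1.], [1.], [1.]])        # OR labels

    w = torch.zeros(2, 1, requires_grad=True)         # weights
    b = torch.zeros(1, requires_grad=True)            # bias

    for step in range(500):
        p = torch.sigmoid(X @ w + b)                  # forward pass
        loss = torch.nn.functional.binary_cross_entropy(p, y)
        loss.backward()                               # backpropagation
        with torch.no_grad():                         # one gradient-descent step
            w -= 0.5 * w.grad
            b -= 0.5 * b.grad
            w.grad.zero_()
            b.grad.zero_()

    print(p.detach().round().squeeze())               # tensor([0., 1., 1., 1.])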
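
For lecture 2, the vanishing-gradient problem in one picture: unroll a toy recurrent chain h_t = tanh(w · h_{t-1}) for 50 steps and measure how much gradient survives back to the first input. The 50 steps and w = 0.5 are arbitrary choices for the demonstration:

    import torch

    T = 50                                    # time steps to unroll
    w = torch.tensor(0.5)                     # recurrent weight
    x0 = torch.tensor(1.0, requires_grad=True)

    h = x0
    for _ in range(T):
        h = torch.tanh(w * h)                 # h_t = tanh(w * h_{t-1})

    h.backward()                              # backprop through all 50 steps
    print(x0.grad)                            # on the order of 1e-15: almost no signal left

Each step multiplies the gradient by roughly w · (1 − tanh²) < 0.5, so fifty steps shrink it by a factor of about 2⁻⁵⁰. Gated architectures such as the LSTM exist to keep this product from collapsing.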
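
For lecture 3, why convolution wins on images: a conv layer shares one small kernel across every spatial position, so its parameter count is independent of image size. The 224 × 224 input and the channel counts are arbitrary:

    import torch
    import torch.nn as nn

    img = torch.randn(1, 3, 224, 224)                 # batch, RGB channels, height, width
    conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
    fc = nn.Linear(3 * 224 * 224, 16)                 # fully-connected comparison

    print(conv(img).shape)                            # torch.Size([1, 16, 224, 224]) feature maps
    print(sum(p.numel() for p in conv.parameters()))  # 448 parameters
    print(sum(p.numel() for p in fc.parameters()))    # 2,408,464 parameters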
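
For lecture 4, the reparameterisation trick that separates a variational autoencoder from a plain one: the encoder outputs a distribution (a mean and a log-variance) rather than a point, and sampling is rewritten so gradients can flow through it. The tensors standing in for encoder outputs are random:

    import torch

    mu = torch.randn(4, 8, requires_grad=True)        # stand-in encoder means
    logvar = torch.randn(4, 8, requires_grad=True)    # stand-in encoder log-variances

    eps = torch.randn_like(mu)                        # noise from N(0, I)
    z = mu + torch.exp(0.5 * logvar) * eps            # z ~ N(mu, sigma^2), differentiable

    recon = z.pow(2).mean()                           # stand-in for a reconstruction loss
    # KL divergence from N(mu, sigma^2) to the N(0, I) prior, per example
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)

    (recon + kl.mean()).backward()                    # gradients reach mu and logvar
    print(mu.grad.shape)                              # torch.Size([4, 8])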
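
For lecture 5, one tabular Q-learning step: after observing a transition (s, a, r, s′), Q(s, a) is nudged toward the bootstrapped target r + γ · max over a′ of Q(s′, a′). The state and action counts, α, and γ are arbitrary here:

    import numpy as np

    n_states, n_actions = 5, 2
    Q = np.zeros((n_states, n_actions))               # tabular Q-function
    alpha, gamma = 0.1, 0.99                          # learning rate, discount factor

    def q_update(s, a, r, s_next):
        """One Bellman update after observing transition (s, a, r, s_next)."""
        target = r + gamma * Q[s_next].max()          # bootstrapped return estimate
        Q[s, a] += alpha * (target - Q[s, a])         # move Q(s, a) toward the target

    q_update(s=0, a=1, r=1.0, s_next=2)
    print(Q[0])                                       # [0.  0.1]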
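
For lecture 6, the single operation the transformer material builds on: scaled dot-product attention, softmax(QKᵀ / √d) · V. The token count and embedding width are arbitrary:

    import torch
    import torch.nn.functional as F

    def attention(q, k, v):
        """softmax(Q K^T / sqrt(d)) V for inputs of shape (seq_len, d)."""
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5   # pairwise similarities
        return F.softmax(scores, dim=-1) @ v          # weighted mix of the values

    x = torch.randn(10, 64)                           # 10 tokens, 64-dim embeddings
    out = attention(x, x, x)                          # self-attention: Q = K = V = x
    print(out.shape)                                  # torch.Size([10, 64])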
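
For lecture 7, the simplest adversarial attack, the fast gradient sign method: nudge every pixel by ε in the direction that increases the loss. The one-layer classifier and the random "image" are stand-ins:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # stand-in classifier
    x = torch.rand(1, 1, 28, 28, requires_grad=True)              # stand-in image in [0, 1]
    y = torch.tensor([3])                                         # its true label

    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()                                               # gradient of loss w.r.t. pixels

    eps = 0.1                                                     # perturbation budget
    x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()        # adversarial example
    print((x_adv - x.detach()).abs().max())                       # <= eps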

Self-assessment

A short multiple-choice quiz. Click an option to commit it; the correct answer and an explanation then appear. Your answers are remembered in this browser.

  1. A standard feed-forward neural network with no hidden layers, sigmoid output, and cross-entropy loss is mathematically equivalent to:

  2. The vanishing-gradient problem in deep networks is essentially:

  3. A convolutional layer is preferred over a fully-connected layer for images mainly because:

  4. A variational autoencoder differs from a plain autoencoder in that it:

  5. In Q-learning, the Bellman update for $Q(s, a)$ after observing transition $(s, a, r, s')$ is:

  6. In-context learning — a large language model solving a new task from a prompt with a few examples — is best described as:

This site is currently in Beta. Contact: Chris Paton

Textbook of Usability · Textbook of Digital Health

Auckland Maths and Science Tutoring

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).