6.S191 · MIT · 2024

Introduction to Deep Learning

with Alexander Amini, Ava Amini

Official course page →

Your progress in this browser

Lectures · 0 / 8 watched

Quiz · 0 / 6 correct

Progress is stored in this browser only — there is no account, no login, and no database. Clearing your browser data will reset it.

About the course

MIT 6.S191 is a one-week intensive that has been refreshed every year since 2017 by Alexander and Ava Amini. The 2024 edition is the version we link to here. It is the fastest path from "what is a neural network" to running a transformer or a diffusion model: six core lectures of roughly an hour each, plus three to four guest lectures from industry researchers, along with PyTorch / TensorFlow lab sessions that you can run yourself in Colab.

The pacing is unforgiving: six lectures cover what CS229 covers in fifteen. The trade-off is that you reach modern architectures (transformers, diffusion, RLHF) by the end of the week. The course is best treated as a second pass over the material, taken after you have read our neural-networks, training-optimisation, CNNs, and sequence-models chapters. Do that, and the lectures fit together as a sequence of "here is how this is done in 2024".

Watch the lectures

Open the full playlist on YouTube →

Syllabus

Tick lectures as you finish them. Your ticks live in this browser only. Short code sketches of each core topic follow the syllabus.

  1. Alexander Amini

    What deep learning actually is — universal approximation, the perceptron, backpropagation, gradient descent, why scale matters.

  2. Ava Amini

    RNNs, LSTMs, the vanishing-gradient problem, and the transition to attention-based models.

  3. Alexander Amini

    Convolutional networks, feature maps, classical architectures, transfer learning, and modern hybrids with vision transformers.

  4. Ava Amini

    Autoencoders, variational autoencoders, GANs, and diffusion models. Likelihood-based vs implicit generative families.

  5. Alexander Amini

    Markov decision processes, Q-learning, policy gradients, deep RL. Application to game playing and robotics.

  6. Ava Amini

    From n-grams to transformers to GPT-4. Pretraining, fine-tuning, RLHF, in-context learning, hallucination.

  7. Ava Amini

    Adversarial examples, calibration, uncertainty quantification, bias, the practical limits of current models.

  8. Ava Amini

    AlphaFold and ESM. Sequence and structure models for proteins. Why biology was a natural fit for transformers.
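
Code sketches

To make the syllabus concrete, each core topic gets a minimal, runnable sketch below. These are illustrative toys written for this page, not the course labs: every model, shape, and constant is made up for the example. They use PyTorch, one of the two frameworks the labs are built on.

For lecture 1, a single sigmoid neuron (a perceptron, up to the choice of activation) trained on the OR function with plain gradient descent via backpropagation:

    import torch

    X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = torch.tensor([[0.], [1.], [1.], [1.]])        # OR labels

    w = torch.zeros(2, 1, requires_grad=True)         # weights
    b = torch.zeros(1, requires_grad=True)            # bias

    for step in range(500):
        p = torch.sigmoid(X @ w + b)                  # forward pass
        loss = torch.nn.functional.binary_cross_entropy(p, y)
        loss.backward()                               # backpropagation
        with torch.no_grad():                         # one gradient-descent step
            w -= 0.5 * w.grad
            b -= 0.5 * b.grad
            w.grad.zero_()
            b.grad.zero_()

    print(p.detach().round().squeeze())               # tensor([0., 1., 1., 1.])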
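
For lecture 2, the vanishing-gradient problem in one picture: unroll a toy recurrent chain h_t = tanh(w · h_{t-1}) for 50 steps and measure how much gradient survives back to the first input. The 50 steps and w = 0.5 are arbitrary choices for the demonstration:

    import torch

    T = 50                                    # time steps to unroll
    w = torch.tensor(0.5)                     # recurrent weight
    x0 = torch.tensor(1.0, requires_grad=True)

    h = x0
    for _ in range(T):
        h = torch.tanh(w * h)                 # h_t = tanh(w * h_{t-1})

    h.backward()                              # backprop through all 50 steps
    print(x0.grad)                            # on the order of 1e-15: almost no signal left

Each step multiplies the gradient by roughly w · (1 − tanh²) < 0.5, so fifty steps shrink it by a factor of about 2⁻⁵⁰. Gated architectures such as the LSTM exist to keep this product from collapsing.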
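
For lecture 3, why convolution wins on images: a conv layer shares one small kernel across every spatial position, so its parameter count is independent of image size. The 224 × 224 input and the channel counts are arbitrary:

    import torch
    import torch.nn as nn

    img = torch.randn(1, 3, 224, 224)                 # batch, RGB channels, height, width
    conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
    fc = nn.Linear(3 * 224 * 224, 16)                 # fully-connected comparison

    print(conv(img).shape)                            # torch.Size([1, 16, 224, 224]) feature maps
    print(sum(p.numel() for p in conv.parameters()))  # 448 parameters
    print(sum(p.numel() for p in fc.parameters()))    # 2,408,464 parameters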
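
For lecture 4, the reparameterisation trick that separates a variational autoencoder from a plain one: the encoder outputs a distribution (a mean and a log-variance) rather than a point, and sampling is rewritten so gradients can flow through it. The tensors standing in for encoder outputs are random:

    import torch

    mu = torch.randn(4, 8, requires_grad=True)        # stand-in encoder means
    logvar = torch.randn(4, 8, requires_grad=True)    # stand-in encoder log-variances

    eps = torch.randn_like(mu)                        # noise from N(0, I)
    z = mu + torch.exp(0.5 * logvar) * eps            # z ~ N(mu, sigma^2), differentiable

    recon = z.pow(2).mean()                           # stand-in for a reconstruction loss
    # KL divergence from N(mu, sigma^2) to the N(0, I) prior, per example
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)

    (recon + kl.mean()).backward()                    # gradients reach mu and logvar
    print(mu.grad.shape)                              # torch.Size([4, 8])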
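
For lecture 5, one tabular Q-learning step: after observing a transition (s, a, r, s′), Q(s, a) is nudged toward the bootstrapped target r + γ · max over a′ of Q(s′, a′). The state and action counts, α, and γ are arbitrary here:

    import numpy as np

    n_states, n_actions = 5, 2
    Q = np.zeros((n_states, n_actions))               # tabular Q-function
    alpha, gamma = 0.1, 0.99                          # learning rate, discount factor

    def q_update(s, a, r, s_next):
        """One Bellman update after observing transition (s, a, r, s_next)."""
        target = r + gamma * Q[s_next].max()          # bootstrapped return estimate
        Q[s, a] += alpha * (target - Q[s, a])         # move Q(s, a) toward the target

    q_update(s=0, a=1, r=1.0, s_next=2)
    print(Q[0])                                       # [0.  0.1]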
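
For lecture 6, the single operation the transformer material builds on: scaled dot-product attention, softmax(QKᵀ / √d) · V. The token count and embedding width are arbitrary:

    import torch
    import torch.nn.functional as F

    def attention(q, k, v):
        """softmax(Q K^T / sqrt(d)) V for inputs of shape (seq_len, d)."""
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5   # pairwise similarities
        return F.softmax(scores, dim=-1) @ v          # weighted mix of the values

    x = torch.randn(10, 64)                           # 10 tokens, 64-dim embeddings
    out = attention(x, x, x)                          # self-attention: Q = K = V = x
    print(out.shape)                                  # torch.Size([10, 64])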
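
For lecture 7, the simplest adversarial attack, the fast gradient sign method: nudge every pixel by ε in the direction that increases the loss. The one-layer classifier and the random "image" are stand-ins:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # stand-in classifier
    x = torch.rand(1, 1, 28, 28, requires_grad=True)              # stand-in image in [0, 1]
    y = torch.tensor([3])                                         # its true label

    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()                                               # gradient of loss w.r.t. pixels

    eps = 0.1                                                     # perturbation budget
    x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()        # adversarial example
    print((x_adv - x.detach()).abs().max())                       # <= eps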

Self-assessment

A short multiple-choice quiz. Click an option to commit it; the correct answer and an explanation then appear. Your answers are remembered in this browser.

  1. A standard feed-forward neural network with no hidden layers, sigmoid output, and cross-entropy loss is mathematically equivalent to:

  2. The vanishing-gradient problem in deep networks is essentially:

  3. A convolutional layer is preferred over a fully-connected layer for images mainly because:

  4. A variational autoencoder differs from a plain autoencoder in that it:

  5. In Q-learning, the Bellman update for $Q(s, a)$ after observing transition $(s, a, r, s')$ is:

  6. In-context learning — a large language model solving a new task from a prompt with a few examples — is best described as:

This site is currently in Beta. Contact: Chris Paton

Textbook of Usability · Textbook of Digital Health

Auckland Maths and Science Tutoring

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).