Summary

  • Machine learning is the science of fitting predictive functions to data. Mitchell's $T$/$P$/$E$ framing makes any project concrete.
  • There are three classical paradigms (supervised, unsupervised, and reinforcement learning), plus self-supervised learning, the engine of foundation models.
  • The supervised setup consists of data, a hypothesis class, a loss, and a risk. Empirical risk minimisation is the canonical learning principle (written out after this list).
  • Generalisation is the central problem. The bias–variance decomposition (stated after this list) explains the U-shaped test-error curve. Capacity controls the trade-off between bias and variance.
  • Regularisation (L1, L2, dropout, early stopping, data augmentation) restricts the hypothesis class to favour solutions that generalise; the penalised objective is written out after this list.
  • No Free Lunch: no learner is universally best. The art is matching the inductive bias to the problem.
  • Cross-validation, properly nested, gives an honest estimate of generalisation (see the sketch after this list). Random search, Bayesian optimisation, and Hyperband make hyperparameter tuning efficient.
  • The curse of dimensionality is real, but it is tamed in practice because real data tends to lie on or near low-dimensional manifolds (the manifold hypothesis).
  • Modern deep networks live in the overparameterised regime, where double descent and the implicit bias of SGD give surprisingly good generalisation.
  • Metrics (accuracy, precision, recall, F1, AUC-ROC, AUC-PR, Brier score, ECE) must be chosen for the deployment context. Calibration matters whenever probabilities feed into decisions; a metrics sketch follows this list.
  • Imbalanced classes are handled with reweighting, resampling, or threshold tuning; re-calibrate after any of them (see the sketch after this list).
  • Honest evaluation (single-use test sets, contamination analysis, multiple benchmarks, confidence intervals) is the practice that separates real progress from noise.
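
The empirical-risk objective referred to in the supervised-setup bullet, written in standard notation; the symbols $\mathcal{H}$ (hypothesis class), $\ell$ (loss), and $\hat{R}$ (empirical risk) are conventional choices rather than anything specific to this chapter:

$$
\hat{R}(h) = \frac{1}{n}\sum_{i=1}^{n} \ell\big(h(x_i),\, y_i\big),
\qquad
\hat{h} = \arg\min_{h \in \mathcal{H}} \hat{R}(h).
$$

The population risk is $R(h) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(h(x), y)\big]$, and generalisation is about keeping $R(\hat{h})$ close to $\hat{R}(\hat{h})$.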
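
The bias–variance decomposition mentioned above, stated for squared error in its usual textbook form (notation is standard: the expectation is over training sets, $y = f(x) + \varepsilon$, and $\operatorname{Var}(\varepsilon) = \sigma^2$):

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}.
$$

Raising capacity shrinks the bias term and inflates the variance term, which is what produces the U-shaped test-error curve.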
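
The penalised objective behind the L1 and L2 entries in the regularisation bullet, in generic form (here $w$ denotes the parameters of $h$ and $\lambda$ the regularisation strength; both are standard symbols, not the chapter's):

$$
\hat{h} = \arg\min_{h \in \mathcal{H}} \; \hat{R}(h) + \lambda\,\Omega(h),
\qquad
\Omega(h) = \lVert w \rVert_1 \ \text{(L1)}
\quad \text{or} \quad
\Omega(h) = \lVert w \rVert_2^2 \ \text{(L2)}.
$$

Dropout, early stopping, and augmentation act in the same direction without an explicit penalty term.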
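
A minimal sketch of the nested cross-validation mentioned above, using scikit-learn's RandomizedSearchCV inside cross_val_score on a synthetic dataset; the estimator, parameter ranges, and fold counts are illustrative assumptions rather than recommendations:

```python
# Nested cross-validation with random search (illustrative sketch).
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Inner loop: random search picks hyperparameters using the inner folds only.
inner_search = RandomizedSearchCV(
    SVC(),
    param_distributions={"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-4, 1.0)},
    n_iter=20,
    cv=3,
    random_state=0,
)

# Outer loop: each outer fold scores a model whose hyperparameters were
# chosen without ever seeing that fold, so the estimate stays honest.
outer_scores = cross_val_score(inner_search, X, y, cv=5)
print(f"nested CV accuracy: {outer_scores.mean():.3f} +/- {outer_scores.std():.3f}")
```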
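
A minimal sketch of computing several of the listed metrics from predicted probabilities with scikit-learn; the dataset, the model, and the 0.5 threshold are illustrative assumptions, and ECE is omitted because scikit-learn has no built-in for it:

```python
# Threshold-based and threshold-free metrics for a binary classifier (sketch).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (average_precision_score, brier_score_loss,
                             f1_score, precision_score, recall_score,
                             roc_auc_score)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]   # probabilities, for ranking and calibration
pred = (proba >= 0.5).astype(int)       # hard labels; the threshold is a deployment choice

print("precision:", precision_score(y_te, pred))
print("recall:   ", recall_score(y_te, pred))
print("F1:       ", f1_score(y_te, pred))
print("AUC-ROC:  ", roc_auc_score(y_te, proba))            # ranking quality
print("AUC-PR:   ", average_precision_score(y_te, proba))  # clearer view under imbalance
print("Brier:    ", brier_score_loss(y_te, proba))         # probability quality
```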
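
A minimal sketch of the reweighting, threshold-tuning, and re-calibration levers mentioned in the imbalance bullet; the class proportions, the isotonic method, and the F1 criterion for picking the threshold are illustrative assumptions:

```python
# Reweighting, re-calibration, and threshold tuning under class imbalance (sketch).
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

# Reweighting: "balanced" upweights the minority class, which moves the
# decision boundary but distorts the predicted probabilities.
base = LogisticRegression(max_iter=1000, class_weight="balanced")

# Re-calibration: wrap the reweighted model so its probabilities are usable
# for downstream decisions again.
model = CalibratedClassifierCV(base, method="isotonic", cv=5).fit(X_tr, y_tr)
proba = model.predict_proba(X_val)[:, 1]

# Threshold tuning: choose the operating point on a validation set,
# never on the final test set.
thresholds = np.linspace(0.05, 0.95, 19)
best = max(thresholds, key=lambda t: f1_score(y_val, (proba >= t).astype(int)))
print(f"chosen threshold: {best:.2f}, validation F1: "
      f"{f1_score(y_val, (proba >= best).astype(int)):.3f}")
```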
