Machine learning is the science of fitting predictive functions to data. Mitchell's framing of a task $T$, performance measure $P$, and experience $E$ makes any project concrete.
There are three classical paradigms (supervised, unsupervised, and reinforcement learning), plus self-supervised learning, the engine of foundation models.
The supervised setup consists of data, a hypothesis class, a loss, and a notion of risk. Empirical risk minimisation (ERM) is the canonical learning principle.
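In standard notation (the symbols here are my own shorthand, since the summary fixes none), ERM selects the hypothesis in the class $\mathcal{F}$ that minimises the average loss over the $n$ training pairs:

$$
\hat{f} \;=\; \arg\min_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i),\, y_i\big).
$$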
Generalisation is the central problem. The bias–variance decomposition explains the U-shaped test-error curve. Capacity trades bias against variance.
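For squared error, the decomposition behind that curve takes the standard form (stated here for reference, with $f$ the true function and $\sigma^2$ the noise variance):

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big] \;=\; \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} \;+\; \underbrace{\operatorname{Var}\big[\hat{f}(x)\big]}_{\text{variance}} \;+\; \underbrace{\sigma^2}_{\text{irreducible noise}}.
$$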
Regularisation (L1, L2, dropout, early stopping, augmentation) restricts the hypothesis class to favour solutions that generalise.
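A minimal sketch of the L2 case is ridge regression, where the penalty has a closed-form effect on the weights; the toy data and penalty strength below are illustrative assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))          # 50 samples, 20 features
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.0, 0.5]          # only a few features actually matter
y = X @ w_true + rng.normal(scale=0.5, size=50)

lam = 1.0                              # L2 penalty strength (assumed value)
# Ridge solution: w = (X^T X + lam * I)^{-1} X^T y
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(20), X.T @ y)
w_ols = np.linalg.solve(X.T @ X, X.T @ y)

print("norm of OLS weights:  ", np.linalg.norm(w_ols))
print("norm of ridge weights:", np.linalg.norm(w_ridge))  # shrunk toward zero
```

The penalty shrinks the weight vector toward zero, which is one concrete way a regulariser biases the search toward simpler solutions.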
No Free Lunch: no learner is universally best. The art is matching the inductive bias to the problem.
Cross-validation, properly nested, gives an honest estimate of generalisation. Random search, Bayesian optimisation, and Hyperband make hyperparameter tuning efficient.
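A nested cross-validation sketch with scikit-learn, where the inner loop tunes a hyperparameter and the outer loop scores the whole tuning procedure; the dataset and the grid over $C$ are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

inner = KFold(n_splits=3, shuffle=True, random_state=0)   # tunes C
outer = KFold(n_splits=5, shuffle=True, random_state=0)   # estimates generalisation

search = GridSearchCV(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    param_grid={"logisticregression__C": [0.01, 0.1, 1.0, 10.0]},
    cv=inner,
)
# The outer folds never see the hyperparameter selection done in the inner loop,
# so the resulting scores are an honest estimate of generalisation.
scores = cross_val_score(search, X, y, cv=outer)
print(f"nested CV accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
```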
The curse of dimensionality is real but tamed by the manifold hypothesis: real data lives on low-dimensional manifolds.
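One quick numerical illustration of the curse (a self-contained demo, not an argument from the text): as the dimension grows, distances from a random point to its nearest and farthest neighbours become nearly equal.

```python
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    points = rng.uniform(size=(500, d))                      # 500 random points in [0,1]^d
    dists = np.linalg.norm(points[1:] - points[0], axis=1)   # distances from point 0
    print(f"d={d:5d}  max/min distance ratio: {dists.max() / dists.min():.2f}")
```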
Modern deep networks live in the overparameterised regime, where double descent and the implicit bias of SGD give surprisingly good generalisation.
Metrics (accuracy, precision, recall, F1, AUC-ROC, AUC-PR, Brier, ECE) must be chosen for the deployment context. Calibration matters whenever probabilities feed into decisions.
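A small sketch of the split between threshold-based and probability-based metrics; the toy labels and probabilities are illustrative assumptions, and the ECE helper is a simple binned variant written here, not a scikit-learn function.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, average_precision_score,
                             brier_score_loss, f1_score, roc_auc_score)

y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0, 1, 0])
p_pred = np.array([0.1, 0.3, 0.4, 0.6, 0.8, 0.9, 0.55, 0.35, 0.7, 0.2])
y_pred = (p_pred >= 0.5).astype(int)   # threshold-based metrics need hard labels

print("accuracy :", accuracy_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))
print("AUC-ROC  :", roc_auc_score(y_true, p_pred))
print("AUC-PR   :", average_precision_score(y_true, p_pred))
print("Brier    :", brier_score_loss(y_true, p_pred))

def expected_calibration_error(y, p, n_bins=5):
    """Binned ECE: |empirical positive rate - mean predicted probability|, weighted by bin size."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (p >= lo) & (p <= hi) if lo == 0 else (p > lo) & (p <= hi)
        if mask.any():
            ece += mask.mean() * abs(y[mask].mean() - p[mask].mean())
    return ece

print("ECE      :", expected_calibration_error(y_true, p_pred))
```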
Imbalanced classes are handled with reweighting, resampling, or threshold tuning; re-calibrate after any of them.
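A sketch of two of those options, reweighting via `class_weight` and decision-threshold tuning; the synthetic 5%-positive dataset is an illustrative assumption, and in practice the threshold would be tuned on a validation split rather than the final test set.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y, random_state=0)

# Reweighting: up-weight the minority class inside the loss.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X_tr, y_tr)

# Threshold tuning: sweep thresholds on held-out probabilities instead of 0.5.
p = clf.predict_proba(X_val)[:, 1]
thresholds = np.linspace(0.05, 0.95, 19)
best = max(thresholds, key=lambda t: f1_score(y_val, p >= t))
print(f"best threshold by F1: {best:.2f}, F1 = {f1_score(y_val, p >= best):.3f}")
```

Note that reweighting distorts the predicted probabilities, which is exactly why the re-calibration step above matters before those probabilities feed into decisions.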
Honest evaluation (single-use test sets, contamination analysis, multiple benchmarks, confidence intervals) is the practice that separates real progress from noise.
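As one concrete way to attach uncertainty to a benchmark number, a bootstrap confidence interval over test-set predictions; the simulated predictions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
y_pred = np.where(rng.random(200) < 0.85, y_true, 1 - y_true)  # ~85% accurate classifier

boot = []
for _ in range(2000):
    idx = rng.integers(0, len(y_true), size=len(y_true))       # resample with replacement
    boot.append((y_true[idx] == y_pred[idx]).mean())
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"accuracy {np.mean(y_true == y_pred):.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```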