Explainable AI, Glossary, Textbook of AI

Also known as: XAI, interpretable AI

Explainable AI (XAI) is concerned with making the internal workings and outputs of machine learning models comprehensible to human stakeholders. The need arises from a fundamental tension: the models that are often the most accurate (deep neural networks, large ensembles, transformers) are also the most opaque. A linear regression with ten coefficients is directly inspectable; a neural network with hundreds of millions of parameters is not. When such a model denies a loan, flags a patient for disease, or recommends parole, affected individuals and decision-makers have a legitimate interest in understanding why.

Approaches divide into intrinsic and post-hoc explainability. Intrinsically interpretable models (decision trees, rule lists, generalised additive models, sparse linear classifiers) are designed so that their logic is directly readable. Cynthia Rudin argues that in high-stakes domains one should prefer interpretable models outright rather than explaining opaque ones after the fact. Post-hoc methods generate explanations for a black-box model that has already been trained. LIME perturbs the input around a single point and fits a simple surrogate model that approximates the black box locally. SHAP uses Shapley values from cooperative game theory to assign each feature its marginal contribution to a prediction, averaged over all possible coalitions of the other features. Both can be applied in a model-agnostic way, since they need only the model's inputs and outputs.
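The Shapley-value idea behind SHAP can be illustrated with a brute-force computation on a tiny model. This is a minimal sketch, not the SHAP library's algorithm: the names `shapley_values` and `toy_model` are hypothetical, and it assumes that an "absent" feature is represented by substituting a fixed baseline value. Exact enumeration is exponential in the number of features, which is why practical tools rely on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attributions for one input x (illustrative sketch).

    Assumption: a feature outside the coalition takes its baseline value.
    """
    n = len(x)

    def value(coalition):
        # Features in the coalition keep their real value; others use the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = set(coalition)
                # Marginal contribution of feature i when added to coalition s.
                phi += weight * (value(s | {i}) - value(s))
        attributions.append(phi)
    return attributions


def toy_model(z):
    # Hypothetical model with an interaction between features 0 and 2.
    return 2.0 * z[0] + 1.0 * z[1] + 3.0 * z[0] * z[2]


x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)  # approximately [3.5, 2.0, 1.5]
```

The attributions sum to the difference between the model's output at `x` and at the baseline, and the interaction term is split equally between the two features involved.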

For deep networks, saliency maps compute the gradient of the output with respect to each input feature, highlighting influential pixels. Integrated Gradients refines this by accumulating gradients along a straight-line path from a baseline input to the actual input, so that the attributions sum to the difference between the two outputs. Attention visualisation shows which tokens a transformer attends to, though attention weights do not necessarily reflect causal contribution. Concept-based explanations such as TCAV (Testing with Concept Activation Vectors) operate at a higher level of abstraction, measuring a model's sensitivity to human-defined concepts rather than to individual input features. Despite progress, XAI remains a contested field: explanations can be misleading, user studies show they sometimes increase trust without improving decisions, and the precise meaning of a "good explanation" depends heavily on the user and context.
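The difference between a plain gradient saliency and Integrated Gradients can be sketched on a small differentiable model. The example below is a simplified illustration under stated assumptions: the model is a hypothetical logistic unit with hand-picked weights (`WEIGHTS`, `BIAS`), its gradient is written analytically rather than obtained from an autodiff framework, the baseline is an all-zeros input, and the path integral is approximated with a midpoint Riemann sum.

```python
import math

# Hypothetical model parameters, chosen only for illustration.
WEIGHTS = [1.5, -2.0, 0.5]
BIAS = 0.1


def model(x):
    """Logistic unit: sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytic gradient of the model output with respect to each input."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def saliency(x):
    """Plain gradient saliency: magnitude of the gradient at x itself."""
    return [abs(g) for g in gradient(x)]


def integrated_gradients(x, baseline, steps=200):
    """Average the gradient along the straight path from baseline to x,
    then scale by (x - baseline). Midpoint Riemann approximation."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


x = [2.0, 1.0, -1.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)

# Completeness check: attributions should sum to model(x) - model(baseline).
completeness_gap = abs(sum(attributions) - (model(x) - model(baseline)))
print(saliency(x), attributions, completeness_gap)
```

Unlike the raw saliency values, the Integrated Gradients attributions carry a sign and, up to numerical error, add up to the change in output relative to the baseline. The choice of baseline is itself an assumption that affects the result.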

Discussed in:

Chapter 16: Ethics & Safety, Explainable AI
