2.16 Further reading
For a beautifully axiomatic treatment of finite-dimensional linear algebra, Axler's Linear Algebra Done Right postpones determinants for as long as possible and emphasises operators on inner-product spaces. Hoffman and Kunze's Linear Algebra is a more traditional and comprehensive reference. Strang's Linear Algebra and Its Applications is the classic engineering-flavoured introduction, with good chapters on least squares and the SVD. Trefethen and Bau's Numerical Linear Algebra is the canonical text on the numerical side; every machine-learning practitioner should read its chapters on conditioning, QR, and the SVD. Golub and Van Loan's Matrix Computations is the encyclopaedia of the field. For the deep-learning angle specifically, Goodfellow, Bengio, and Courville's Deep Learning devotes a chapter to linear-algebra prerequisites; Chapter 2 of that book and this one cover similar ground at similar depth. The matrix-calculus identities collected in Petersen and Pedersen's Matrix Cookbook are an indispensable reference once you start deriving custom losses and gradients.