Linear algebra

If the thing is twice as big, the transformed version of the thing is also twice as big.
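That one-liner is just the homogeneity half of linearity, in case it looks too glib. A two-line numpy check:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))   # an arbitrary linear map
x = rng.normal(size=3)

# Homogeneity: doubling the input doubles the output, exactly.
assert np.allclose(A @ (2 * x), 2 * (A @ x))
```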

Oh! the hours I put into studying the taxonomy and husbandry of matrices. Time has passed. I have forgotten much. Jacobians have begun to seem downright Old Testament.

And when you put the various operations of matrix calculus into the mix (derivative of trace of a skew-Hermitian heffalump painted with a camel-hair brush) the combinatorial explosion of theorems and identities is intimidating.

Things I need:

Basic linear algebra intros

  • Jacob Ström, Kalle Åström, and Tomas Akenine-Möller, Immersive Math, “The world’s first linear algebra book with fully interactive figures.”

  • Kevin Brown on Bras, Kets, and Matrices

  • Stanford CS229’s Linear Algebra Review and reference (PDF)

  • Fun: Tom Leinster, There are no non-trivial complex quarter turns but there are real ones, i.e.

    for a linear operator T on a real inner product space,

    \[ \langle T x, x \rangle = 0 \,\, \forall x \,\, \iff \,\, T^\ast = -T \]

    whereas for an operator on a complex inner product space,

    \[ \langle T x, x \rangle = 0 \,\, \forall x \,\, \iff \,\, T = 0. \]


  • Sheldon Axler’s Down with Determinants! (Axler 1995) is a readable and intuitive introduction for undergrads:

    Without using determinants, we will define the multiplicity of an eigenvalue and prove that the number of eigenvalues, counting multiplicities, equals the dimension of the underlying space. Without determinants, we’ll define the characteristic and minimal polynomials and then prove that they behave as expected. Next, we will easily prove that every matrix is similar to a nice upper-triangular one. Turning to inner product spaces, and still without mentioning determinants, we’ll have a simple proof of the finite-dimensional Spectral Theorem.

    Determinants are needed in one place in the undergraduate mathematics curriculum: the change of variables formula for multi-variable integrals. Thus at the end of this paper we’ll revive determinants, but not with any of the usual abstruse definitions. We’ll define the determinant of a matrix to be the product of its eigenvalues (counting multiplicities). This easy-to-remember definition leads to the usual formulas for computing determinants. We’ll derive the change of variables formula for multi-variable integrals in a fashion that makes the appearance of the determinant there seem natural.

    He wrote a whole textbook on this basis (Axler 2014).

  • a handy glossary is Mike Brookes’ Matrix Reference Manual

  • the Singular Value Decomposition series, for this insight:

    Most of the time when people talk about linear algebra (even mathematicians), they’ll stick entirely to the linear map perspective or the data perspective, which is kind of frustrating when you’re learning it for the first time. It seems like the data perspective is just a tidy convenience, that it just “makes sense” to put some data in a table. In my experience the singular value decomposition is the first time that the two perspectives collide, and (at least in my case) it comes with cognitive dissonance.

  • Nick Higham presents the Moore–Penrose pseudoinverse as one member of a family of pseudoinverses
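Leinster’s real-versus-complex quarter-turn fact is easy to poke at numerically. A minimal numpy sketch of the real half: the 90° rotation of the plane is skew-symmetric, and \( \langle Tx, x \rangle \) vanishes identically.

```python
import numpy as np

# A real "quarter turn": rotation by 90 degrees in the plane.
T = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# T is skew-symmetric: T^T = -T ...
assert np.allclose(T.T, -T)

# ... and <Tx, x> = 0 for every x (checked on a batch of random vectors).
rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.normal(size=2)
    assert abs((T @ x) @ x) < 1e-12
```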
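Axler’s eigenvalue definition of the determinant is also checkable in a couple of lines. A sketch, assuming numpy: the product of the (generally complex) eigenvalues of a random real matrix matches the usual determinant.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))

# Axler's definition: det(A) is the product of the eigenvalues,
# counted with multiplicity (complex conjugate pairs included).
eigvals = np.linalg.eigvals(A)
prod = np.prod(eigvals)
assert np.allclose(prod.real, np.linalg.det(A))
assert abs(prod.imag) < 1e-10   # a real matrix has a real determinant
```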
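On that SVD collision of perspectives: the same decomposition that expresses the map as scaling along orthonormal axes also exposes the low-rank structure of a data table. A numpy sketch with a synthetic rank-5 “data matrix”:

```python
import numpy as np

rng = np.random.default_rng(2)
# "Data perspective": 100 observations of 20 features, secretly rank 5.
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 20))

# "Linear map perspective": X = U diag(s) V^T, scaling s along
# orthonormal input/output axes.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keeping the top k singular values gives the best rank-k approximation;
# here k = 5 recovers X exactly (up to floating point).
k = 5
Xk = (U[:, :k] * s[:k]) @ Vt[:k]
assert np.allclose(X, Xk)
assert np.all(s[k:] < 1e-8 * s[0])  # the rest of the spectrum is noise
```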

Linear algebra and calculus

The multidimensional statistics/control theory workhorse.

See matrix calculus.

Multilinear algebra

Oooh, you are playing with tensors? I don’t have much to say, but here is a compact explanation of Einstein summation, which turns out to be as simple as it needs to be, but no simpler.
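For concreteness, numpy’s einsum speaks exactly this notation: repeated indices are summed over, and everything from matrix products to quadratic forms falls out of the index string.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 5))
B = rng.normal(size=(5, 6))
M = rng.normal(size=(4, 4))
v = rng.normal(size=4)

# C_ik = sum_j A_ij B_jk  -- the ordinary matrix product.
assert np.allclose(np.einsum("ij,jk->ik", A, B), A @ B)

# q = sum_i sum_j v_i M_ij v_j  -- a quadratic form, one index string.
assert np.allclose(np.einsum("i,ij,j->", v, M, v), v @ M @ v)
```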

Fun tricks

John Cook on Sam Walter’s theorem on convex functions of eigenvalues and diagonals.
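As I recall the statement, it is a cousin of Schur majorization: the diagonal of a symmetric matrix is majorized by its eigenvalues, so for any convex \( f \), \( \sum_i f(a_{ii}) \le \sum_i f(\lambda_i) \). A quick numerical sanity check (my paraphrase of the result, not Cook’s code):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(6, 6))
A = (B + B.T) / 2            # a random symmetric matrix

d = np.diag(A)               # diagonal entries
lam = np.linalg.eigvalsh(A)  # eigenvalues

# The diagonal is majorized by the eigenvalues, so for convex f,
# sum f(diagonal) <= sum f(eigenvalues). Try f(x) = x**2:
assert np.sum(d**2) <= np.sum(lam**2) + 1e-12
# ... and f(x) = exp(x):
assert np.sum(np.exp(d)) <= np.sum(np.exp(lam)) + 1e-9
```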

Fun with determinants



Axler, Sheldon. 1995. “Down with Determinants!” The American Mathematical Monthly 102 (2): 139–54.
———. 2014. Linear Algebra Done Right. New York: Springer.
Bamieh, Bassam. 2022. “A Tutorial on Matrix Perturbation Theory (Using Compact Matrix Notation).” arXiv.
Charlier, Benjamin, Jean Feydy, Joan Alexis Glaunès, François-David Collin, and Ghislain Durif. 2021. “Kernel Operations on the GPU, with Autodiff, Without Memory Overflows.” Journal of Machine Learning Research 22 (74): 1–6.
Dahleh, Mohammed, Munther A. Dahleh, and George Verghese. 1990. “Matrix Perturbations.” In Lectures on Dynamic Systems and Control, 20:91.
Dwyer, Paul S. 1967. “Some Applications of Matrix Derivatives in Multivariate Analysis.” Journal of the American Statistical Association 62 (318): 607.
Gallier, Jean, and Jocelyn Quaintance. 2022. Algebra, Topology, Differential Calculus, and Optimization Theory for Computer Science and Machine Learning.
Giles, M. 2008. “An Extended Collection of Matrix Derivative Results for Forward and Reverse Mode Automatic Differentiation.” January.
Giles, Mike B. 2008. “Collected Matrix Derivative Results for Forward and Reverse Mode Algorithmic Differentiation.” In Advances in Automatic Differentiation, edited by Christian H. Bischof, H. Martin Bücker, Paul Hovland, Uwe Naumann, and Jean Utke, 64:35–44. Berlin, Heidelberg: Springer Berlin Heidelberg.
Golub, Gene H., and Charles F. Van Loan. 1983. Matrix Computations. JHU Press.
Golub, Gene H., and Gérard Meurant. 2010. Matrices, Moments and Quadrature with Applications. USA: Princeton University Press.
Graham, Alexander. 1981. Kronecker Products and Matrix Calculus: With Applications. Horwood.
Higham, Nicholas J. 2008. Functions of Matrices: Theory and Computation. Philadelphia: Society for Industrial and Applied Mathematics.
Hoaglin, David C., and Roy E. Welsch. 1978. “The Hat Matrix in Regression and ANOVA.” The American Statistician 32 (1): 17–22.
Laue, Soeren, Matthias Mitterreiter, and Joachim Giesen. 2018. “Computing Higher Order Derivatives of Matrix and Tensor Expressions.” In Advances in Neural Information Processing Systems 31, edited by S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, 2750–59. Curran Associates, Inc.
Magnus, Jan R., and Heinz Neudecker. 2019. Matrix Differential Calculus with Applications in Statistics and Econometrics. 3rd ed. Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley.
Mahoney, Michael W. 2010. Randomized Algorithms for Matrices and Data. Vol. 3.
Minka, Thomas P. 2000. Old and New Matrix Algebra Useful for Statistics.
Parlett, Beresford N. 2000. “The QR Algorithm.” Computing in Science & Engineering 2 (1): 38–42.
Petersen, Kaare Brandt, and Michael Syskind Pedersen. 2012. “The Matrix Cookbook.”
Rellich, F. 1954. Perturbation Theory of Eigenvalue Problems. New York: Courant Institute of Mathematical Sciences, New York University.
Saad, Yousef. 2003. Iterative Methods for Sparse Linear Systems. 2nd ed. SIAM.
Seber, George A. F. 2007. A Matrix Handbook for Statisticians. Wiley.
Simoncini, V. 2016. “Computational Methods for Linear Matrix Equations.” SIAM Review 58 (3): 377–441.
Steeb, Willi-Hans. 2006. Problems and Solutions in Introductory and Advanced Matrix Calculus. World Scientific.
Turkington, Darrell A. 2002. Matrix Calculus and Zero-One Matrices: Statistical and Econometric Applications. Cambridge; New York: Cambridge University Press.
