# Matrix inverses

August 5, 2014 — September 29, 2023

### Assumed audience:

People with undergrad linear algebra

Matrix inverses. Sometimes we need them. Worse, sometimes we need generalizations of them, inverses of singular (a.k.a. non-invertible) matrices. This page is for notes on that theme.

But first, please fill out this form.

**✅ By reading further I acknowledge that a matrix inverse is actually what I want, even though the more common case would be that I would like the solution of a linear system without forming the inverse.**

OK, done? Let us continue.

## 1 All possible generalizations of matrix inverses

Many (Rao and Mitra 1973; Ben-Israel and Greville 2003). Nick Higham’s introduction generalized inverses is an easier start.

A beautiful, compact summary is in Searle (2014):

…there are (with one exception) many matrices \(\mathrm{G}\) satisfying \[ A G A=\mathrm{A} \text {, } \] which is condition (i). Each matrix \(\mathrm{G}\) satisfying AGA = A is called a generalized inverse of A, and if it also satisfies \(\mathrm{GAG}=\mathrm{G}\) it is a reflexive generalized inverse. The exception is when \(\mathrm{A}\) is nonsingular: there is then only one \(\mathrm{G}\); namely, \(\mathrm{G}=\mathrm{A}^{-1}\). 8.2 Arbitrariness That there are many matrices \(\mathrm{G}\) can be illustrated by showing ways in which from one \(\mathrm{G}\) others can be obtained. Thus, if A is partitioned as \[ \mathrm{A}=\left[\begin{array}{ll} \mathrm{A}_{11} & \mathrm{A}_{12} \\ \mathrm{A}_{21} & \mathrm{A}_{22} \end{array}\right] \] where \(\mathrm{A}_{11}\) is nonsingular with the same rank as \(\mathrm{A}\), then \[ \mathrm{G}=\left[\begin{array}{lllll} \mathrm{A}_{11}^{-1}-\mathrm{U A}_{21} & \mathrm{A}_{11}^{-1}-\mathrm{A}_{11}^{-1} & \mathrm{A}_{12} \mathrm{V}-\mathrm{A}_{11}^{-1} \mathrm{A}_{12} \mathrm{W A}_{21} \mathrm{A}_{11}^{-1} & \mathrm{U} \\ \mathrm{V} & & \mathrm{W} \end{array}\right] \] is a generalized inverse of \(\mathrm{A}\) for any values of \(\mathrm{U}, \mathrm{V}\) and \(\mathrm{W}\). This can be used to show that a generalized inverse of a symmetric matrix is not necessarily symmetric; and that of a singular matrix is not necessarily singular.

A simpler illustration of arbitrariness is that if \(\mathrm{G}\) is a generalized inverse of \(\mathrm{A}\) then so is \[ \mathrm{G}^*=\mathrm{GAG}+(\mathrm{I}-\mathrm{GA}) \mathrm{S}+\mathrm{T}(\mathrm{I}-\mathrm{AG}), \] for any values of \(\mathrm{S}\) and \(\mathrm{T}\).

## 2 Moore-Penrose pseudo-inverse

A classic. The “default” generalized inverse.

See Nick Higham’s What Is the Pseudoinverse of a Matrix?

Let us do the conventional thing and mention which properties of the pseudo-inverse are shared by the inverse:

The pseudo-inverse (or Moore-Penrose inverse) of a matrix \(\mathrm{A}\) is the matrix \(\mathrm{A}^{+}\) that fulfils

- \(\mathrm{A A}^{+} \mathrm{A}=\mathrm{A}\)
- \(\mathrm{A}^{+} \mathrm{A} \mathrm{A}^{+}=\mathrm{A}^{+}\)
- \(\mathrm{A A}^{+}\) symmetric
- \(\mathrm{A}^{+} \mathrm{A}\) symmetric

Want more? As always, we copy-paste the omnibus results of Petersen and Pedersen (2012):

Assume \(\mathrm{A}^{+}\) to be the pseudo-inverse of \(\mathrm{A}\), then \[ \begin{aligned} \left(\mathrm{A}^{+}\right)^{+} & =\mathrm{A} \\ \left(\mathrm{A}^\top\right)^{+} & =\left(\mathrm{A}^{+}\right)^\top \\ \left(\mathrm{A}^\dagger\right)^{+} & =\left(\mathrm{A}^{+}\right)^\dagger \\ \left(\mathrm{A}^{+} \mathrm{A}\right) \mathrm{A}^\dagger & =\mathrm{A}^\dagger \\ \left(\mathrm{A}^{+} \mathrm{A}\right) \mathrm{A}^\top & \neq \mathrm{A}^\top \\ (c \mathrm{A})^{+} & =(1 / c) \mathrm{A}^{+} \\ \mathrm{A}^{+} & =\left(\mathrm{A}^\top \mathrm{A}\right)^{+} \mathrm{A}^\top \\ \mathrm{A}^{+} & =\mathrm{A}^\top\left(\mathrm{A} \mathrm{A}^\top\right)^{+} \\ \left(\mathrm{A}^\top \mathrm{A}\right)^{+} & =\mathrm{A}^{+}\left(\mathrm{A}^\top\right)^{+} \\ \left(\mathrm{A} \mathrm{A}^\top\right)^{+} & =\left(\mathrm{A}^\top\right)^{+} \mathrm{A}^{+} \\ \mathrm{A}^{+} & =\left(\mathrm{A}^\dagger \mathrm{A}\right)^{+} \mathrm{A}^\dagger \\ \mathrm{A}^{+} & =\mathrm{A}^\dagger\left(\mathrm{A} \mathrm{A}^\dagger\right)^{+} \\ \left(\mathrm{A}^\dagger \mathrm{A}\right)^{+} & =\mathrm{A}^{+}\left(\mathrm{A}^\dagger\right)^{+} \\ \left(\mathrm{A} \mathrm{A}^\dagger\right)^{+} & =\left(\mathrm{A}^\dagger\right)^{+} \mathrm{A}^{+} \\ (\mathrm{A B})^{+} & =\left(\mathrm{A}^{+} \mathrm{A B}\right)^{+}\left(\mathrm{A B B} \mathrm{B}^{+}\right)^{+} \\ f\left(\mathrm{A}^\dagger \mathrm{A}\right)-f(0) \mathrm{I} & =\mathrm{A}^{+}\left[f\left(\mathrm{A} \mathrm{A}^\dagger\right)-f(0) \mathrm{I}\right] \mathrm{A} \\ f\left(\mathrm{A} \mathrm{A}^\dagger\right)-f(0) \mathrm{I} & =\mathrm{A}^\dagger\left[\left(\mathrm{A}^\dagger \mathrm{A}\right)-f(0) \mathrm{I}\right] \mathrm{A}^{+} \end{aligned} \]

I find definition in terms of these properties totally confusing. Perhaps a better way of thinking about pseudo-inverses is via their action upon vectors. TBC

Consider the Moore-Penrose of \(\mathrm{K}\) which we write \(\mathrm{K}^+\). The famous way of constructing it is by taking a SVD of \(\mathrm{K}=\mathrm{U}_{\mathrm{K}}\mathrm{S}_{\mathrm{K}}\mathrm{V}_{\mathrm{K}}^{\dagger}\) where \(\mathrm{U}_{\mathrm{K}}\) and \(\mathrm{V}_{\mathrm{K}}\) are unitary and \(\mathrm{S}_{\mathrm{K}}\) is diagonal. Then we define \(\mathrm{S}_{\mathrm{K}}^+\) to be the pseudo-inverse of the diagonal matrix of singular values, which we construct by setting all non-zero entries to their own reciprocal, but otherwise leave at 0. We have sneakily decided that pseudo-inverse of a diagonal matrix is easy; we just take the reciprocal of the non-zero entries. This turns out to do the right thing, if you check it, and it does not even sound crazy, but also it is not, to me at least, totally obvious. Next, the pseudo-inverse of the whole thing is \(\mathrm{K}^+=\mathrm{V}_{\mathrm{K}}\mathrm{S}_{\mathrm{K}}^+\mathrm{U}_{\mathrm{K}}^{\dagger}\), we claim. If we check the object we create by this procedure, we discover that it satisfies the above properties. (Homework problem).

Meta note: In general, proving things about pseudo-inverses by the constructive solution given by the SVD is much more compact than via the algebraic properties, as well as more intuitive, at least for me.

There is a cute special-case result for low rank matrices.

## 3 (Generalized) Bott-Duffin pseudo-inverse

Many variants (Wu et al. 2023; Gao et al. 2023; Liu, Wang, and Wei 2009).

## 4 Drazin inverse

The Drazin inverse is introduced in a mediocre Wikipedia article:

Let \(\mathrm{A}\) be square matrix. The index of \(\mathrm{A}\) is the least nonnegative integer \(k\) such that \(\operatorname{rank}\left(\mathrm{A}^{k+1}\right)=\operatorname{rank}\left(\mathrm{A}^k\right)\). The Drazin inverse of \(\mathrm{A}\) is the unique matrix \(\mathrm{A}^{\mathrm{D}}\) that satisfies \[ \mathrm{A}^{k+1} \mathrm{A}^{\mathrm{D}}=\mathrm{A}^k, \quad \mathrm{A}^{\mathrm{D}} \mathrm{A} \mathrm{A}^{\mathrm{D}}=\mathrm{A}^{\mathrm{D}}, \quad \mathrm{A} \mathrm{A}^{\mathrm{D}}=\mathrm{A}^{\mathrm{D}} \mathrm{A} . \] It’s not a generalized inverse in the classical sense, since \(\mathrm{A} \mathrm{A}^{\mathrm{D}} \mathrm{A} \neq \mathrm{A}\) in general.

- If \(\mathrm{A}\) is invertible with inverse \(\mathrm{A}^{-1}\), then \(\mathrm{A}^{\mathrm{D}}=\mathrm{A}^{-1}\).
- If \(\mathrm{A}\) is a block diagonal matrix
\[ \mathrm{A}=\left[\begin{array}{cc} \mathrm{B} & 0 \\ 0 & \mathrm{N} \end{array}\right] \] where \(\mathrm{B}\) is invertible with inverse \(\mathrm{B}^{-1}\) and \(\mathrm{N}\) is a nilpotent matrix, then \[ \mathrm{A}^D=\left[\begin{array}{cc} \mathrm{B}^{-1} & 0 \\ 0 & 0 \end{array}\right] \]

- Drazin inversion is invariant under conjugation. If \(\mathrm{A}^{\mathrm{D}}\) is the Drazin inverse of \(\mathrm{A}\), then \(P \mathrm{A}^{\mathrm{D}} P^{-1}\) is the Drazin inverse of \(P \mathrm{A} P^{-1}\).
- The Drazin inverse of a matrix of index 0 or 1 is called the group inverse or \(\{1,2,5\}\)-inverse and denoted \(\mathrm{A}^{\#}\). The group inverse can be defined, equivalently, by the properties \(\mathrm{A} \mathrm{A}^{\#} \mathrm{A}=\mathrm{A}, \mathrm{A}^{\#} \mathrm{A} \mathrm{A}^{\#}=\mathrm{A}^{\#}\), and \(\mathrm{A} \mathrm{A}^{\#}=\mathrm{A}^{\#} \mathrm{A}\).
- A projection matrix \(P\), defined as a matrix such that \(P^2=P\), has index 1 (or 0) and has Drazin inverse \(P^{\mathrm{D}}=P\).
- If \(\mathrm{\mathrm{A}}\) is a nilpotent matrix (for example a shift matrix), then \(\mathrm{A}^{\mathrm{D}}=0\).

## 5 Incrementally updating

## 6 Low-rank

See low-rank matrices.

## 7 Nearly-Low-rank

## 8 References

*Applied Mathematics and Computation*.

*Generalized Inverses: Theory and Applications*.

*Transactions of the American Mathematical Society*.

*Linear Algebra and Its Applications*, Special Issue on Large Scale Linear and Nonlinear Eigenvalue Problems,.

*SIAM Journal on Scientific Computing*.

*Inverse Problems*.

*Applied Mathematics and Computation*.

*Linear and Multilinear Algebra*.

*Boundary Element Methods for Engineers and Scientists: An Introductory Course with Advanced Topics*.

*Matrices, Moments and Quadrature with Applications*.

*SIAM Review*.

*The Annals of Statistics*.

*Journal of the American Statistical Association*.

*SIAM Review*.

*Sankhyā: The Indian Journal of Statistics, Series A (1961-2002)*.

*Numerical Linear Algebra with Applications*.

*Mathematics Magazine*.

*SIAM Journal on Applied Mathematics*.

*Wiley StatsRef: Statistics Reference Online*.

*Matrix Algebra Useful for Statistics*.

*Filomat*.

*SIAM Journal on Scientific Computing*.

*SIAM Journal on Scientific Computing*.

*Journal of Mathematics*.

*Linear Algebra and Its Applications*.