# Low-rank matrices

### Assumed audience: Here is a useful form that that some matrix might possess: $\mathrm{K}= \mathrm{Z} \mathrm{Z}^{\dagger}$ where $$\mathrm{K}\in\mathbb{R}^{N\times N}$$, $$\mathrm{Z}\in\mathbb{R}^{N\times D}$$ with $$D\ll N$$. Such matrices clearly Hermitian, and arise in, e.g. covariance estimation. I write $$\mathrm{Z}^{\dagger}$$ for the conjugate transpose of $$\mathrm{Z}$$ and use that here because sometimes I want to think of $$\mathrm{Z}$$ as a complex matrix, and last I checked, most stuff here generalised to that case easily under conjugate transposition. YMMV.

We call matrices of this form low-rank. Their cousins, the low-rank-plus-diagonal matrices, are also useful.

Here are some minor results about them that I needed to write down somewhere.

## Pseudo-inverses

Since $$D\ll N$$ the matrices in question here are trivially singular and so have no inverses. But they might have a pseudo inverse or some kind of generalised inverse. Is that low-rank? It can be.

Consider the Moore-Penrose pseudo inverse of $$\mathrm{K}$$ which we write $$\mathrm{K}^+$$. The famous way of constructing it for general $$\mathrm{K}$$ is by taking a SVD of $$\mathrm{K}=\mathrm{U}_{\mathrm{K}}\mathrm{S}_{\mathrm{K}}\mathrm{V}_{\mathrm{K}}^{\dagger}$$ where $$\mathrm{U}_{\mathrm{K}}$$ and $$\mathrm{V}_{\mathrm{K}}$$ are unitary and $$\mathrm{S}_{\mathrm{K}}$$ is diagonal. Then we define $$\mathrm{S}_{\mathrm{K}}^+$$ to be the pseudo-inverse of the diagonal matrix of singular values, which we construct by setting all non-zero entries to their own reciprocal, but otherwise leave at 0. We have sneakily decided that pseudo-inverse of a diagonal matrix is easy; we just take the reciprocal of the non-zero entries. This turns out to do the right thing, if you check it, and it does not even sound crazy, but also it is not, to me at least, totally obvious. Next, the pseudo-inverse of the whole thing is $$\mathrm{K}^+=\mathrm{V}_{\mathrm{K}}\mathrm{S}_{\mathrm{K}}^+\mathrm{U}_{\mathrm{K}}^{\dagger}$$, we claim. If we check the object we create by this procedure, we discover that it satisfies the right properties.

Can we find a constructive form for a low-rank matrix? Let’s try taking the SVD of the associated low rank factors and see what happens. Let $$\mathrm{Z}=\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}\mathrm{V}_{\mathrm{Z}}^{\dagger}$$ be the SVD of $$\mathrm{Z}$$ so that $$\mathrm{U}_{\mathrm{Z}}$$ and $$\mathrm{V}_{\mathrm{Z}}$$ are unitary and $$\mathrm{S}_{\mathrm{Z}}$$ is diagonal. Then $$\mathrm{K}=\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{V}_{\mathrm{Z}}^{\dagger}\mathrm{V}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger}=\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger}$$.

Next, the pseudo-inverse of $$\mathrm{Z}^+=\mathrm{V}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger}$$ (the SVD construction).

Checking the matrix cookbook , we see that \begin{aligned} (\mathrm{Z}\mathrm{Z}^\dagger)^+ &=\mathrm{Z}^{+\dagger}\mathrm{Z}^{+} \end{aligned} so \begin{aligned} \mathrm{Z}^{+^\dagger}\mathrm{Z}^+ &=(\mathrm{V}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger})^{\dagger}(\mathrm{V}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger})\\ &=\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{V}_{\mathrm{Z}}^{\dagger}\mathrm{V}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger}\\ &=\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger} \end{aligned} It looks like we should be taking $$\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+$$ to be the low-rank factor of the pseudo-inverse of $$\mathrm{Z}$$, right?

We might want to check that the desired pseudo-inverse properties hold. Recall: The Moore-Penrose pseudo inverse of a matrix $$\mathrm{K}$$ is the matrix $$\mathrm{K}^{+}$$ that fulfils

1. $$\mathrm{K}\mathrm{K}^{+} \mathrm{K}=\mathrm{K}$$
2. $$\mathrm{K}^{+} \mathrm{K} \mathrm{K}^{+}=\mathrm{K}^{+}$$
3. $$\mathrm{K}\mathrm{K}^{+}$$ symmetric
4. $$\mathrm{K}^{+} \mathrm{K}$$ symmetric

The last two are immediate. We might want to check the first two. Let us consider the first one, by way of example. Trying to calculate its properties by iterating the various pseudo-inverse rules is tedious, so let us consider what happens if we use the constructive form for the pseudoinverse, $$\mathrm{K}= \mathrm{Z} \mathrm{Z}^{\dagger}$$. We can deduce more: \begin{aligned} \mathrm{K}\mathrm{K}^{+} \mathrm{K} &=\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}\mathrm{U}_{\mathrm{Z}}^{\dagger} \mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger} \mathrm{K}\\ &=\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}} \mathrm{S}_{\mathrm{Z}}^+\mathrm{S}_{\mathrm{Z}}^+\mathrm{U}_{\mathrm{Z}}^{\dagger} \mathrm{K}\\ &=\mathrm{U}_{\mathrm{Z}}\mathrm{I}_{\mathrm{Z}}\mathrm{U}_{\mathrm{Z}}^{\dagger} \mathrm{K}\\ &=\mathrm{K}. \end{aligned} Here $$\mathrm{I}_{\mathrm{Z}}$$ was a diagonal matrix with only as many non-zero entries as $$\mathrm{Z}$$ had column rank, which ends up not poisoning the equality, I think, since $$\mathrm{K}$$ had only that much rank anyway.

Anyway, this sloppy reasoning should encourage us to believe we have done nothing too silly here. I presume a proof for property 2 would be similar, but I have not actually done it. (Homework problem).

Is this pseudo-inverse low rank, though? Looks like it. In particular, we know that $$\mathrm{Z}$$ had only $$N$$ columns, and so $$\mathrm{S}_{\mathrm{Z}}$$ has at most $$N$$ non-zero entries. So we can take the thin SVD and know that that it will only preserve at most $$N$$ columns of $$\mathrm{S}_{\mathrm{Z}}$$, which is to say that we may as well take $$\mathrm{S}_{\mathrm{Z}}^+$$ to be $$N\times N$$, $$\mathrm{U}_{\mathrm{Z}}$$ to be $$D\times N$$ and the low-rank factor $$\mathrm{U}_{\mathrm{Z}}\mathrm{S}_{\mathrm{Z}}^+$$ to be $$D\times N$$.

Bonus detail: The SVD is not necessarily unique, even the reduced SVD, if there are singular values that are repeated. I think that for my purposes this is OK to ignore, but noting it here in anticipation of weird failure modes in the future.

tl;dr: pseudo-inverses of low-rank matrices are pseudo-inverses of the low rank factors, and behave more or less as I would expect. Probably.

## Distances

### Frobenius

Suppose we want to measure the Frobenius distance between $$\mathrm{U}\mathrm{U}^{\dagger}$$ and $$\mathrm{R}\mathrm{R}^{\dagger}$$. We recall that we might expect things to be nice if they are exactly low rank because, e.g. \begin{aligned} \|\mathrm{U}\mathrm{U}^{\dagger}\|_F^2 =\operatorname{tr}\left(\mathrm{U}\mathrm{U}^{\dagger}\mathrm{U}\mathrm{U}^{\dagger}\right) =\|\mathrm{U}^{\dagger}\mathrm{U}\|_F^2. \end{aligned} Indeed things are nice, and the answer may be found without forming the full matrices: \begin{aligned} &\|\mathrm{U}\mathrm{U}^{\dagger}-\mathrm{R}\mathrm{R}^{\dagger}\|_F^2\\ &=\left\|\mathrm{U}\mathrm{U}^{\dagger} -\mathrm{R}\mathrm{R}^{\dagger} \right\|_{F}^2\\ &=\left\|\mathrm{U}\mathrm{U}^{\dagger}+i\mathrm{R}i\mathrm{R}^{\dagger} \right\|_{F}^2\\ &=\left\|\begin{bmatrix} \mathrm{U} &i\mathrm{R}\end{bmatrix}\begin{bmatrix} \mathrm{U} &i\mathrm{R}\end{bmatrix}^{\dagger} \right\|_{F}^2\\ &=\operatorname{Tr}\left(\begin{bmatrix} \mathrm{U} &i\mathrm{R}\end{bmatrix}\begin{bmatrix} \mathrm{U} &i\mathrm{R}\end{bmatrix}^{\dagger}\begin{bmatrix} \mathrm{U} &i\mathrm{R}\end{bmatrix}\begin{bmatrix} \mathrm{U} &i\mathrm{R}\end{bmatrix}^{\dagger}\right)\\ &=\operatorname{Tr}\left(\left(\mathrm{U}\mathrm{U}^{\dagger} -\mathrm{R}\mathrm{R}^{\dagger}\right)\left(\mathrm{U}\mathrm{U}^{\dagger} -\mathrm{R}\mathrm{R}^{\dagger}\right)\right)\\ &=\operatorname{Tr}\left(\mathrm{U}\mathrm{U}^{\dagger}\mathrm{U}\mathrm{U}^{\dagger}\right) -2\operatorname{Tr}\left(\mathrm{U}\mathrm{U}^{\dagger}\mathrm{R}\mathrm{R}^{\dagger}\right) + \operatorname{Tr}\left(\mathrm{R}\mathrm{R}^{\dagger}\mathrm{R}\mathrm{R}^{\dagger}\right)\\ &=\left\|\mathrm{U}^{\dagger}\mathrm{U}\right\|^2_F -2\left\|\mathrm{U}^{\dagger}\mathrm{R}\right\|^2_F + \left\|\mathrm{R}^{\dagger}\mathrm{R}\right\|^2_F. \end{aligned}

### No comments yet. Why not leave one?

GitHub-flavored Markdown & a sane subset of HTML is supported.