Nearly-low-rank Hermitian matrices
a.k.a. perturbations of the identity, low-rank-plus-diagonal matrices
August 5, 2014 — October 28, 2024
A decomposition that some matrix might possess that ends up being practically useful: suppose we can write

$$A = \sigma^2 \mathrm{I} + \mathrm{U}\mathrm{U}^{H},\qquad \mathrm{U}\in\mathbb{C}^{N\times D},\ D\ll N.$$
After many arguments with reviewers I have discovered that you should not use this name, because to some people "nearly low rank" means that most of the eigenvalues are close to zero, and they cannot abide any other meaning of the phrase. I would parse that as meaning they think "near" means near in the nuclear norm; there is nothing particularly canonical about that interpretation, IMO, but I will use any bloody name if it saves me time arguing with reviewers. Or even a contorted acronym, how about that?
So, if you are one such person, who finds nearly low rank to be a frustration, imagine that I called them Perturbation Enhanced Diagonal Adjusted Numerical Tensors; this might help.
This matrix structure is a workhorse. We might get such low-rank decompositions from matrix factorisation or from some prior structure in the problem; to pick one example, ensemble filters have covariance matrices of this form. In many applications $D\ll N$, so exploiting the structure saves a great deal of computation and storage.
Sometimes we admit a more general version, where the diagonal is allowed to be other-than-constant, so that

$$A = \mathrm{D} + \mathrm{U}\mathrm{U}^{H},\qquad \mathrm{D}\ \text{diagonal}.$$
1 Factorisations
1.1 Eigendecomposition
Nakatsukasa (2019) observes that, for general rectangular $A\in\mathbb{C}^{m\times n}$ and $B\in\mathbb{C}^{n\times m}$:

The nonzero eigenvalues of $AB$ are equal to those of $BA$: an identity that holds as long as the products are square, even when $A,B$ are rectangular. This fact naturally suggests an efficient algorithm for computing eigenvalues and eigenvectors of a low-rank matrix $X=AB$ with $m\gg n$: form the small $n\times n$ matrix $BA$ and find its eigenvalues and eigenvectors. For nonzero eigenvalues, the eigenvectors are related by $v=Aw$ (if $BAw=\lambda w$ then $AB(Aw)=\lambda(Aw)$) with […]
Cool. Our low-rank matrix $\mathrm{U}\mathrm{U}^{H}$ is exactly of this form, so its nonzero eigenvalues are those of the small $D\times D$ matrix $\mathrm{U}^{H}\mathrm{U}$.
A classic piece of lore is cheap eigendecomposition of $\sigma^2\mathrm{I}+\mathrm{U}\mathrm{U}^{H}$: adding $\sigma^2\mathrm{I}$ shifts every eigenvalue up by $\sigma^2$ and leaves the eigenvectors unchanged.
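A minimal numpy sketch of this trick, in the real case and with my own notation: `U` is tall-skinny and `sigma2` is the diagonal perturbation, so the target matrix is $\sigma^2 I + UU^\top$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, sigma2 = 500, 4, 0.1
U = rng.standard_normal((N, D))

# Nonzero eigenvalues of U U^T equal those of the small D x D matrix U^T U.
lam, W = np.linalg.eigh(U.T @ U)   # D eigenpairs; forming U^T U costs O(N D^2)
# Map small eigenvectors back up: if (U^T U) w = lam w, then (U U^T)(U w) = lam (U w).
V = U @ W / np.sqrt(lam)           # normalise columns: ||U w||^2 = w^T U^T U w = lam
# Adding sigma^2 I shifts every eigenvalue by sigma^2, eigenvectors unchanged.
eigvals_A = lam + sigma2           # the D "interesting" eigenvalues of A
```

The remaining $N-D$ eigenvalues are all exactly $\sigma^2$, with eigenvectors anywhere in the orthogonal complement of the column space of `U`.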
1.2 Square roots
One useful trick lives at matrix square roots. Note that the operation is not necessarily closed, i.e. a square root of a nearly-low-rank matrix need not itself be low rank.
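For the symmetric square root specifically, the eigenvalue-shift trick above does give a cheap recipe; here is a sketch, again in the real case with my own names (`sigma`, `U`), for $A = \sigma^2 I + UU^\top$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, D, sigma = 300, 3, 0.5
U = rng.standard_normal((N, D))

# Orthonormal basis W of the column space of U and nonzero eigenvalues lam
# of U U^T, via the small D x D matrix U^T U.
lam, Wsmall = np.linalg.eigh(U.T @ U)
W = U @ Wsmall / np.sqrt(lam)      # N x D, orthonormal columns

# Off the column space of U, A = sigma^2 I + U U^T acts as sigma^2 times the
# identity, so its symmetric square root acts as sigma there; on the column
# space the eigenvalues become sqrt(lam + sigma^2).
sqrtA = lambda x: sigma * x + W @ ((np.sqrt(lam + sigma**2) - sigma) * (W.T @ x))
```

Applying `sqrtA` twice reproduces the action of $A$, and the square root is itself diagonal-plus-rank-$D$.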
1.3 Cholesky
Louis Tiao, in Efficient Cholesky decomposition of low-rank updates, summarises Seeger (2004).
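I won't reproduce Seeger's treatment here; as a reference point, this is the textbook $O(N^2)$ rank-one Cholesky update (the classic hyperbolic-rotation formulation, not necessarily Seeger's), which avoids the $O(N^3)$ refactorisation.

```python
import numpy as np

def chol_update(L, v):
    """Rank-1 update: given lower-triangular L with L L^T = A,
    return L' (lower triangular) with L' L'^T = A + v v^T, in O(N^2)."""
    L, v = L.copy(), v.astype(float).copy()
    n = v.size
    for k in range(n):
        r = np.hypot(L[k, k], v[k])          # new diagonal entry
        c, s = r / L[k, k], v[k] / L[k, k]   # rotation coefficients
        L[k, k] = r
        if k + 1 < n:
            L[k + 1:, k] = (L[k + 1:, k] + s * v[k + 1:]) / c
            v[k + 1:] = c * v[k + 1:] - s * L[k + 1:, k]
    return L
```

Downdates (subtracting $vv^\top$) follow the same pattern but can fail when the result is not positive definite.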
1.4 Cholesky decomposition of a harmonic mean
This one pops up from time to time too. Suppose
2 Inverting
Specifically, when solving $Ax=b$ we wish to

- exploit the computational efficiency of the low-rank structure of $A$, so that it costs less than the dense $\mathcal{O}(N^3)$ to compute $A^{-1}b$, and
- avoid forming the explicit inverse matrix $A^{-1}$, which requires storage $\mathcal{O}(N^2)$.
The workhorse tool for these is the Woodbury identity, which has many variants, but often the basic version does what we want: assuming both $\mathrm{A}$ and $\mathrm{C}$ are invertible,

$$(\mathrm{A}+\mathrm{U}\mathrm{C}\mathrm{V})^{-1} = \mathrm{A}^{-1} - \mathrm{A}^{-1}\mathrm{U}\left(\mathrm{C}^{-1}+\mathrm{V}\mathrm{A}^{-1}\mathrm{U}\right)^{-1}\mathrm{V}\mathrm{A}^{-1}.$$
There is clearly a connection to the eigendecomposition; we may also think of inversion as an operation on the eigenvalues.
2.1 Classic
I have no idea who invented this; it seems to be part of the folklore now, but also it was not in my undergraduate degree.
Assume $A=\sigma^2\mathrm{I}+\mathrm{U}\mathrm{U}^{H}$. Then Woodbury gives

$$A^{-1} = \sigma^{-2}\left(\mathrm{I} - \mathrm{U}\left(\sigma^2\mathrm{I}_D+\mathrm{U}^{H}\mathrm{U}\right)^{-1}\mathrm{U}^{H}\right),$$

which involves inverting only a $D\times D$ matrix.
The price of this efficient inversion is that pre-multiplying by the nearly-low-rank inverse is not as numerically stable as classic matrix solution methods; but for many purposes, this price is acceptable.
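A sketch of the classic solve in numpy, real case, my notation: the capacitance matrix `C` is $D\times D$, so the whole solve is $O(ND^2)$ rather than $O(N^3)$.

```python
import numpy as np

rng = np.random.default_rng(3)
N, D, sigma2 = 2000, 5, 0.3
U = rng.standard_normal((N, D))
b = rng.standard_normal(N)

# Solve (sigma^2 I + U U^T) x = b via Woodbury:
#   x = (b - U (sigma^2 I_D + U^T U)^{-1} U^T b) / sigma^2
C = sigma2 * np.eye(D) + U.T @ U                 # D x D capacitance matrix
x = (b - U @ np.linalg.solve(C, U.T @ b)) / sigma2
```

Note we never materialise the $N\times N$ matrix or its inverse; only matrix-vector products with `U` and one small dense solve.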
What else can we say now? Well, if $\mathrm{U}\mathrm{U}^{H}$ has nonzero eigenvalues $\lambda_1,\dots,\lambda_D$, then $A$ has eigenvalues $\sigma^2+\lambda_i$ on the column space of $\mathrm{U}$ and $\sigma^2$ everywhere else.

The eigenvalues of $A^{-1}$ are correspondingly $(\sigma^2+\lambda_i)^{-1}$ and $\sigma^{-2}$.

Note that the inverse is also a nearly-low-rank matrix, but with diagonal term $\sigma^{-2}\mathrm{I}$ and a low-rank correction of the same rank $D$.

In principle, then, these operations cost $\mathcal{O}(ND^2)$ time and $\mathcal{O}(ND)$ storage.
Terminology: for some reason, these small inner matrices (here $\sigma^2\mathrm{I}_D+\mathrm{U}^{H}\mathrm{U}$; in general $\mathrm{C}^{-1}+\mathrm{V}\mathrm{A}^{-1}\mathrm{U}$) are known as capacitance matrices.
If such a capacitance matrix is merely positive semidefinite, then we need to do some more work to make its square root unique. And if it is indefinite (presumably because of numerical stability problems), then we are potentially in trouble, because the square root is not real. We can still have complex-valued solutions if we want to go there.
2.2 General diagonals
OK, how about if we admit general diagonal matrices, $A=\mathrm{D}+\mathrm{U}\mathrm{U}^{H}$? Woodbury still applies; the only change is that applying $\mathrm{D}^{-1}$ is an elementwise division rather than a scalar one:

$$A^{-1} = \mathrm{D}^{-1} - \mathrm{D}^{-1}\mathrm{U}\left(\mathrm{I}_D+\mathrm{U}^{H}\mathrm{D}^{-1}\mathrm{U}\right)^{-1}\mathrm{U}^{H}\mathrm{D}^{-1}.$$
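The same solve as before generalises directly; a sketch with a general positive diagonal `d` (my notation), still $O(ND^2)$.

```python
import numpy as np

rng = np.random.default_rng(4)
N, D = 1000, 4
d = rng.uniform(0.5, 2.0, N)      # general positive diagonal
U = rng.standard_normal((N, D))
b = rng.standard_normal(N)

# Solve (diag(d) + U U^T) x = b via Woodbury:
Dinv_b = b / d                    # applying diag(d)^{-1} is elementwise
Dinv_U = U / d[:, None]
cap = np.eye(D) + U.T @ Dinv_U    # capacitance matrix I_D + U^T D^{-1} U
x = Dinv_b - Dinv_U @ np.linalg.solve(cap, U.T @ Dinv_b)
```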
2.3 Centred
Suppose
3 Determinant
The matrix determinant lemma tells us:
Suppose $\mathrm{A}$ is an invertible $n\times n$ matrix and $\mathrm{U},\mathrm{V}$ are $n\times m$ matrices. Then

$$\det\left(\mathrm{A}+\mathrm{U}\mathrm{V}^{H}\right) = \det\left(\mathrm{I}_m+\mathrm{V}^{H}\mathrm{A}^{-1}\mathrm{U}\right)\det(\mathrm{A}).$$

In the special case $\mathrm{A}=\mathrm{I}_n$ this is the Weinstein-Aronszajn identity. Given additionally an invertible $m$-by-$m$ matrix $\mathrm{W}$, the relationship can also be expressed as

$$\det\left(\mathrm{A}+\mathrm{U}\mathrm{W}\mathrm{V}^{H}\right) = \det\left(\mathrm{W}^{-1}+\mathrm{V}^{H}\mathrm{A}^{-1}\mathrm{U}\right)\det(\mathrm{W})\det(\mathrm{A}).$$
Clearly, for $A=\sigma^2\mathrm{I}_N+\mathrm{U}\mathrm{U}^{H}$ this gives

$$\det A = \sigma^{2(N-D)}\det\left(\sigma^2\mathrm{I}_D+\mathrm{U}^{H}\mathrm{U}\right),$$

which needs only a $D\times D$ determinant.
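In practice one wants the log-determinant; a sketch of the cheap version via the lemma, real case, my notation.

```python
import numpy as np

rng = np.random.default_rng(5)
N, D, sigma2 = 800, 3, 0.2
U = rng.standard_normal((N, D))

# log det(sigma^2 I_N + U U^T)
#   = (N - D) log sigma^2 + log det(sigma^2 I_D + U^T U)
sign, small_logdet = np.linalg.slogdet(sigma2 * np.eye(D) + U.T @ U)
logdet = (N - D) * np.log(sigma2) + small_logdet
```

Working in log space also avoids the overflow/underflow that the raw $\sigma^{2(N-D)}$ factor would cause for large $N$.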
4 Products
4.1 Primal
Specifically, the product of two such matrices is again identity-plus-low-rank:

$$\left(\sigma^2\mathrm{I}+\mathrm{U}\mathrm{U}^{H}\right)\left(\tau^2\mathrm{I}+\mathrm{V}\mathrm{V}^{H}\right) = \sigma^2\tau^2\mathrm{I}+\tau^2\mathrm{U}\mathrm{U}^{H}+\sigma^2\mathrm{V}\mathrm{V}^{H}+\mathrm{U}\left(\mathrm{U}^{H}\mathrm{V}\right)\mathrm{V}^{H},$$

with rank of the correction at most the sum of the two ranks (and no longer Hermitian in general, even though each factor is).
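A sketch of this closure property in numpy, real case, my notation (`s2`, `t2` for the two diagonal weights): the product collapses into a single rank-$2D$ factorisation that we can apply without ever forming an $N\times N$ matrix.

```python
import numpy as np

rng = np.random.default_rng(6)
N, D = 400, 3
s2, t2 = 0.5, 2.0
U = rng.standard_normal((N, D))
V = rng.standard_normal((N, D))

# (s2 I + U U^T)(t2 I + V V^T)
#   = s2 t2 I + t2 U U^T + (s2 V + U (U^T V)) V^T
#   = s2 t2 I + [t2 U | s2 V + U (U^T V)] [U | V]^T,
# i.e. identity-plus-low-rank again, with rank at most 2D.
left = np.hstack([t2 * U, s2 * V + U @ (U.T @ V)])
right = np.hstack([U, V])
prod = lambda x: s2 * t2 * x + left @ (right.T @ x)
```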
4.2 Inverse
Suppose the low-rank inverse factors of
Once again, cheap to evaluate, but not so obviously nice.
5 Distances
5.1 Frobenius
Suppose we want to measure the Frobenius distance between two such matrices, $A=\sigma^2\mathrm{I}+\mathrm{U}\mathrm{U}^{H}$ and $B=\tau^2\mathrm{I}+\mathrm{V}\mathrm{V}^{H}$. Expanding $\|A-B\|_F^2=\operatorname{tr}\bigl((A-B)^2\bigr)$ term by term,

$$\|A-B\|_F^2 = \left(\sigma^2-\tau^2\right)^2 N + 2\left(\sigma^2-\tau^2\right)\left(\|\mathrm{U}\|_F^2-\|\mathrm{V}\|_F^2\right) + \|\mathrm{U}^{H}\mathrm{U}\|_F^2 - 2\|\mathrm{U}^{H}\mathrm{V}\|_F^2 + \|\mathrm{V}^{H}\mathrm{V}\|_F^2,$$

every term of which involves only $D\times D$ matrices and traces.
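A sketch of this trace expansion in numpy, real case, my notation (`s2`, `t2`): the distance comes out in $O(ND^2)$ without forming either $N\times N$ matrix.

```python
import numpy as np

rng = np.random.default_rng(7)
N, D = 600, 4
s2, t2 = 0.7, 1.3
U = rng.standard_normal((N, D))
V = rng.standard_normal((N, D))

# || (s2 I + U U^T) - (t2 I + V V^T) ||_F^2, expanded in traces of small
# matrices, using tr(U U^T V V^T) = ||U^T V||_F^2 etc.:
d2 = ((s2 - t2) ** 2 * N
      + 2 * (s2 - t2) * ((U ** 2).sum() - (V ** 2).sum())
      + np.linalg.norm(U.T @ U) ** 2      # np.linalg.norm is Frobenius here
      - 2 * np.linalg.norm(U.T @ V) ** 2
      + np.linalg.norm(V.T @ V) ** 2)
frob_dist = np.sqrt(d2)
```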
6 Stochastic approximation
TBC
7 Tools
Some of the simplifications mentioned here are implemented for PyTorch linear algebra in cornellius-gp/linear_operator.
8 Incoming
Discuss in terms of perturbation theory? (Rellich 1954; Bamieh 2022)
Bamieh (2022) in particular is compact and clear.