An important matrix factorisation. TBC
1 What is SVD?
For any matrix $A \in \mathbb{R}^{m \times n}$ of rank $r$, we can write the (thin) singular value decomposition

$$A = U \Sigma V^{\top}$$

where $U \in \mathbb{R}^{m \times r}$ has orthonormal columns ($U^{\top} U = I_r$), $V \in \mathbb{R}^{n \times r}$ has orthonormal columns ($V^{\top} V = I_r$), $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r)$, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r > 0$.
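In numpy, for instance (the shapes here are arbitrary, and a random matrix like this one is almost surely full rank, so $r = \min(m, n)$):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# Thin SVD: U is (6, 4), s is (4,), Vt is (4, 4).
U, s, Vt = np.linalg.svd(A, full_matrices=False)

print(np.allclose(U.T @ U, np.eye(4)))      # orthonormal columns of U
print(np.allclose(Vt @ Vt.T, np.eye(4)))    # orthonormal columns of V
print(np.allclose(U @ np.diag(s) @ Vt, A))  # exact reconstruction
print(np.all(np.diff(s) <= 0))              # descending singular values
```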
You can do SO MUCH with this, but you rapidly run into problems of computational cost and numerical stability if you are naive about it.
2 Randomized methods
There are a few worth mentioning; everyone name-checks Halko, Martinsson, and Tropp (2010) and Bach et al. (2019). I’ve used Halko, Martinsson, and Tropp (2010) a lot because it is included in PyTorch. I am curious about Allen-Zhu and Li (2017), which claims stability at low floating-point precision, which is very desirable in these big-network times.
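The PyTorch version is `torch.svd_lowrank`, which follows the Halko, Martinsson, and Tropp scheme. A minimal sketch (the test matrix, target rank `q`, and iteration count `niter` here are arbitrary choices of mine):

```python
import torch

torch.manual_seed(0)
# A 1000 x 200 matrix of rank 20 with a decaying spectrum.
U0, _ = torch.linalg.qr(torch.randn(1000, 20))
V0, _ = torch.linalg.qr(torch.randn(200, 20))
A = U0 @ torch.diag(torch.logspace(0, -3, 20)) @ V0.T

# Randomized SVD: q is a (slightly overestimated) target rank,
# niter the number of subspace (power) iterations.
U, S, V = torch.svd_lowrank(A, q=20, niter=4)

# Compare the leading singular values against exact ones.
S_ref = torch.linalg.svdvals(A)
print(torch.max(torch.abs(S - S_ref[:20])))
```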
3 Incremental updates / downdates
A rank-1 incremental update corresponds to appending a new column $c \in \mathbb{R}^m$ to $A$. We want the SVD of $\tilde{A} = [A \;\; c]$ given $A = U \Sigma V^{\top}$.

Project $c$ onto $\operatorname{range}(U)$: $w = U^{\top} c$, and compute the residual $p = c - U w$, $\rho = \|p\|_2$.

If $\rho > 0$, set $j = p / \rho$ (else pick any unit $j \perp \operatorname{range}(U)$). Form the small “core” matrix

$$K = \begin{bmatrix} \Sigma & w \\ 0 & \rho \end{bmatrix} \in \mathbb{R}^{(r+1) \times (r+1)}.$$

Compute the SVD of $K$:

$$K = U' \Sigma' V'^{\top}$$

where $U', \Sigma', V' \in \mathbb{R}^{(r+1) \times (r+1)}$. Update the full factors:

$$U \leftarrow [U \;\; j]\, U', \qquad \Sigma \leftarrow \Sigma', \qquad V \leftarrow \begin{bmatrix} V & 0 \\ 0 & 1 \end{bmatrix} V'.$$

Cost: SVD of one small $(r+1) \times (r+1)$ matrix plus a few thin matrix multiplications, rather than a full decomposition from scratch.
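A minimal numpy sketch of this update (the function name and tolerance are my own choices; a production version would also truncate negligible singular values to keep $r$ from creeping upward):

```python
import numpy as np

def svd_append_column(U, s, V, c, tol=1e-12):
    """Given the thin SVD A = U @ np.diag(s) @ V.T, return the thin SVD
    of [A, c] after appending the column c. Rank may grow by one."""
    m, r = U.shape
    w = U.T @ c                    # projection onto range(U)
    p = c - U @ w                  # residual, orthogonal to range(U)
    rho = np.linalg.norm(p)
    if rho > tol:
        j = p / rho
    else:
        rho, j = 0.0, np.zeros(m)  # degenerate: c already in range(U)
    # Small (r+1) x (r+1) core matrix K = [[diag(s), w], [0, rho]].
    K = np.zeros((r + 1, r + 1))
    K[:r, :r] = np.diag(s)
    K[:r, r] = w
    K[r, r] = rho
    Uk, sk, Vkt = np.linalg.svd(K)
    # Rotate the enlarged factors by the core SVD.
    U_new = np.column_stack([U, j]) @ Uk
    V_pad = np.zeros((V.shape[0] + 1, r + 1))
    V_pad[:-1, :r] = V
    V_pad[-1, r] = 1.0
    V_new = V_pad @ Vkt.T
    return U_new, sk, V_new

# Sanity check against a from-scratch SVD.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
c = rng.standard_normal(50)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U2, s2, V2 = svd_append_column(U, s, Vt.T, c)
print(np.allclose(U2 @ np.diag(s2) @ V2.T, np.column_stack([A, c])))
```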
There are also downdates (removing a column), which are pretty similar.
For fancy variations, there are many papers (Brand 2006, 2002; Bunch and Nielsen 1978; Gu and Eisenstat 1995, 1993; Sarwar et al. 2002; Zhang 2022).
4 …as Frobenius minimiser
TODO.
5 …for PCA
This is nearly trivial but needs spelling out: SVD gives us the PCA, among other useful things. Center the data matrix $X$ (subtract the column means); then the right singular vectors of the centred matrix are the principal directions, and the singular values give the explained variances via $\sigma_i^2 / (n - 1)$.
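A numpy sketch of the correspondence (the toy data with anisotropic scaling is my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
# 500 samples in 3 dimensions with very different variances per axis.
X = rng.standard_normal((500, 3)) * np.array([2.0, 1.0, 0.1])

# Center, then take the thin SVD of the data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

principal_axes = Vt             # rows are the principal directions
explained_var = s**2 / (len(X) - 1)
scores = U * s                  # projections onto the principal axes

# Agrees with the eigendecomposition of the sample covariance.
evals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
print(np.allclose(explained_var, evals))
```

The practical advantage over eigendecomposing the covariance is that we never form $X^{\top} X$, which squares the condition number.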