Singular Value Decomposition

The ML workhorse for linear algebra

2014-08-05 — 2025-05-23

feature construction
functional analysis
high d
linear algebra
networks
probability
signal processing
sparser than thou
statistics

Assumed audience:

People with undergrad linear algebra

An important matrix factorisation. TBC


1 What is SVD?

For any $A \in \mathbb{R}^{m \times n}$ of rank $r$, the (thin) singular value decomposition is

$$A = U \Sigma V^{\top},$$

where

  • $U \in \mathbb{R}^{m \times r}$ has orthonormal columns ($U^{\top} U = I_r$),
  • $V \in \mathbb{R}^{n \times r}$ has orthonormal columns ($V^{\top} V = I_r$),
  • $\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r)$, with $\sigma_1 \geq \dots \geq \sigma_r > 0$.

You can do SO MUCH with this, but you rapidly run into problems of computational cost and numerical stability if you are naive about it.
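For concreteness, a minimal NumPy check of these properties (sizes are arbitrary; note that `np.linalg.svd` returns $V^{\top}$ rather than $V$):

```python
import numpy as np

A = np.random.default_rng(0).standard_normal((6, 4))
# full_matrices=False gives the thin SVD
U, s, Vt = np.linalg.svd(A, full_matrices=False)

assert np.allclose(U @ np.diag(s) @ Vt, A)     # A = U Σ Vᵀ
assert np.allclose(U.T @ U, np.eye(4))         # Uᵀ U = I_r
assert np.allclose(Vt @ Vt.T, np.eye(4))       # Vᵀ V = I_r
assert np.all(s[:-1] >= s[1:]) and s[-1] > 0   # σ₁ ≥ … ≥ σ_r > 0
```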

2 Randomized methods

There are a few; e.g. everyone name-checks Halko, Martinsson, and Tropp (2010) and Bach et al. (2019). I’ve used Halko, Martinsson, and Tropp (2010) a lot because it is included in PyTorch. I am curious about Allen-Zhu and Li (2017), which claims to be stable at low floating-point precision, which is very desirable in these big-network times.
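A minimal sketch of the PyTorch entry point, `torch.svd_lowrank`, whose documentation cites Halko, Martinsson, and Tropp (2010); the matrix size, rank, and iteration count here are arbitrary choices of mine:

```python
import torch

A = torch.randn(2000, 500)

# q is a (slightly overestimated) target rank; niter is the number of
# power iterations used to sharpen the randomized range estimate.
U, S, V = torch.svd_lowrank(A, q=20, niter=2)

A_approx = U @ torch.diag(S) @ V.T  # rank-20 approximation of A
```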

3 Incremental updates / downdates

A rank-1 incremental update corresponds to appending a new column $a \in \mathbb{R}^m$ to $A$: given the current SVD $A = U \Sigma V^{\top}$, form

$$A' = [\,A \;\; a\,] \in \mathbb{R}^{m \times (n+1)}.$$

We want $A' = U' \Sigma' V'^{\top}$ cheaply.

  1. Project $a$ onto $\operatorname{span}(U)$:

     $$w = U^{\top} a \in \mathbb{R}^{r},$$

     and compute the residual

     $$p = a - U w, \qquad \alpha = \|p\|_{2}.$$

     If $\alpha > 0$, set $q = p/\alpha$ (else pick any unit $q \perp U$).

  2. Form the small “core”

     $$K = \begin{bmatrix} \Sigma & w \\ 0 & \alpha \end{bmatrix} \in \mathbb{R}^{(r+1) \times (r+1)}.$$

  3. Compute the SVD of $K$:

     $$K = \tilde{U} \tilde{\Sigma} \tilde{V}^{\top},$$

     where $\tilde{U}, \tilde{V} \in \mathbb{R}^{(r+1) \times (r+1)}$.

  4. Update the full factors:

     $$U' = [\,U \;\; q\,]\,\tilde{U}, \qquad \Sigma' = \tilde{\Sigma}, \qquad V' = \begin{bmatrix} V & 0 \\ 0 & 1 \end{bmatrix} \tilde{V}.$$

Cost: the SVD of one $(r+1) \times (r+1)$ matrix instead of an $m \times n$ one.
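A minimal NumPy sketch of the four steps above (the function name, the tolerance, and the random choice of $q$ in the degenerate branch are my own):

```python
import numpy as np

def svd_append_column(U, s, V, a, tol=1e-12):
    """Given the thin SVD A = U @ np.diag(s) @ V.T, return the thin SVD
    of [A, a], i.e. A with the column a appended."""
    m, r = U.shape
    w = U.T @ a                       # 1. coordinates of a in span(U)
    p = a - U @ w                     #    residual orthogonal to span(U)
    alpha = np.linalg.norm(p)
    if alpha > tol:
        q = p / alpha
    else:                             # degenerate case: a ∈ span(U), so
        q = np.random.randn(m)        # any unit vector ⟂ U will do
        q -= U @ (U.T @ q)
        q /= np.linalg.norm(q)
        alpha = 0.0
    K = np.zeros((r + 1, r + 1))      # 2. the (r+1) × (r+1) core
    K[:r, :r] = np.diag(s)
    K[:r, r] = w
    K[r, r] = alpha
    Uk, sk, Vkt = np.linalg.svd(K)    # 3. small dense SVD
    U_new = np.column_stack([U, q]) @ Uk              # 4. rotate factors
    V_pad = np.block([[V, np.zeros((V.shape[0], 1))],
                      [np.zeros((1, r)), np.ones((1, 1))]])
    return U_new, sk, V_pad @ Vkt.T
```

Applied to the thin SVD of a random $A$, this should match `np.linalg.svd(np.column_stack([A, a]))` up to sign flips of the singular vectors.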

There are also downdates (removing a column), which are pretty similar.

For fancy variations, there are many papers (Brand 2002, 2006; Bunch and Nielsen 1978; Gu and Eisenstat 1993, 1995; Sarwar et al. 2002; Zhang 2022).

4 …as Frobenius minimiser

TODO. The headline result is the Eckart–Young–Mirsky theorem: truncating the SVD to its top $k$ singular values gives the best approximation to $A$ of rank at most $k$, in Frobenius norm (and also in spectral norm).
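A quick NumPy check of the Frobenius claim (the matrix size and $k$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 5
A_k = (U[:, :k] * s[:k]) @ Vt[:k]   # rank-k truncated SVD

# The truncation error is the tail of the spectrum, and (per
# Eckart–Young–Mirsky) no rank-k matrix does better.
err = np.linalg.norm(A - A_k, "fro")
assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))
```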

5 …for PCA

This is nearly trivial but needs spelling out: SVD gives us PCA, among other useful things. Take the thin SVD of the centred data matrix (with $n$ observations as rows): the right singular vectors are the principal directions, $U\Sigma$ gives the scores, and $\sigma_i^{2}/(n-1)$ the component variances.
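A minimal sketch in NumPy (toy data; the rows-as-observations convention and all variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8)) @ rng.standard_normal((8, 8))  # toy data

Xc = X - X.mean(axis=0)              # centre each column
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                      # rows are the principal directions
scores = U * s                       # data coordinates in component space
explained_var = s**2 / (len(X) - 1)  # variance along each component
```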

6 Incoming

7 References

Allen-Zhu, and Li. 2017. “LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain.”
Bach, Ceglia, Song, et al. 2019. “Randomized Low-Rank Approximation Methods for Projection-Based Model Order Reduction of Large Nonlinear Dynamical Problems.” International Journal for Numerical Methods in Engineering.
Brand. 2002. “Incremental Singular Value Decomposition of Uncertain Data with Missing Values.” In Computer Vision — ECCV 2002.
———. 2006. “Fast Low-Rank Modifications of the Thin Singular Value Decomposition.” Linear Algebra and Its Applications, Special Issue on Large Scale Linear and Nonlinear Eigenvalue Problems.
Bunch, and Nielsen. 1978. “Updating the Singular Value Decomposition.” Numerische Mathematik.
Gu, and Eisenstat. 1993. “A Stable and Fast Algorithm for Updating the Singular Value Decomposition.”
———. 1995. “Downdating the Singular Value Decomposition.” SIAM Journal on Matrix Analysis and Applications.
Halko, Martinsson, and Tropp. 2010. “Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions.”
Hastie, Mazumder, Lee, et al. 2015. “Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares.” Journal of Machine Learning Research.
Rabani, and Toledo. 2001. “Out-of-Core SVD and QR Decompositions.” In PPSC.
Saad. 2003. Iterative Methods for Sparse Linear Systems: Second Edition.
Sarwar, Karypis, Konstan, et al. 2002. “Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems.”
Tropp, Yurtsever, Udell, et al. 2016. “Randomized Single-View Algorithms for Low-Rank Matrix Approximation.” arXiv:1609.00048 [cs, math, stat].
Zhang. 2022. “An Answer to an Open Question in the Incremental SVD.”