Random rotations

May 18, 2021 — December 1, 2021

functional analysis
high d
linear algebra
signal processing
sparser than thou
Figure 1

Placeholder for random measures on the special orthogonal group, which encodes random \(d\)-dimensional rotations. A special type of matrix random variate if you’d like.

1 Uniform random rotations

Key word: Haar measure.

Chatterjee and Meckes (2008) gives us a useful lemma describing the moments of a random rotation:

If \(U=\left[u_{i j}\right]_{i, j=1}^{n}\) is an orthogonal matrix distributed according to Haar measure, then \(\mathbb{E}\left[\prod u_{i j}^{k_{i j}}\right]\) is non-zero if and only if the number of entries from each row and from each column is even. Second and fourth-degree moments are as follows:

  1. For all \(i, j\), \[ \mathbb{E}\left[u_{i j}^{2}\right]=\frac{1}{n} \]
  2. For all \(i, j, r, s, \alpha, \beta, \lambda, \mu\), \[\begin{aligned} &\mathbb{E}\left[u_{i j} u_{r s} u_{\alpha \beta} u_{\lambda \mu}\right]\\ &=- \frac{1}{(n-1) n(n+2)}\left[\delta_{i r} \delta_{\alpha \lambda} \delta_{j \beta} \delta_{s \mu}+\delta_{i r} \delta_{\alpha \lambda} \delta_{j \mu} \delta_{s \beta}+\delta_{i \alpha} \delta_{r \lambda} \delta_{j s} \delta_{\beta \mu}\right.\\ &\quad\left.+\delta_{i \alpha} \delta_{r \lambda} \delta_{j \mu} \delta_{\beta s}+\delta_{i \lambda} \delta_{r \alpha} \delta_{j s} \delta_{\beta \mu}+\delta_{i \lambda} \delta_{r \alpha} \delta_{j \beta} \delta_{s \mu}\right] \\ &+\frac{n+1}{(n-1) n(n+2)}\left[\delta_{i r} \delta_{\alpha \lambda} \delta_{j s} \delta_{\beta \mu}+\delta_{i \alpha} \delta_{r \lambda} \delta_{j \beta} \delta_{s \mu}+\delta_{i \lambda} \delta_{r \alpha} \delta_{j \mu} \delta_{s \beta}\right] \end{aligned}\]

Let \(H \in \mathcal{O}_{n}\) be distributed according to Haar measure. Then \(\mathbb{E}\left[h_{i j} h_{k l} h_{m n} h_{p q}\right]\) is only non-zero if there are an even number of entries from each row and each column. Fourth-order mixed moments are as follows. (i) \(\mathbb{E}\left[h_{11}^{2} h_{12}^{2}\right]=\frac{1}{n(n+2)}\) (ii) \(\mathbb{E}\left[h_{11}^{2} h_{22}^{2}\right]=\frac{n+1}{(n-1) n(n+2)}\) (iii) \(\mathbb{E}\left[h_{11} h_{12} h_{21} h_{22}\right]=\frac{-1}{(n-1) n(n+2)}\), and (iv) \(\mathbb{E}\left[\left(h_{i 1} h_{i^{\prime} 2}-h_{i 2} h_{i^{\prime} 1}\right)\left(h_{j 1} h_{j^{\prime} 2}-h_{j 2} h_{j^{\prime} 1}\right)\right]=\frac{2}{n(n-1)}\left[\delta_{i j} \delta_{i^{\prime} j^{\prime}}-\delta_{i j^{\prime}} \delta_{i^{\prime} j}\right]\), where \(i \neq i^{\prime}\) and \(j \neq j^{\prime}\)

2 Tiny random rotations

Notation: \(\oplus\) is the block direct sum.

2.1 Random Givens rotation

Givens rotation are useful. We can parameterize a Givens rotation \(\mathrm{A}^{(t)}\) by either an angle \(t=\theta\) or a magnitude \(t=\epsilon=\sin \theta\). The latter is slightly easier. Formally, a Givens rotation can be defined in any dimension \(d\geq 2\) by padding it with zeros and adding it to an identity matrix, but all the action happens over 4 entries, so we can learn how it works by examining a minimal \(2\times2\) rotation as a perturbation of the identity.

For \(-1\leq t\leq 1\) fixed, let us define a rotation \[ \begin{aligned} \mathrm{A}^{(t)} &=\left[\begin{array}{cc} \sqrt{1-t^{2}} & t \\ -t & \sqrt{1-t^{2}} \end{array}\right]\\ &=\mathrm{I}_{2}+\left[\begin{array}{cc} -\frac{t^{2}}{2}+\delta & t \\ -t & -\frac{t^{2}}{2}+\delta \end{array}\right] \\ \end{aligned} \] where \(\delta\) is a deterministic constant and \(\delta=O\left(t^{4}\right).\)

Informally (because I do not have space or will to track lots of boring remainder terms) we note that for some fixed \(x=[x_1\,x_2]^\top\) we have \[\begin{aligned} \mathrm{A}^{(t)}xx^\top\mathrm{A}^{(t)}{}^\top &\approx\left( \begin{array}{cc} x_1^2+2 t x_1 x_2+t^2 \left(-x_1^2+x_2^2\right) & x_1 x_2-2 t^2 x_1 x_2+t \left(-x_1^2+x_2^2\right) \\ x_1 x_2-2 t^2 x_1 x_2+t\left(-x_1^2+x_2^2\right) & -2 t x_1 x_2+x_2^2+t^2 \left(x_1^2-x_2^2\right) \\ \end{array} \right)\\ &=xx^\top+\left( \begin{array}{cc} 2 x_1 x_2 & -x_1^2+x_2^2 \\ -x_1^2+x_2^2 & -2 x_1 x_2 \\ \end{array} \right)t+\left( \begin{array}{cc} -x_1^2+x_2^2 & -2 x_1 x_2 \\ -2 x_1 x_2 & x_1^2-x_2^2 \\ \end{array} \right)t^2 \end{aligned}\]

If we want this to work in a higher dimension, we pad it like this: \[\begin{aligned} \mathrm{A}^{(t)} &=\left[\begin{array}{cc} \sqrt{1-t^{2}} & t \\ -t & \sqrt{1-t^{2}} \end{array}\right] \oplus \mathrm{I}_{d-2} \\ &=\mathrm{I}_{d}+\left[\begin{array}{cc} -\frac{t^{2}}{2}+\delta & t \\ -t & -\frac{t^{2}}{2}+\delta \end{array}\right] \oplus 0_{d-2} \end{aligned}\]

Now, let us consider how to randomise these. If we fix two axes and a random \(t\) we can construct a random Givens rotation. Suppose \(\mathbb{E}[t]=0\), \(\operatorname{var}(t)=\varepsilon^2\) small, and proceeding to do a sloppy Taylor expansion (which will, I assert, not matter in practice when take limits as \(t\to 0\) but one should check this to be rigorous). \[\begin{aligned} \mathbb{E}[\mathrm{A}^{(t)}x] &\approx\mathbb{E}\left[\left( \begin{array}{c} x_1+t x_2-\frac{t^2 x_1}{2} \\ x_2-t x_1-\frac{t^2 x_2}{2} \\ \end{array} \right)\right]\\ &=x-\mathbb{E}[t^2]x. \end{aligned}\]

\[\begin{aligned} \mathbb{E}[\mathrm{A}^{(t)}xx^\top\mathrm{A}^{(t)}{}^\top] &\approx \mathbb{E}\left[xx^\top+\left( \begin{array}{cc} 2 x_1 x_2 & -x_1^2+x_2^2 \\ -x_1^2+x_2^2 & -2 x_1 x_2 \\ \end{array} \right)t+\left( \begin{array}{cc} -x_1^2+x_2^2 & -2 x_1 x_2 \\ -2 x_1 x_2 & x_1^2-x_2^2 \\ \end{array} \right)t^2\right]\\ &= xx^\top + \left( \begin{array}{cc} -x_1^2+x_2^2 & -2 x_1 x_2 \\ -2 x_1 x_2 & x_1^2-x_2^2 \\ \end{array} \right)\mathbb{E}[t^2] \end{aligned} \]

2.2 Doubly Random Givens rotation

If each Givens rotation is over a pair of distinct axes drawn uniformly at random, this will also give us a random rotation. Book keeping for these could be irritating, but let us see. wlog we can generate these by choosing the axes \(i\sim\mathcal{U}(\{1,2,\dots,n-1\},j\sim\mathcal{U}(\{i+1,i,\dots,n\}.\)

2.3 Random rotation in an isotropic direction

Suppose \(\mathrm{Q}\) is a Haar-distributed (i.e. uniform) random rotation. Then we can make a tiny incremental rotation using a construction of Chatterjee and Meckes (2008), by applying a small Given rotation in a randomly rotated coordinate frame. The result is a rotation in a random direction, but not far in that direction.

Then \(\mathrm{Q}^{\varepsilon}:=\mathrm{Q} \mathrm{A}_{\varepsilon} \mathrm{Q}^{\top}\) is a small random rotation. What can we say about this guy? Let \(Q_{1:2}\) be the \(d \times 2\) matrix made of the first two columns of \(\mathrm{Q}\) and \(C_{2}=\left[\begin{array}{cc}0 & 1 \\ -1 & 0\end{array}\right]\). Then \[ \mathrm{Q}^{\varepsilon}=\epsilon\left[-\left(\frac{\epsilon}{2}+\epsilon^{-1} \delta\right) Q_{1:2} Q_{1:2}^{\top}+ Q_{1:2}C_2 Q_{1:2}^{\top}\right] \] and \(\epsilon^{-1} \delta=O\left(\epsilon^{3}\right)\).

Once again referring to Chatterjee and Meckes (2008) we see that the matrix \(M=\left[m_{i j}\right]_{i, j=1}^{n}=M_{1:2}C_2 M_{1:2}^{\top}\) satisfies \(m_{i j}=u_{i 1} u_{j 2}-u_{i 2} u_{j 1} .\) For all \(i, j, \ell, p\), \[ \mathbb{E}\left[m_{i j} m_{\ell p}\right]=\frac{2}{n(n-1)}\left[\delta_{i \ell} \delta_{j p}-\delta_{i p} \delta_{j \ell}\right] \] By the lemma on expectations of random rotations, \(\mathbb{E}\left[Q_{1:2} Q_{1:2}^{\top}\right]=\frac{2}{n} I\).

\[\begin{aligned} \mathrm{Q}^{\varepsilon}-\mathrm{I} &\approx \mathrm{Q}\left( \mathrm{I}_{d} +\left[\begin{array}{cc} -\frac{\varepsilon^{2}}{2} & \varepsilon \\ -\varepsilon & -\frac{\varepsilon^{2}}{2} \end{array}\right] \oplus 0_{d-2} \right)\mathrm{Q}^\top-\mathrm{I}\\ &=\mathrm{Q}\mathrm{Q}^\top +\mathrm{Q}\left(\left[\begin{array}{cc} -\frac{\varepsilon^{2}}{2} & \varepsilon \\ -\varepsilon & -\frac{\varepsilon^{2}}{2} \end{array}\right] \oplus 0_{d-2}\right)\mathrm{Q}^\top -\mathrm{I}\\ &=\mathrm{Q}\left(\left[\begin{array}{cc} -\frac{\varepsilon^{2}}{2} & \varepsilon \\ -\varepsilon & -\frac{\varepsilon^{2}}{2} \end{array}\right] \oplus 0_{d-2}\right)\mathrm{Q}^\top\\ &=\text{TBD} \end{aligned}\]


3 Sparse

We could desire a random rotation be sparse in a few senses, e.g. “ arandom rotation in \(D\) dimensions described by \(N\ll D\) Givens rotations.” Thigns that fit this description are fairly easy to generate: we generate a small, possibly random, number of Givens rotations.

A context useful for me is to take something that is in expectation a rotation, but whose realisations are sparse.

4 References

Barthe, Guedon, Mendelson, et al. 2005. A Probabilistic Approach to the Geometry of the \(\ell_p^n\)-Ball.” The Annals of Probability.
Chatterjee, and Meckes. 2008. Multivariate Normal Approximation Using Exchangeable Pairs.” arXiv:math/0701464.
Chikuse. 2003. High Dimensional Asymptotic Theorems.” In Statistics on Special Manifolds.
Chikuse, and 筑瀬. 2003. Statistics on Special Manifolds.
Christensen. 1970. On Some Measures Analogous to Haar Measure. MATHEMATICA SCANDINAVICA.
Kaneko, Fiori, and Tanaka. 2013. Empirical Arithmetic Averaging Over the Compact Stiefel Manifold.” IEEE Transactions on Signal Processing.
Kar, and Karnick. 2012. Random Feature Maps for Dot Product Kernels.” In Artificial Intelligence and Statistics.
Meckes. 2012. Approximation of Projections of Random Vectors.” Journal of Theoretical Probability.
Stam. 1982. Limit Theorems for Uniform Distributions on Spheres in High-Dimensional Euclidean Spaces.” Journal of Applied Probability.