Randomised linear algebra

August 16, 2016 — October 22, 2022

feature construction
functional analysis
high d
linear algebra
model selection
probabilistic algorithms
signal processing
sparser than thou

See random matrices, vector random projections and many other related tricks. Notes on doing linear algebra operations using randomised matrix projections, useful for, e.g., randomised regression.

1 Context

Obligatory Igor Carron mention: Random matrices are too damn large. Martinsson (2016) seems to be a fresh review of the action.

2 Log det and trace estimation

Machine Learning Trick of the Day (3): Hutchinson’s Trick — Shakir Mohamed
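To make the idea concrete, here is a minimal sketch (my own, not from the linked post) of estimating \(\log\det\mathbf{A}\) for a symmetric positive definite \(\mathbf{A}\) via \(\log\det\mathbf{A} = \operatorname{trace}(\log \mathbf{A})\) and Hutchinson probes. For illustration I form \(\log \mathbf{A}\) densely by eigendecomposition; the whole point in practice is to apply \(\log(\mathbf{A})\mathbf{z}\) matrix-free, e.g. by Lanczos or a Chebyshev expansion, as in the stochastic Lanczos quadrature section below.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small SPD test matrix.
n = 50
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)

# log det(A) = trace(log A). Form log(A) densely here purely for
# illustration; in anger you would apply log(A) @ z matrix-free.
w, U = np.linalg.eigh(A)
L = (U * np.log(w)) @ U.T

# Hutchinson: trace(L) ~ (1/m) sum_i z_i^T L z_i with Rademacher z_i.
m = 200
Z = rng.choice([-1.0, 1.0], size=(n, m))
est = np.mean(np.einsum('ij,ij->j', Z, L @ Z))

exact = np.linalg.slogdet(A)[1]
```

Rademacher probes are the classic low-variance choice here (Hutchinson 1990); Gaussian probes also work but with higher variance.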

3 Random Fourier Features

See Random Fourier Features.
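The notebook linked above has the details; as a quick reminder of the mechanics, here is a bare-bones Rahimi-Recht feature map for the RBF kernel (my own sketch; the function name and default parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

def rff_features(X, D=500, sigma=1.0, rng=rng):
    """Random Fourier feature map approximating the RBF kernel
    k(x, y) = exp(-||x - y||^2 / (2 sigma^2)) (Rahimi & Recht 2007)."""
    d = X.shape[1]
    W = rng.standard_normal((d, D)) / sigma   # samples from the kernel's spectral density
    b = rng.uniform(0, 2 * np.pi, size=D)     # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

# Sanity check: feature inner products approximate the exact Gram matrix.
X = rng.standard_normal((30, 5))
Phi = rff_features(X, D=2000)
K_approx = Phi @ Phi.T
sq = np.sum(X**2, axis=1)
K_exact = np.exp(-(sq[:, None] + sq[None, :] - 2 * X @ X.T) / 2.0)
err = np.max(np.abs(K_approx - K_exact))
```

The approximation error shrinks like \(O(1/\sqrt{D})\), so \(D\) trades compute for kernel fidelity.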

4 Randomisation in matrix factorization

See various matrix factorisation methods.
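The canonical randomised factorisation trick, though, is the Halko-Martinsson-Tropp randomised SVD (Halko, Martinsson, and Tropp 2010), which is worth sketching here since everything else in this vein is a variation on it. A minimal version, assuming a matrix with rapidly decaying spectrum:

```python
import numpy as np

rng = np.random.default_rng(2)

def randomized_svd(A, k, p=10, q=1, rng=rng):
    """Rank-k randomised SVD with oversampling p and q power iterations
    (basic Halko-Martinsson-Tropp scheme)."""
    m, n = A.shape
    Omega = rng.standard_normal((n, k + p))  # random test matrix
    Y = A @ Omega                            # sample the range of A
    for _ in range(q):                       # power iterations sharpen the spectrum
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                   # orthonormal basis for the sampled range
    B = Q.T @ A                              # small (k+p) x n projected problem
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :k], s[:k], Vt[:k, :]

# Exactly rank-10 test matrix, so the rank-10 approximation is near-exact.
A = rng.standard_normal((200, 10)) @ rng.standard_normal((10, 150))
U, s, Vt = randomized_svd(A, k=10)
err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
```

For ill-conditioned problems one would re-orthonormalise between power iterations; this sketch skips that.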

5 Random regression

See randomised regression.
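The simplest instance is worth recording here: sketch-and-solve least squares, where a tall \(n \times d\) problem is compressed to \(m \times d\) by a random sketching matrix before solving. My own toy sketch, using a dense random sign matrix for clarity (in practice one uses a subsampled FFT or CountSketch so the sketch itself is cheap):

```python
import numpy as np

rng = np.random.default_rng(3)

# A tall least-squares problem, n >> d.
n, d = 20000, 20
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

# Sketch-and-solve: compress the rows with a random sign matrix, then
# solve the small m x d problem. m ~ O(d / eps^2) rows suffice for a
# (1 + eps)-approximation of the residual norm.
m = 1000
S = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)
x_sketch = np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
```

Note that forming the dense product `S @ A` here costs more than solving the original problem; structured sketches are what make this competitive (see e.g. Rokhlin and Tygert 2008).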

6 Hutchinson trace estimator

Shakir Mohamed mentions Hutchinson’s Trick, and was introduced to it, as I was, by Dr Maurizio Filippone. This trick also works efficiently with the ensemble Kalman filter, where the randomised products are cheap.
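The estimator itself is tiny. The key point is that it needs only matrix-vector products, so it works for implicit matrices (Hessians, influence matrices, Jacobian products) that are never formed. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(4)

def hutchinson_trace(matvec, n, m=100, rng=rng):
    """Estimate trace(A) given only the matrix-vector product z -> A z.
    Rademacher probes give an unbiased estimate: E[z^T A z] = trace(A)."""
    total = 0.0
    for _ in range(m):
        z = rng.choice([-1.0, 1.0], size=n)
        total += z @ matvec(z)
    return total / m

# Check against an explicit matrix (in real use A is only implicit).
A = rng.standard_normal((100, 100))
A = A @ A.T
est = hutchinson_trace(lambda z: A @ z, n=100, m=500)
exact = np.trace(A)
```

Variance decays like \(1/m\); for better probe-efficiency see Hutch++ (Meyer et al. 2021) and the variance analysis in Skorski (2020).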

7 Stochastic Lanczos Quadrature

Overview — imate Manual
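Roughly: run a few Lanczos steps from each random probe vector, and use the resulting tridiagonal matrix as a Gauss quadrature rule for \(\mathbf{v}^{\top} f(\mathbf{A}) \mathbf{v}\) (Golub and Meurant 2010). A bare-bones sketch of my own, assuming a symmetric positive definite \(\mathbf{A}\) available only through matvecs:

```python
import numpy as np

rng = np.random.default_rng(5)

def lanczos(matvec, v, k):
    """k Lanczos steps from unit vector v; returns the tridiagonal
    coefficients (alpha, beta). Full reorthogonalisation for stability."""
    n = v.size
    V = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    V[:, 0] = v
    for j in range(k):
        w = matvec(V[:, j])
        alpha[j] = V[:, j] @ w
        w -= V[:, : j + 1] @ (V[:, : j + 1].T @ w)  # reorthogonalise
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            if beta[j] < 1e-12:
                break
            V[:, j + 1] = w / beta[j]
    return alpha, beta

def slq_trace(matvec, n, f, k=30, m=30, rng=rng):
    """Stochastic Lanczos quadrature estimate of trace(f(A)), A SPD.
    Each probe yields the Gauss rule n * sum_j W[0,j]^2 f(theta_j)."""
    total = 0.0
    for _ in range(m):
        v = rng.standard_normal(n)
        v /= np.linalg.norm(v)
        alpha, beta = lanczos(matvec, v, k)
        T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
        theta, W = np.linalg.eigh(T)
        total += n * np.sum(W[0, :] ** 2 * f(theta))
    return total / m

# Example: log-determinant via trace(log A).
B = rng.standard_normal((200, 200))
A = B @ B.T + 200 * np.eye(200)
est = slq_trace(lambda z: A @ z, n=200, f=np.log, k=30, m=40)
exact = np.linalg.slogdet(A)[1]
```

Compared to plain Hutchinson on a dense \(f(\mathbf{A})\), the quadrature converges in the number of Lanczos steps \(k\) roughly geometrically for smooth \(f\), so \(k\) can be small.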

8 Tools

8.1 imate

Overview — imate

The main purpose of imate is to estimate the algebraic quantity \[ \operatorname{trace}(f(\mathbf{A})) \] where \(\mathbf{A}\) is a square matrix, \(f\) is a matrix function, and \(\operatorname{trace}(\cdot)\) is the trace operator. imate can also compute variants of this quantity, such as \[ \operatorname{trace}(\mathbf{B} f(\mathbf{A})) \] and \[ \operatorname{trace}(\mathbf{B} f(\mathbf{A}) \mathbf{C} f(\mathbf{A})) \] where \(\mathbf{B}\) and \(\mathbf{C}\) are matrices. Other variations include the cases where \(\mathbf{A}\) is replaced by \(\mathbf{A}^{\top} \mathbf{A}\) in the above expressions.

8.2 Misc

9 References

Achlioptas. 2003. Database-Friendly Random Projections: Johnson-Lindenstrauss with Binary Coins.” Journal of Computer and System Sciences, Special Issue on PODS 2001.
Achlioptas, and McSherry. 2007. Fast Computation of Low-Rank Matrix Approximations.” J. ACM.
Alaoui, and Mahoney. 2014. Fast Randomized Kernel Methods With Statistical Guarantees.” arXiv:1411.0306 [Cs, Stat].
Bardenet, and Maillard. 2015. A Note on Replacing Uniform Subsampling by Random Projections in MCMC for Linear Regression of Tall Datasets.”
Bingham, and Mannila. 2001. Random Projection in Dimensionality Reduction: Applications to Image and Text Data.” In Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’01.
Boutsidis, Drineas, Kambadur, et al. 2017. A Randomized Algorithm for Approximating the Log Determinant of a Symmetric Positive Definite Matrix.” Linear Algebra and Its Applications.
Brossollet, Cappelli, Carron, et al. 2021. LightOn Optical Processing Unit: Scaling-up AI and HPC with a Non von Neumann Co-Processor.”
Cavaillès, Boucher, Daudet, et al. 2022. A High-Fidelity and Large-Scale Reconfigurable Photonic Processor for NISQ Applications.” Optics Express.
Choromanski, Rowland, and Weller. 2017. The Unreasonable Effectiveness of Random Orthogonal Embeddings.” arXiv:1703.00864 [Stat].
Dasgupta. 2000. Experiments with Random Projection.” In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence. UAI’00.
Dasgupta, and Gupta. 2003. An Elementary Proof of a Theorem of Johnson and Lindenstrauss.” Random Structures & Algorithms.
Dezfouli, and Bonilla. 2015. Scalable Inference for Gaussian Process Models with Black-Box Likelihoods.” In Advances in Neural Information Processing Systems 28. NIPS’15.
Drineas, Kannan, and Mahoney. 2006a. Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix.” SIAM Journal on Computing.
———. 2006b. Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition.” SIAM Journal on Computing.
Drineas, and Mahoney. 2005. On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning.” Journal of Machine Learning Research.
Edelman. 1989. Eigenvalues and Condition Numbers of Random Matrices.”
Erichson, Voronin, Brunton, et al. 2016. Randomized Matrix Decompositions Using R.” arXiv:1608.02148 [Cs, Stat].
Golub, and Meurant. 2010. Matrices, Moments and Quadrature with Applications.
Gottwald, and Reich. 2020. Supervised Learning from Noisy Observations: Combining Machine-Learning Techniques with Data Assimilation.” arXiv:2007.07383 [Physics, Stat].
Gower. 2016. Sketch and Project: Randomized Iterative Methods for Linear Systems and Inverting Matrices.” arXiv:1612.06013 [Math].
Gower, and Richtárik. 2016. Randomized Quasi-Newton Updates Are Linearly Convergent Matrix Inversion Algorithms.” arXiv:1602.01768 [Cs, Math].
Halko, Martinsson, Shkolnisky, et al. 2011. An Algorithm for the Principal Component Analysis of Large Data Sets.” SIAM Journal on Scientific Computing.
Halko, Martinsson, and Tropp. 2010. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions.”
Hesslow, Cappelli, Carron, et al. 2021. Photonic Co-Processors in HPC: Using LightOn OPUs for Randomized Numerical Linear Algebra.” arXiv:2104.14429 [Cs, Stat].
Hutchinson. 1990. A Stochastic Estimator of the Trace of the Influence Matrix for Laplacian Smoothing Splines.” Communications in Statistics - Simulation and Computation.
Kapralov, Potluru, and Woodruff. 2016. How to Fake Multiply by a Gaussian Matrix.” arXiv:1606.05732 [Cs].
Krummenacher, McWilliams, Kilcher, et al. 2016. Scalable Adaptive Stochastic Optimization Using Random Projections.” In Advances in Neural Information Processing Systems 29.
Kumar, and Schneider. 2016. Literature Survey on Low Rank Approximation of Matrices.” arXiv:1606.06511 [Cs, Math].
Lahiri, Gao, and Ganguli. 2016. Random Projections of Random Manifolds.” arXiv:1607.04331 [Cs, q-Bio, Stat].
Launay, Poli, Boniface, et al. 2020. Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures.” In Advances in Neural Information Processing Systems.
Lee, and Seung. 2001. Algorithms for Non-Negative Matrix Factorization.” In Advances in Neural Information Processing Systems 13.
Liberty, Woolfe, Martinsson, et al. 2007. Randomized Algorithms for the Low-Rank Approximation of Matrices.” Proceedings of the National Academy of Sciences.
Li, Hastie, and Church. 2006. Very Sparse Random Projections.” In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’06.
Mahoney. 2010. Randomized Algorithms for Matrices and Data.
Martinsson. 2016. Randomized Methods for Matrix Computations and Analysis of High Dimensional Data.” arXiv:1607.01649 [Math].
Martinsson, Rokhlin, and Tygert. 2006. A Randomized Algorithm for the Approximation of Matrices.”
Martinsson, and Voronin. 2015. A Randomized Blocked Algorithm for Efficiently Computing Rank-Revealing Factorizations of Matrices.” arXiv:1503.07157 [Math].
Meyer, Musco, Musco, et al. 2021. Hutch++: Optimal Stochastic Trace Estimation.”
Nakatsukasa. 2017. Accuracy of Singular Vectors Obtained by Projection-Based SVD Methods.” BIT Numerical Mathematics.
Pourkamali-Anaraki, and Becker. 2016. A Randomized Approach to Efficient Kernel Clustering.” arXiv:1608.07597 [Stat].
Rabani, and Toledo. 2001. Out-of-Core SVD and QR Decompositions.” In PPSC.
Rahimi, and Recht. 2007. Random Features for Large-Scale Kernel Machines.” In Advances in Neural Information Processing Systems.
———. 2009. Weighted Sums of Random Kitchen Sinks: Replacing Minimization with Randomization in Learning.” In Advances in Neural Information Processing Systems.
Refinetti, d’Ascoli, Ohana, et al. 2021. Align, Then Memorise: The Dynamics of Learning with Feedback Alignment.” arXiv:2011.12428 [Cond-Mat, Stat].
Rokhlin, Szlam, and Tygert. 2009. A Randomized Algorithm for Principal Component Analysis.” SIAM J. Matrix Anal. Appl.
Rokhlin, and Tygert. 2008. A Fast Randomized Algorithm for Overdetermined Linear Least-Squares Regression.” Proceedings of the National Academy of Sciences.
Saibaba, Alexanderian, and Ipsen. 2017. Randomized Matrix-Free Trace and Log-Determinant Estimators.”
Shi, Sun, and Zhu. 2018. A Spectral Approach to Gradient Estimation for Implicit Distributions.” In.
Skorski. 2020. A Modern Analysis of Hutchinson’s Trace Estimator.”
Sterge, and Sriperumbudur. 2021. Statistical Optimality and Computational Efficiency of Nyström Kernel PCA.” arXiv:2105.08875 [Cs, Math, Stat].
Tropp. 2015. An Introduction to Matrix Concentration Inequalities.
Tropp, Yurtsever, Udell, et al. 2016. Randomized Single-View Algorithms for Low-Rank Matrix Approximation.” arXiv:1609.00048 [Cs, Math, Stat].
Voronin, and Martinsson. 2014. Efficient Algorithms for CUR and Interpolative Matrix Decompositions.” arXiv:1412.8447 [Cs, Math].
Wang. 2015. A Practical Guide to Randomized Matrix Computations with MATLAB Implementations.” arXiv:1505.07570 [Cs].
Woolfe, Liberty, Rokhlin, et al. 2008. A Fast Randomized Algorithm for the Approximation of Matrices.” Applied and Computational Harmonic Analysis.
Yang, Tianbao, Li, Mahdavi, et al. 2012. Nyström Method Vs Random Fourier Features: A Theoretical and Empirical Comparison.” In Advances in Neural Information Processing Systems.
Yang, Jiyan, Meng, and Mahoney. 2015. Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments.” arXiv:1502.03032 [Cs, Math, Stat].
Yang, Jiyan, Sindhwani, Avron, et al. 2014. Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels.” arXiv:1412.8293 [Cs, Math, Stat].
Yu, Hsieh, Si, et al. 2012. Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems.” In IEEE International Conference of Data Mining.
Zhang, Liu, Zhang, et al. 2017. Randomization or Condensation?: Linear-Cost Matrix Sketching Via Cascaded Compression Sampling.” In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’17.
Zhao. 2017. Fast Algorithms on Random Matrices and Structured Matrices.”
Zhou, and Tao. 2011. GoDec: Randomized Low-Rank & Sparse Matrix Decomposition in Noisy Case.” In Proceedings of the 28th International Conference on International Conference on Machine Learning. ICML’11.