Approximate matrix factorisations and decompositions

Sometimes even exact



Assumed audience:

People with undergrad linear algebra

The classics

The big six exact matrix decompositions are (Stewart 2000): the Cholesky decomposition; the pivoted LU decomposition; the QR decomposition; the spectral decomposition; the Schur decomposition; and the singular value decomposition.

See Nick Higham's summary of those.

Approximate decompositions

Mastered QR and LU decompositions? There are now so many ways of factorising matrices that there are not enough acronyms in the alphabet to hold them, especially if we suspect our matrix is sparse, or could be made sparse because of some underlying constraint, or probably could, if squinted at in the right fashion, be something like a graph transition matrix, or a Laplacian, or a noisy transform of some smooth object, or would at least be close to sparse if we chose the right metric, or…

A big matrix is close to, in some sense, the (tensor/matrix) product (or sum, orโ€ฆ) of some matrices that are in some way simple (small-rank, small dimension, sparse), possibly with additional constraints. Can we find those simple matrices?

Ethan Epperly's introduction to Low-rank Matrices puts many ideas clearly.
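To make the theme concrete before the exotica: the canonical approximate factorisation is the truncated SVD, which by the Eckart–Young theorem gives the best rank-\(r\) approximation in Frobenius (and spectral) norm. A minimal numpy sketch (`best_rank_r` is my own helper name):

```python
import numpy as np

def best_rank_r(A, r):
    """Best rank-r approximation of A in Frobenius (and spectral) norm,
    per Eckart-Young, via the truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]
```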

Here's an example: GoDec (T. Zhou and Tao 2011), a decomposition into low-rank and sparse components which, loosely speaking, combines multidimensional factorisation and outlier detection.

GoDec is one of the most efficient algorithms for low-rank and sparse decomposition, thanks to bilateral random projections (BRP), a fast approximation of SVD/PCA.
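For intuition, here is a naive sketch of the GoDec alternation, with an exact truncated SVD standing in where the paper uses bilateral random projections; `godec_naive` is my name for it, not the authors':

```python
import numpy as np

def godec_naive(A, rank, card, iters=50):
    """A ≈ L + S with L low-rank and S sparse, by naive alternation.
    The real GoDec replaces the SVD below with bilateral random projections."""
    L = np.zeros_like(A)
    S = np.zeros_like(A)
    for _ in range(iters):
        # L-step: best rank-r approximation of A - S
        U, s, Vt = np.linalg.svd(A - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # S-step: keep the (approximately) card largest-magnitude residual entries
        R = A - L
        thresh = np.partition(np.abs(R), -card, axis=None)[-card]
        S = np.where(np.abs(R) >= thresh, R, 0.0)
    return L, S
```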

There are so many more of these things, depending on our preferred choice of metric, constraints, free variables and such.

Keywords: Matrix sketching, low-rank approximation, traditional dimensionality reduction.

Matrix concentration inequalities turn out to be a useful tool for proving that a given matrix decomposition is not too bad, in a PAC sense.

Igor Carron's Matrix Factorization Jungle classifies the following problems as matrix-factorisation type.

Kernel Factorizations
…
Spectral clustering
\([\mathrm{A} = \mathrm{D} \mathrm{X}]\) with unknown \(\mathrm{D}\) and \(\mathrm{X}\), solve for sparse \(\mathrm{X}\) and \(\mathrm{X}_i = 0\) or \(1\)
K-Means / K-Median clustering
\([\mathrm{A} = \mathrm{D} \mathrm{X}]\) with unknown \(\mathrm{D}\) and \(\mathrm{X}\), solve for \(\mathrm{X} \mathrm{X}^{\top} = \mathrm{I}\) and \(\mathrm{X}_i = 0\) or \(1\)
Subspace clustering
\([\mathrm{A} = \mathrm{A} \mathrm{X}]\) with unknown \(\mathrm{X}\), solve for sparse/other conditions on \(\mathrm{X}\)
Graph Matching
\([\mathrm{A} = \mathrm{X} \mathrm{B} \mathrm{X}^{\top}]\) with unknown \(\mathrm{X}\) and \(\mathrm{B}\), solve for \(\mathrm{B}\) and \(\mathrm{X}\) a permutation
NMF
\([\mathrm{A} = \mathrm{D} \mathrm{X}]\) with unknown \(\mathrm{D}\) and \(\mathrm{X}\), solve for elements of \(\mathrm{D}\),\(\mathrm{X}\) positive
Generalized Matrix Factorization
\([\mathrm{W}.*\mathrm{L} - \mathrm{W}.*\mathrm{U} \mathrm{V}']\) with \(\mathrm{W}\) a known mask, \(\mathrm{U}\), \(\mathrm{V}\) unknowns; solve for \(\mathrm{U}\), \(\mathrm{V}\) and \(\mathrm{L}\) lowest rank possible
Matrix Completion
\([\mathrm{A} = \mathrm{H}.*\mathrm{L}]\) with \(\mathrm{H}\) a known mask, \(\mathrm{L}\) unknown; solve for \(\mathrm{L}\) lowest rank possible
Stable Principal Component Pursuit (SPCP) / Noisy Robust PCA
\([\mathrm{A} = \mathrm{L} + \mathrm{S} + \mathrm{N}]\) with \(\mathrm{L}\), \(\mathrm{S}\), \(\mathrm{N}\) unknown, solve for \(\mathrm{L}\) low rank, \(\mathrm{S}\) sparse, \(\mathrm{N}\) noise
Robust PCA
\([\mathrm{A} = \mathrm{L} + \mathrm{S}]\) with \(\mathrm{L}\), \(\mathrm{S}\) unknown, solve for \(\mathrm{L}\) low rank, \(\mathrm{S}\) sparse
Sparse PCA
\([\mathrm{A} = \mathrm{D} \mathrm{X} ]\) with unknown \(\mathrm{D}\) and \(\mathrm{X}\), solve for sparse \(\mathrm{D}\)
Dictionary Learning
\([\mathrm{A} = \mathrm{D} \mathrm{X}]\) with unknown \(\mathrm{D}\) and \(\mathrm{X}\), solve for sparse \(\mathrm{X}\)
Archetypal Analysis
\([\mathrm{A} = \mathrm{D} \mathrm{X}]\) with unknown \(\mathrm{D}\) and \(\mathrm{X}\), solve for \(\mathrm{D}= \mathrm{A} \mathrm{B}\) with \(\mathrm{D}\), \(\mathrm{B}\) positive
Matrix Compressive Sensing (MCS)
find a rank-\(r\) matrix \(\mathrm{L}\) such that \([\mathrm{A}(\mathrm{L}) \approx b]\) or \([\mathrm{A}(\mathrm{L}+\mathrm{S}) = b]\)
Multiple Measurement Vector (MMV)
\([\mathrm{Y} = \mathrm{A} \mathrm{X}]\) with unknown \(\mathrm{X}\) and rows of \(\mathrm{X}\) are sparse
Compressed sensing
\([\mathrm{Y} = \mathrm{A} \mathrm{X}]\) with unknown \(\mathrm{X}\), rows of \(\mathrm{X}\) sparse, and \(\mathrm{X}\) a single column
Blind Source Separation (BSS)
\([\mathrm{Y} = \mathrm{A} \mathrm{X}]\) with unknown \(\mathrm{A}\) and \(\mathrm{X}\) and statistical independence between columns of \(\mathrm{X}\) or subspaces of columns of \(\mathrm{X}\)
Partial and Online SVD/PCA
…
Tensor Decomposition
Many, many options; see tensor decompositions for some tractable ones.

Truncated classic PCA is clearly also an example, but it is excluded from the list for some reason. Boringness? The fact that it's a special case of Sparse PCA?

I also add

Square root
\(\mathrm{Y} = \mathrm{X}\mathrm{X}^{\top}\) for \(\mathrm{Y}\in\mathbb{R}^{N\times N}, \mathrm{X}\in\mathbb{R}^{N\times n}\), with (typically) \(n<N\).

That is a whole thing; see matrix square root.
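For positive semi-definite \(\mathrm{Y}\), a truncated eigendecomposition gives one such factor; a minimal numpy sketch (`psd_sqrt_factor` is a made-up name):

```python
import numpy as np

def psd_sqrt_factor(Y, n):
    """Rank-n factor X with Y ≈ X X^T, for PSD Y, via truncated eigendecomposition."""
    w, V = np.linalg.eigh(Y)            # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:n]       # keep the n largest
    return V[:, top] * np.sqrt(np.clip(w[top], 0.0, None))
```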

See also learning on manifolds, compressed sensing, optimisation, random linear algebra, clustering, penalised regression…

Tutorials

Non-negative matrix factorisations

See non-negative matrix factorisations.

Why is approximate factorisation ever useful?

For certain types of data matrix, here is a suggestive observation: Udell and Townsend (2019) ask "Why Are Big Data Matrices Approximately Low Rank?"

Matrices of (approximate) low rank are pervasive in data science, appearing in movie preferences, text documents, survey data, medical records, and genomics. While there is a vast literature on how to exploit low rank structure in these datasets, there is less attention paid to explaining why the low rank structure appears in the first place. Here, we explain the effectiveness of low rank models in data science by considering a simple generative model for these matrices: we suppose that each row or column is associated to a (possibly high dimensional) bounded latent variable, and entries of the matrix are generated by applying a piecewise analytic function to these latent variables. These matrices are in general full rank. However, we show that we can approximate every entry of an \(m\times n\) matrix drawn from this model to within a fixed absolute error by a low rank matrix whose rank grows as \(\mathcal{O}(\log(m+n))\). Hence any sufficiently large matrix from such a latent variable model can be approximated, up to a small entrywise error, by a low rank matrix.

Ethan Epperly argues from a function-approximation perspective (e.g.) that we can deduce this property from the smoothness of functions.
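A toy numpy illustration of the Udell–Townsend setting: entrywise smooth functions of low-dimensional latent variables yield matrices that are technically full rank but have rapidly decaying singular values. (The tanh link and the latent dimension 3 are arbitrary choices for this demo.)

```python
import numpy as np

rng = np.random.default_rng(0)
m = n = 1000
u = rng.uniform(size=(m, 3))       # row latent variables
v = rng.uniform(size=(n, 3))       # column latent variables
A = np.tanh(u @ v.T + 1.0)         # entrywise analytic function of the latents
s = np.linalg.svd(A, compute_uv=False)
print(s[:10] / s[0])               # rapid decay: a handful of terms dominate
```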

Saul (2023) connects non-negative matrix factorisation to geometric algebra and linear algebra via deep learning and kernels. That sounds like fun.

As regression

Total least squares (a.k.a. orthogonal distance regression, or errors-in-variables least-squares linear regression) is secretly a low-rank matrix approximation: it minimises the Frobenius norm of the correction applied to the data matrix. Who knew?
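The SVD recipe makes the connection plain: total least squares finds the nearest (in Frobenius norm) rank-deficient version of the stacked data matrix \([\mathrm{X}\;\boldsymbol{y}]\). A minimal sketch, assuming the smallest singular value is simple and the fit is identifiable:

```python
import numpy as np

def tls(X, y):
    """Total least squares fit y ≈ X b, via the SVD of the stacked matrix [X y]."""
    Z = np.column_stack([X, y])
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    v = Vt[-1]              # right singular vector of the smallest singular value
    return -v[:-1] / v[-1]  # errors-in-variables regression coefficients
```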

Various other dimensionality reduction techniques can be put in a regression framing, notably exponential-family PCA.

Sketching

"Sketching" is a common term for a certain type of low-rank factorisation, although I am not sure which types it covers precisely. 🏗

(Martinsson 2016) mentions CUR and interpolative decompositions. What are those?

Randomization

Rather than finding an optimal solution, why not choose a random one that might be good enough? There are indeed randomised versions of many of these factorisations, typically built on low-dimensional random projections.
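The flavour of these methods, in a hedged numpy sketch of the randomised range-finder SVD in the spirit of Halko, Martinsson, and Tropp (2010); the parameter names are mine:

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, power_iters=2, seed=0):
    """Approximate truncated SVD via a random projection onto the range of A."""
    rng = np.random.default_rng(seed)
    k = rank + oversample
    Y = A @ rng.standard_normal((A.shape[1], k))  # sketch the column space
    for _ in range(power_iters):                  # power iterations sharpen decay
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                        # orthonormal basis for range(A)
    B = Q.T @ A                                   # small k x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]
```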

Connections to kernel learning

See (Grosse et al. 2012) for a mind-melting compositional matrix factorization diagram, constructing a search over hierarchical kernel decompositions that also turn out to have some matrix factorisation interpretations.

Exploiting compositionality to explore a large space of model structures

Bayesian

Nakajima and Sugiyama (2012):

Mnih and Salakhutdinov (2008) proposed a Bayesian maximum a posteriori (MAP) method based on the Gaussian noise model and Gaussian priors on the decomposed matrices. This method actually corresponds to minimizing the squared-loss with the trace-norm penalty (Srebro, Rennie, and Jaakkola 2004). Recently, the variational Bayesian (VB) approach (Attias 1999) has been applied to MF (Lim and Teh 2007; Raiko, Ilin, and Karhunen 2007), which we refer to as VBMF. The VBMF method was shown to perform very well in experiments. However, its good performance was not completely understood beyond its experimental success.

☜ Insert further developments here. Possibly Brouwer's thesis (Brouwer 2017) or Shakir Mohamed's (Mohamed 2011) would be a good start, or Benjamin Drave's tutorial, Probabilistic Matrix Factorization, and Xinghao Ding, Lihan He, and Carin (2011).

I am currently sitting in a seminar by He Zhao on Bayesian matrix factorisation, wherein he is building up this tool for discrete data, which is an interesting case. He starts from M. Zhou et al. (2012) and builds up to Zhao et al. (2018), introducing some hierarchical descriptions along the way. His methods seem to be sampling-based rather than variational (?).

Generalized² Linear² models (Gordon 2002) unify nonlinear matrix factorisations with generalized linear models. I had not heard of them until recently; I wonder how common they are?

Lanczos decomposition

Lanczos decomposition is a handy approximation for matrices which are cheap to multiply because of some structure, but expensive to store. It can also be used to calculate an approximate inverse cheaply.

I learnt this trick from the Gaussian process literature, in the context of Lanczos Variance Estimates (LOVE) (Pleiss et al. 2018), although I believe it exists elsewhere.

Given some rank \(k\) and an arbitrary starting vector \(\boldsymbol{b}\), the Lanczos algorithm iteratively approximates \(\mathrm{K} \in\mathbb{R}^{n \times n}\) by a low-rank factorisation \(\mathrm{K}\approx \mathrm{Q} \mathrm{T} \mathrm{Q}^{\top}\), where \(\mathrm{T} \in \mathbb{R}^{k \times k}\) is tridiagonal and \(\mathrm{Q} \in \mathbb{R}^{n \times k}\) has orthogonal columns. Crucially, we do not need to form \(\mathrm{K}\) to evaluate matrix-vector products \(\mathrm{K}\boldsymbol{b}\) for an arbitrary vector \(\boldsymbol{b}\). Moreover, given a Lanczos approximand \(\mathrm{Q},\mathrm{T}\) we may estimate \[\begin{align*} \mathrm{K}^{-1}\boldsymbol{c}\approx \mathrm{Q}\mathrm{T}^{-1}\mathrm{Q}^{\top}\boldsymbol{c} \end{align*}\] even for \(\boldsymbol{b}\neq\boldsymbol{c}\). Say we wish to calculate \(\left(\mathrm{Z} \mathrm{Z}^{\top}+\sigma^2 \mathrm{I}\right)^{-1}\mathrm{B}\), with \(\mathrm{Z}\in\mathbb{R}^{D\times N }\) and \(N\ll D\).

We approximate the solution to this linear system using the partial Lanczos decomposition starting with probe vector \(\boldsymbol{b}=\overline{\mathrm{B}}\) and \(\mathrm{K}=\left(\mathrm{Z} \mathrm{Z}^{\top}+\sigma^2 \mathrm{I}\right)\).

This requires \(k\) matrix-vector products of the form \[\begin{align*} \underbrace{\left(\underbrace{\mathrm{Z} \mathrm{Z}^{\top}}_{\mathcal{O}(ND^2)}+\sigma^2 \mathrm{I}\right)\boldsymbol{b}}_{\mathcal{O}(D^2)} =\underbrace{\mathrm{Z} \underbrace{(\mathrm{Z}^{\top}\boldsymbol{b})}_{\mathcal{O}(ND)}}_{\mathcal{O}(ND)} +\sigma^2 \boldsymbol{b}. \end{align*}\] Using the latter representation, the required matrix-vector product may be found with a time complexity of \(\mathcal{O}(ND)\). Space complexity is also \(\mathcal{O}(ND)\). The output of the Lanczos decomposition is \(\mathrm{Q},\mathrm{T}\) such that \(\left(\mathrm{Z}\mathrm{Z}^{\top} +\sigma^2 \mathrm{I}\right)\boldsymbol{b}\approx \mathrm{Q} \mathrm{T} \mathrm{Q}^{\top}\boldsymbol{b}\). Then the solution to the inverse-matrix-vector product may be approximated by \(\left(\mathrm{Z} \mathrm{Z}^{\top} +\sigma^2 \mathrm{I}\right)^{-1} \mathrm{B}\approx \mathrm{Q}\mathrm{T}^{-1}\mathrm{Q}^{\top}\mathrm{B}\), requiring the solution in \(\mathrm{X}\) of the much smaller linear system \(\mathrm{X}\mathrm{T}=\mathrm{Q}\). Exploiting the positive-definiteness of \(\mathrm{T}\) we may use the Cholesky decomposition \(\mathrm{T}=\mathrm{L}^{\top}\mathrm{L}\) for a constant-factor speedup over solving an arbitrary linear system. The time cost of the solution is \(\mathcal{O}(Dk^3)\), for an overall cost of the matrix inversions of \(\mathcal{O}(NDk+Dk^3)\).
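A self-contained numpy sketch of that pipeline, under the assumptions above (the `lanczos` helper is mine; production code would worry much more about breakdown and reorthogonalisation):

```python
import numpy as np

def lanczos(matvec, b, k):
    """Partial Lanczos: Q with orthonormal columns and tridiagonal T such that
    K ≈ Q T Q^T, touching K only through matrix-vector products."""
    n = b.shape[0]
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(k - 1)
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = matvec(Q[:, j])
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]
        alpha[j] = Q[:, j] @ w
        w -= alpha[j] * Q[:, j]
        w -= Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)  # full reorthogonalisation
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return Q, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

# Approximate (Z Z^T + s2 I)^{-1} c without ever forming the D x D matrix.
D, N, k = 2000, 50, 30
rng = np.random.default_rng(0)
Z = rng.standard_normal((D, N))
s2 = 0.5
mv = lambda v: Z @ (Z.T @ v) + s2 * v      # O(ND) per product
c = rng.standard_normal(D)
Q, T = lanczos(mv, c, k)
x = Q @ np.linalg.solve(T, Q.T @ c)        # ≈ (Z Z^T + s2 I)^{-1} c
```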

Estimating rank

Overview – imate Manual

Eigencount and Numerical Rank

If \(f: \lambda \mapsto \mathbf{1}_{[a, b]}(\lambda)\) is the indicator (step) function on the interval \([a, b]\), then \(\operatorname{trace}(f(\mathbf{A}))\) estimates the number of eigenvalues of \(\mathbf{A}\) in that interval, which gives an inexpensive method to estimate the rank of a large matrix. Eigencount is closely related to Principal Component Analysis (PCA) and low-rank approximations in machine learning.
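A small numpy demonstration of the identity behind this (large-scale codes such as imate use stochastic Lanczos quadrature; here I cheat with a dense eigendecomposition purely to illustrate):

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.standard_normal((300, 20))
A = Z @ Z.T                                   # numerically rank 20, size 300

# f(A) for the indicator f of [a, b], via a dense eigendecomposition (demo only)
a, b = 1e-3, np.inf
w, V = np.linalg.eigh(A)
fA = V @ np.diag(((w > a) & (w < b)).astype(float)) @ V.T

# Hutchinson estimator: trace(f(A)) ≈ mean over sign probes v of v^T f(A) v
probes = rng.choice([-1.0, 1.0], size=(300, 64))
print(np.mean(np.sum(probes * (fA @ probes), axis=0)))  # ≈ 20, the numerical rank
```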

Incremental decompositions

Cholesky
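The classic trick here is the rank-one update (Seeger 2004): given \(\mathrm{L}\) with \(\mathrm{A}=\mathrm{L}\mathrm{L}^{\top}\), obtain the factor of \(\mathrm{A}+\boldsymbol{x}\boldsymbol{x}^{\top}\) in \(\mathcal{O}(n^2)\) rather than refactorising from scratch in \(\mathcal{O}(n^3)\). A textbook-style numpy sketch:

```python
import numpy as np

def chol_update(L, x):
    """Return L' with L' L'^T = L L^T + x x^T, for lower-triangular L. O(n^2)."""
    L, x = L.copy(), x.copy()
    n = x.size
    for k in range(n):
        r = np.hypot(L[k, k], x[k])           # Givens-style rotation
        c, s = r / L[k, k], x[k] / L[k, k]
        L[k, k] = r
        if k + 1 < n:
            L[k+1:, k] = (L[k+1:, k] + s * x[k+1:]) / c
            x[k+1:] = c * x[k+1:] - s * L[k+1:, k]
    return L
```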

Low rank plus diagonal

Specifically, \(\mathrm{K}=\mathrm{Z} \mathrm{Z}^{\top}+\sigma^2\mathrm{I}\) where \(\mathrm{K}\in\mathbb{R}^{N\times N}\) and \(\mathrm{Z}\in\mathbb{R}^{N\times D}\) with \(D\ll N\). A workhorse.

Lots of fun tricks.

See low rank plus diagonal matrices.
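One of those fun tricks, as a sketch: the Woodbury identity solves against \(\mathrm{K}\) in \(\mathcal{O}(ND^2)\) without ever forming the \(N\times N\) matrix (`solve_low_rank_plus_diag` is my name, not an established API):

```python
import numpy as np

def solve_low_rank_plus_diag(Z, sigma2, B):
    """Solve (Z Z^T + sigma2 I) X = B via Woodbury, for Z of shape (N, D), D << N."""
    D = Z.shape[1]
    small = sigma2 * np.eye(D) + Z.T @ Z          # D x D "capacitance" matrix
    return (B - Z @ np.linalg.solve(small, Z.T @ B)) / sigma2

# Sanity check against a dense solve on a small instance
rng = np.random.default_rng(0)
Z, B = rng.standard_normal((500, 10)), rng.standard_normal((500, 3))
K = Z @ Z.T + 0.3 * np.eye(500)
assert np.allclose(solve_low_rank_plus_diag(Z, 0.3, B), np.linalg.solve(K, B))
```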

As an optimisation problem

There are some generalised optimisation problems which look useful for matrix factorisation, e.g. Bhardwaj, Klep, and Magron (2021):

Polynomial optimization problems (POP) are prevalent in many areas of modern science and engineering. The goal of POP is to minimize a given polynomial over a set defined by finitely many polynomial inequalities, a semialgebraic set. This problem is well known to be NP-hard, and has motivated research for more practical methods to obtain approximate solutions with high accuracy.[…]

One can naturally extend the ideas of positivity and sums of squares to the noncommutative (nc) setting by replacing the commutative variables \(z_1, \dots , z_n\) with noncommuting letters \(x_1, \dots , x_n\). The extension to the noncommutative setting is an inevitable consequence of the many areas of science which regularly optimize functions with noncommuting variables, such as matrices or operators. For instance in control theory, matrix completion, quantum information theory, or quantum chemistry

Matrix calculus can sometimes help.

\(\mathcal{H}\)-matrix methods

It seems like low-rank matrix factorisation should be related to \(\mathcal{H}\)-matrix methods, but I do not know enough to say more.

See hmatrix.org for one lab's backgrounder and their implementation, h2lib; see hlibpro for a black-box closed-source one.

Tools

In pytorch, various operations are made easier with cornellius-gp/linear_operator.

Ameli's tools, e.g. the imate package mentioned above under rank estimation.

NMF Toolbox (MATLAB and Python):

Nonnegative matrix factorization (NMF) is a family of methods widely used for information retrieval across domains including text, images, and audio. Within music processing, NMF has been used for tasks such as transcription, source separation, and structure analysis. Prior work has shown that initialization and constrained update rules can drastically improve the chances of NMF converging to a musically meaningful solution. Along these lines we present the NMF toolbox, containing MATLAB and Python implementations of conceptually distinct NMF variants; in particular, this paper gives an overview of two algorithms. The first variant, called nonnegative matrix factor deconvolution (NMFD), extends the original NMF algorithm to the convolutive case, enforcing the temporal order of spectral templates. The second variant, called diagonal NMF, supports the development of sparse diagonal structures in the activation matrix. Our toolbox contains several demo applications and code examples to illustrate its potential and functionality. By providing MATLAB and Python code on a documentation website under a GNU-GPL license, as well as including illustrative examples, our aim is to foster research and education in the field of music processing.

Vowpal Wabbit factors matrices, e.g. for recommender systems. It seems the --lrq (low-rank quadratic) version is more favoured.

HPC matrix factorisation for MATLAB, R, Python, C++: libpmf:

LIBPMF implements the CCD++ algorithm, which aims to solve large-scale matrix factorization problems such as the low-rank factorization problems for recommender systems.

SPAMS (C++/MATLAB/Python) includes some matrix factorisations in its sparse approximation toolbox (see optimisation).

scikit-learn (Python) does a few matrix factorisations in its inimitable batteries-in-the-kitchen-sink way.
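For example, a taste of its stock NMF estimator (the sizes here are arbitrary):

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((100, 40)) @ rng.random((40, 20))  # nonnegative data matrix

model = NMF(n_components=5, init="nndsvda", max_iter=500)
W = model.fit_transform(X)        # (100, 5) nonnegative factor
H = model.components_             # (5, 20) nonnegative factor
print(model.reconstruction_err_)  # Frobenius error of W @ H against X
```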

nimfa

… is a Python library for nonnegative matrix factorization. It includes implementations of several factorization methods, initialization approaches, and quality scoring. Both dense and sparse matrix representation are supported.

Tapkee (C++). Pro-tip: even without coding C++, tapkee does a long list of dimensionality reductions from the CLI.

  • PCA and randomized PCA
  • Kernel PCA (kPCA)
  • Random projection
  • Factor analysis

tensorly supports some interesting tensor decompositions.
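For instance, a CP/PARAFAC decomposition of a synthetic rank-2 tensor (tensor sizes and rank are arbitrary here; the API is tensorly's, as I understand it):

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
factors = [rng.standard_normal((dim, 2)) for dim in (10, 11, 12)]
X = np.einsum("ir,jr,kr->ijk", *factors)      # exact rank-2 three-way tensor

cp = parafac(tl.tensor(X), rank=2)            # recover a CP decomposition
X_hat = tl.cp_to_tensor(cp)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))  # ≈ 0
```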

References

Aarabi, Hadrien Foroughmand, and Geoffroy Peeters. 2018. "Music Retiler: Using NMF2D Source Separation for Audio Mosaicing." In Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion, 27:1–7. AM'18. New York, NY, USA: ACM.
Abdallah, Samer A., and Mark D. Plumbley. 2004. "Polyphonic Music Transcription by Non-Negative Sparse Coding of Power Spectra." In.
Achlioptas, Dimitris. 2003. "Database-Friendly Random Projections: Johnson-Lindenstrauss with Binary Coins." Journal of Computer and System Sciences, Special Issue on PODS 2001, 66 (4): 671–87.
Aghasi, Alireza, Nam Nguyen, and Justin Romberg. 2016. "Net-Trim: A Layer-Wise Convex Pruning of Deep Neural Networks." arXiv:1611.05162 [Cs, Stat], November.
Akimoto, Youhei. 2017. "Fast Eigen Decomposition for Low-Rank Matrix Approximation." arXiv.
Ameli, Siavash, and Shawn C. Shadden. 2023. "A Singular Woodbury and Pseudo-Determinant Matrix Identities and Application to Gaussian Process Regression." Applied Mathematics and Computation 452 (April): 128032.
Ang, Andersen Man Shun, and Nicolas Gillis. 2018. "Accelerating Nonnegative Matrix Factorization Algorithms Using Extrapolation." Neural Computation 31 (2): 417–39.
Arora, Sanjeev, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, and Michael Zhu. 2012. "A Practical Algorithm for Topic Modeling with Provable Guarantees." arXiv:1212.4777 [Cs, Stat], December.
Attias, Hagai. 1999. "Inferring Parameters and Structure of Latent Variable Models by Variational Bayes." In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, 21–30. UAI'99. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Babacan, S. Derin, Martin Luessi, Rafael Molina, and Aggelos K. Katsaggelos. 2012. "Sparse Bayesian Methods for Low-Rank Matrix Estimation." IEEE Transactions on Signal Processing 60 (8): 3964–77.
Bach, C., D. Ceglia, L. Song, and F. Duddeck. 2019. "Randomized Low-Rank Approximation Methods for Projection-Based Model Order Reduction of Large Nonlinear Dynamical Problems." International Journal for Numerical Methods in Engineering 118 (4): 209–41.
Bach, Francis. 2013. "Convex Relaxations of Structured Matrix Factorizations." arXiv:1309.3117 [Cs, Math], September.
Bach, Francis R. 2013. "Sharp Analysis of Low-Rank Kernel Matrix Approximations." In COLT, 30:185–209.
Bach, Francis R, and Michael I Jordan. 2002. "Kernel Independent Component Analysis." Journal of Machine Learning Research 3 (July): 48.
Bach, Francis, Rodolphe Jenatton, and Julien Mairal. 2011. Optimization With Sparsity-Inducing Penalties. Foundations and Trends® in Machine Learning 1.0. Now Publishers Inc.
Bagge Carlson, Fredrik. 2018. "Machine Learning and System Identification for Estimation in Physical Systems." Thesis/docmono, Lund University.
Barbier, Jean, Nicolas Macris, and Léo Miolane. 2017. "The Layered Structure of Tensor Estimation and Its Mutual Information." arXiv:1709.10368 [Cond-Mat, Physics:math-Ph], September.
Bardenet, Rémi, and Odalric-Ambrym Maillard. 2015. "A Note on Replacing Uniform Subsampling by Random Projections in MCMC for Linear Regression of Tall Datasets."
Batson, Joshua, Daniel A. Spielman, and Nikhil Srivastava. 2008. "Twice-Ramanujan Sparsifiers." arXiv:0808.0163 [Cs], August.
Bauckhage, Christian. 2015. "K-Means Clustering Is Matrix Factorization." arXiv:1512.07548 [Stat], December.
Berry, Michael W., Murray Browne, Amy N. Langville, V. Paul Pauca, and Robert J. Plemmons. 2007. "Algorithms and Applications for Approximate Nonnegative Matrix Factorization." Computational Statistics & Data Analysis 52 (1): 155–73.
Bertin, N., R. Badeau, and E. Vincent. 2010. "Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription." IEEE Transactions on Audio, Speech, and Language Processing 18 (3): 538–49.
Bhardwaj, Abhishek, Igor Klep, and Victor Magron. 2021. "Noncommutative Polynomial Optimization." arXiv.
Brand, Matthew. 2002. "Incremental Singular Value Decomposition of Uncertain Data with Missing Values." In Computer Vision – ECCV 2002, edited by Anders Heyden, Gunnar Sparr, Mads Nielsen, and Peter Johansen, 2350:707–20. Berlin, Heidelberg: Springer Berlin Heidelberg.
———. 2006. "Fast Low-Rank Modifications of the Thin Singular Value Decomposition." Linear Algebra and Its Applications, Special Issue on Large Scale Linear and Nonlinear Eigenvalue Problems, 415 (1): 20–30.
Brouwer, Thomas Alexander. 2017. "Bayesian Matrix Factorisation: Inference, Priors, and Data Integration."
Bruckstein, A. M., Michael Elad, and M. Zibulevsky. 2008a. "Sparse Non-Negative Solution of a Linear System of Equations Is Unique." In 3rd International Symposium on Communications, Control and Signal Processing, 2008. ISCCSP 2008, 762–67.
———. 2008b. "On the Uniqueness of Nonnegative Sparse Solutions to Underdetermined Systems of Equations." IEEE Transactions on Information Theory 54 (11): 4813–20.
Buch, Michael, Elio Quinton, and Bob L Sturm. 2017. "NichtnegativeMatrixFaktorisierungnutzendesKlangsynthesenSystem (NiMFKS): Extensions of NMF-Based Concatenative Sound Synthesis." In Proceedings of the 20th International Conference on Digital Audio Effects, 7. Edinburgh.
Bunch, James R., and Christopher P. Nielsen. 1978. "Updating the Singular Value Decomposition." Numerische Mathematik 31 (2): 111–29.
Caetano, Marcelo, and Xavier Rodet. 2013. "Musical Instrument Sound Morphing Guided by Perceptually Motivated Features." IEEE Transactions on Audio, Speech, and Language Processing 21 (8): 1666–75.
Cao, Bin, Dou Shen, Jian-Tao Sun, Xuanhui Wang, Qiang Yang, and Zheng Chen. 2007. "Detect and Track Latent Factors with Online Nonnegative Matrix Factorization." In Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2689–94. IJCAI'07. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Carabias-Orti, J. J., T. Virtanen, P. Vera-Candeas, N. Ruiz-Reyes, and F. J. Canadas-Quesada. 2011. "Musical Instrument Sound Multi-Excitation Model for Non-Negative Spectrogram Factorization." IEEE Journal of Selected Topics in Signal Processing 5 (6): 1144–58.
Chen, Yudong, and Yuejie Chi. 2018. "Harnessing Structures in Big Data via Guaranteed Low-Rank Matrix Estimation: Recent Theory and Fast Algorithms via Convex and Nonconvex Optimization." IEEE Signal Processing Magazine 35 (4): 14–31.
Chi, Yuejie, Yue M. Lu, and Yuxin Chen. 2019. "Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview." IEEE Transactions on Signal Processing 67 (20): 5239–69.
Cichocki, A., N. Lee, I. V. Oseledets, A.-H. Phan, Q. Zhao, and D. Mandic. 2016. "Low-Rank Tensor Networks for Dimensionality Reduction and Large-Scale Optimization Problems: Perspectives and Challenges PART 1." arXiv:1609.00893 [Cs], September.
Cichocki, A., R. Zdunek, and S. Amari. 2006. "New Algorithms for Non-Negative Matrix Factorization in Applications to Blind Source Separation." In 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 5:V–.
Cohen, Albert, Ingrid Daubechies, and Jean-Christophe Feauveau. 1992. "Biorthogonal Bases of Compactly Supported Wavelets." Communications on Pure and Applied Mathematics 45 (5): 485–560.
Combettes, Patrick L., and Jean-Christophe Pesquet. 2008. "A Proximal Decomposition Method for Solving Convex Variational Inverse Problems." Inverse Problems 24 (6): 065014.
Dasarathy, Gautam, Parikshit Shah, Badri Narayan Bhaskar, and Robert Nowak. 2013. "Sketching Sparse Matrices." arXiv:1303.6544 [Cs, Math], March.
Dasgupta, Sanjoy, and Anupam Gupta. 2003. "An Elementary Proof of a Theorem of Johnson and Lindenstrauss." Random Structures & Algorithms 22 (1): 60–65.
Davis, Chandler. 1963. "The Rotation of Eigenvectors by a Perturbation." Journal of Mathematical Analysis and Applications 6 (2): 159–73.
Defferrard, Michaël, Xavier Bresson, and Pierre Vandergheynst. 2016. "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering." In Advances In Neural Information Processing Systems.
Del Moral, Pierre, and Angele Niclas. 2018. "A Taylor Expansion of the Square Root Matrix Functional." arXiv.
Desai, A., M. Ghashami, and J. M. Phillips. 2016. "Improved Practical Matrix Sketching with Guarantees." IEEE Transactions on Knowledge and Data Engineering 28 (7): 1678–90.
Devarajan, Karthik. 2008. "Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology." PLoS Comput Biol 4 (7): e1000029.
Ding, C., X. He, and H. Simon. 2005. "On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering." In Proceedings of the 2005 SIAM International Conference on Data Mining, 606–10. Proceedings. Society for Industrial and Applied Mathematics.
Ding, C., Tao Li, and M.I. Jordan. 2010. "Convex and Semi-Nonnegative Matrix Factorizations." IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (1): 45–55.
Dokmanić, Ivan, and Rémi Gribonval. 2017. "Beyond Moore-Penrose Part II: The Sparse Pseudoinverse." arXiv:1706.08701 [Cs, Math], June.
Dolcetti, Alberto, and Donato Pertici. 2020. "Real Square Roots of Matrices: Differential Properties in Semi-Simple, Symmetric and Orthogonal Cases." arXiv.
Driedger, Jonathan, and Thomas Pratzlich. 2015. "Let It Bee – Towards NMF-Inspired Audio Mosaicing." In Proceedings of ISMIR, 7. Malaga.
Drineas, Petros, Ravi Kannan, and Michael W. Mahoney. 2006a. "Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix." SIAM Journal on Computing 36 (1): 158–83.
———. 2006b. "Fast Monte Carlo Algorithms for Matrices III: Computing a Compressed Approximate Matrix Decomposition." SIAM Journal on Computing 36 (1): 184–206.
Drineas, Petros, and Michael W. Mahoney. 2005. "On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning." Journal of Machine Learning Research 6 (December): 2153–75.
Dueck, Delbert, Quaid D. Morris, and Brendan J. Frey. 2005. "Multi-Way Clustering of Microarray Data Using Probabilistic Sparse Matrix Factorization." Bioinformatics 21 (suppl 1): i144–51.
Eaton, Morris L. 2007a. "Chapter 5: Matrix Factorizations and Jacobians." In Institute of Mathematical Statistics Lecture Notes - Monograph Series, 159–83. Beachwood, Ohio, USA: Institute of Mathematical Statistics.
———. 2007b. Multivariate statistics: a vector space approach. Lecture notes-monograph series / Institute of Mathematical Statistics 53. Beachwood, Ohio: Inst. of Mathematical Statistics.
Ellis, Robert L., and David C. Lay. 1992. "Factorization of Finite Rank Hankel and Toeplitz Matrices." Linear Algebra and Its Applications 173 (August): 19–38.
Fairbanks, James P., Ramakrishnan Kannan, Haesun Park, and David A. Bader. 2015. "Behavioral Clusters in Dynamic Graphs." Parallel Computing, Graph analysis for scientific discovery, 47 (August): 38–50.
Fasi, Massimiliano, Nicholas J. Higham, and Xiaobo Liu. 2023. "Computing the Square Root of a Low-Rank Perturbation of the Scaled Identity Matrix." SIAM Journal on Matrix Analysis and Applications 44 (1): 156–74.
Févotte, Cédric, Nancy Bertin, and Jean-Louis Durrieu. 2008. "Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis." Neural Computation 21 (3): 793–830.
Flammia, Steven T., David Gross, Yi-Kai Liu, and Jens Eisert. 2012. "Quantum Tomography via Compressed Sensing: Error Bounds, Sample Complexity, and Efficient Estimators." New Journal of Physics 14 (9): 095022.
Fung, Wai Shing, Ramesh Hariharan, Nicholas J.A. Harvey, and Debmalya Panigrahi. 2011. "A General Framework for Graph Sparsification." In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing, 71–80. STOC '11. New York, NY, USA: ACM.
Gemulla, Rainer, Erik Nijkamp, Peter J. Haas, and Yannis Sismanis. 2011. "Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent." In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 69–77. KDD '11. New York, NY, USA: ACM.
Ghashami, Mina, Edo Liberty, Jeff M. Phillips, and David P. Woodruff. 2015. "Frequent Directions: Simple and Deterministic Matrix Sketching." arXiv:1501.01711 [Cs], January.
Gordon, Geoffrey J. 2002. "Generalized² Linear² Models." In Proceedings of the 15th International Conference on Neural Information Processing Systems, 593–600. NIPS'02. Cambridge, MA, USA: MIT Press.
Gross, D. 2011. "Recovering Low-Rank Matrices From Few Coefficients in Any Basis." IEEE Transactions on Information Theory 57 (3): 1548–66.
Gross, David, Yi-Kai Liu, Steven T. Flammia, Stephen Becker, and Jens Eisert. 2010. "Quantum State Tomography via Compressed Sensing." Physical Review Letters 105 (15).
Grosse, Roger, Ruslan R. Salakhutdinov, William T. Freeman, and Joshua B. Tenenbaum. 2012. "Exploiting Compositionality to Explore a Large Space of Model Structures." In Proceedings of the Conference on Uncertainty in Artificial Intelligence.
Gu, Ming, and Stanley C. Eisenstat. 1993. "A Stable and Fast Algorithm for Updating the Singular Value Decomposition." Citeseer.
———. 1995. "Downdating the Singular Value Decomposition." SIAM Journal on Matrix Analysis and Applications 16 (3): 793–810.
Guan, Naiyang, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012. "NeNMF: An Optimal Gradient Method for Nonnegative Matrix Factorization." IEEE Transactions on Signal Processing 60 (6): 2882–98.
Guan, N., D. Tao, Z. Luo, and B. Yuan. 2012. "Online Nonnegative Matrix Factorization With Robust Stochastic Approximation." IEEE Transactions on Neural Networks and Learning Systems 23 (7): 1087–99.
Hackbusch, Wolfgang. 2015. Hierarchical Matrices: Algorithms and Analysis. 1st ed. Springer Series in Computational Mathematics 49. Heidelberg New York Dordrecht London: Springer Publishing Company, Incorporated.
Halko, Nathan, Per-Gunnar Martinsson, Yoel Shkolnisky, and Mark Tygert. 2011. "An Algorithm for the Principal Component Analysis of Large Data Sets." SIAM Journal on Scientific Computing 33 (5): 2580–94.
Halko, Nathan, Per-Gunnar Martinsson, and Joel A. Tropp. 2010. "Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions." arXiv.
Harbrecht, Helmut, Michael Peters, and Reinhold Schneider. 2012. "On the Low-Rank Approximation by the Pivoted Cholesky Decomposition." Applied Numerical Mathematics, Third Chilean Workshop on Numerical Analysis of Partial Differential Equations (WONAPDE 2010), 62 (4): 428–40.
Harville, David A. 1976. "Extension of the Gauss-Markov Theorem to Include the Estimation of Random Effects." The Annals of Statistics 4 (2): 384–95.
———. 1977. "Maximum Likelihood Approaches to Variance Component Estimation and to Related Problems." Journal of the American Statistical Association 72 (358): 320–38.
Hassanieh, Haitham, Piotr Indyk, Dina Katabi, and Eric Price. 2012. "Nearly Optimal Sparse Fourier Transform." In Proceedings of the Forty-Fourth Annual ACM Symposium on Theory of Computing, 563–78. STOC '12. New York, NY, USA: ACM.
Hassanieh, H., P. Indyk, D. Katabi, and E. Price. 2012. "Simple and Practical Algorithm for Sparse Fourier Transform." In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, 1183–94. Proceedings. Kyoto, Japan: Society for Industrial and Applied Mathematics.
Hastie, Trevor, Rahul Mazumder, Jason D. Lee, and Reza Zadeh. 2015. "Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares." In Journal of Machine Learning Research, 16:3367–3402.
Heinig, Georg, and Karla Rost. 2011. "Fast Algorithms for Toeplitz and Hankel Matrices." Linear Algebra and Its Applications 435 (1): 1–59.
Henderson, H. V., and S. R. Searle. 1981. "On Deriving the Inverse of a Sum of Matrices." SIAM Review 23 (1): 53–60.
Hoffman, Matthew D, David M Blei, and Perry R Cook. 2010. "Bayesian Nonparametric Matrix Factorization for Recorded Music." In International Conference on Machine Learning, 8.
Hoffman, Matthew, Francis R. Bach, and David M. Blei. 2010. "Online Learning for Latent Dirichlet Allocation." In Advances in Neural Information Processing Systems, 856–64.
Hoyer, P.O. 2002. "Non-Negative Sparse Coding." In Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing, 2002, 557–65.
Hsieh, Cho-Jui, and Inderjit S. Dhillon. 2011. "Fast Coordinate Descent Methods with Variable Selection for Non-Negative Matrix Factorization." In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1064–72. KDD '11. New York, NY, USA: ACM.
Hu, Tao, Cengiz Pehlevan, and Dmitri B. Chklovskii. 2014. "A Hebbian/Anti-Hebbian Network for Online Sparse Dictionary Learning Derived from Symmetric Matrix Factorization." In 2014 48th Asilomar Conference on Signals, Systems and Computers.
Huang, G., M. Kaess, and J. J. Leonard. 2013. "Consistent Sparsification for Graph Optimization." In 2013 European Conference on Mobile Robots (ECMR), 150–57.
Huynh, Viet, He Zhao, and Dinh Phung. 2020. "OTLDA: A Geometry-Aware Optimal Transport Approach for Topic Modeling." In Advances in Neural Information Processing Systems, 33:18573–82. Curran Associates, Inc.
Iliev, Filip L., Valentin G. Stanev, Velimir V. Vesselinov, and Boian S. Alexandrov. 2018. "Nonnegative Matrix Factorization for Identification of Unknown Number of Sources Emitting Delayed Signals." PLOS ONE 13 (3): e0193974.
Ji, S., D. Dunson, and L. Carin. 2009. "Multitask Compressive Sensing." IEEE Transactions on Signal Processing 57 (1): 92–106.
Kannan, Ramakrishnan. 2016. "Scalable and Distributed Constrained Low Rank Approximations," April.
Kannan, Ramakrishnan, Grey Ballard, and Haesun Park. 2016. "A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization." In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 9:1–11. PPoPP '16. New York, NY, USA: ACM.
Keriven, Nicolas, Anthony Bourrier, Rémi Gribonval, and Patrick Pérez. 2016. "Sketching for Large-Scale Learning of Mixture Models." In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6190–94.
Keshava, Nirmal. 2003. "A Survey of Spectral Unmixing Algorithms." Lincoln Laboratory Journal 14 (1): 55–78.
Khoromskij, B. N., A. Litvinenko, and H. G. Matthies. 2009. "Application of Hierarchical Matrices for Computing the Karhunen–Loève Expansion." Computing 84 (1-2): 49–67.
Kim, H., and H. Park. 2008. "Nonnegative Matrix Factorization Based on Alternating Nonnegativity Constrained Least Squares and Active Set Method." SIAM Journal on Matrix Analysis and Applications 30 (2): 713–30.
Koren, Yehuda, Robert Bell, and Chris Volinsky. 2009. "Matrix Factorization Techniques for Recommender Systems." Computer 42 (8): 30–37.
Koutis, Ioannis, Gary L. Miller, and Richard Peng. 2012. "A Fast Solver for a Class of Linear Systems." Communications of the ACM 55 (10): 99–107.
Kruskal, J. B. 1964. "Nonmetric Multidimensional Scaling: A Numerical Method." Psychometrika 29 (2): 115–29.
Kumar, N. Kishore, and Jan Shneider. 2016. "Literature Survey on Low Rank Approximation of Matrices." arXiv:1606.06511 [Cs, Math], June.
Lahiri, Subhaneil, Peiran Gao, and Surya Ganguli. 2016. "Random Projections of Random Manifolds." arXiv:1607.04331 [Cs, q-Bio, Stat], July.
Lawrence, Neil D., and Raquel Urtasun. 2009. "Non-Linear Matrix Factorization with Gaussian Processes." In Proceedings of the 26th Annual International Conference on Machine Learning, 601–8. ICML '09. New York, NY, USA: ACM.
Lee, Daniel D., and H. Sebastian Seung. 1999. "Learning the Parts of Objects by Non-Negative Matrix Factorization." Nature 401 (6755): 788–91.
———. 2001. "Algorithms for Non-Negative Matrix Factorization." In Advances in Neural Information Processing Systems 13, edited by T. K. Leen, T. G. Dietterich, and V. Tresp, 556–62. MIT Press.
Li, Chi-Kwong, and Edward Poon. 2002. "Additive Decomposition of Real Matrices." Linear and Multilinear Algebra 50 (4): 321–26.
Li, S.Z., XinWen Hou, HongJiang Zhang, and Qiansheng Cheng. 2001. "Learning Spatially Localized, Parts-Based Representation." In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. CVPR 2001, 1:I-207-I-212 vol.1.
Liberty, Edo. 2013. "Simple and Deterministic Matrix Sketching." In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 581–88. KDD '13. New York, NY, USA: ACM.
Liberty, Edo, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert. 2007. "Randomized Algorithms for the Low-Rank Approximation of Matrices." Proceedings of the National Academy of Sciences 104 (51): 20167–72.
Lim, Yew Jin, and Yee Whye Teh. 2007. "Variational Bayesian Approach to Movie Rating Prediction." In Proceedings of KDD Cup and Workshop, 7:15–21. Citeseer.
Lin, Chih-Jen. 2007. "Projected Gradient Methods for Nonnegative Matrix Factorization." Neural Computation 19 (10): 2756–79.
Lin, Zhouchen. 2016. "A Review on Low-Rank Models in Signal and Data Analysis."
Liu, Tongliang, Dacheng Tao, and Dong Xu. 2016. "Dimensionality-Dependent Generalization Bounds for \(k\)-Dimensional Coding Schemes." arXiv:1601.00238 [Cs, Stat], January.
Liu, T., and D. Tao. 2015. "On the Performance of Manhattan Nonnegative Matrix Factorization." IEEE Transactions on Neural Networks and Learning Systems PP (99): 1–1.
López-Serrano, Patricio, Christian Dittmar, Yigitcan Özer, and Meinard Müller. 2019. "NMF Toolbox: Music Processing Applications of Nonnegative Matrix Factorization." In.
Mahoney, Michael W. 2010. Randomized Algorithms for Matrices and Data. Vol. 3.
Mailhé, Boris, Rémi Gribonval, Pierre Vandergheynst, and Frédéric Bimbot. 2011. "Fast Orthogonal Sparse Approximation Algorithms over Local Dictionaries." Signal Processing, Advances in Multirate Filter Bank Structures and Multiscale Representations, 91 (12): 2822–35.
Mairal, Julien, Francis Bach, and Jean Ponce. 2014. Sparse Modeling for Image and Vision Processing. Vol. 8.
Mairal, Julien, Francis Bach, Jean Ponce, and Guillermo Sapiro. 2009. "Online Dictionary Learning for Sparse Coding." In Proceedings of the 26th Annual International Conference on Machine Learning, 689–96. ICML '09. New York, NY, USA: ACM.
———. 2010. "Online Learning for Matrix Factorization and Sparse Coding." The Journal of Machine Learning Research 11: 19–60.
Martinsson, Per-Gunnar. 2016. "Randomized Methods for Matrix Computations and Analysis of High Dimensional Data." arXiv:1607.01649 [Math], July.
Martinsson, Per-Gunnar, Vladimir Rockhlin, and Mark Tygert. 2006. "A Randomized Algorithm for the Approximation of Matrices." DTIC Document.
Mensch, Arthur, Julien Mairal, Bertrand Thirion, and Gael Varoquaux. 2017. "Stochastic Subsampling for Factorizing Huge Matrices." arXiv:1701.05363 [Math, q-Bio, Stat], January.
Minka, Thomas P. 2000. Old and new matrix algebra useful for statistics.
Mnih, Andriy, and Russ R Salakhutdinov. 2008. "Probabilistic Matrix Factorization." Advances in Neural Information Processing Systems, 8.
Mohamed, Shakir. 2011. "Generalised Bayesian Matrix Factorisation Models," 140.
Nakajima, Shinichi, and Masashi Sugiyama. 2012. "Theoretical Analysis of Bayesian Matrix Factorization." Journal of Machine Learning Research, 66.
Nakatsukasa, Yuji. 2019. "The Low-Rank Eigenvalue Problem." arXiv.
Needell, Deanna, and Roman Vershynin. 2009. "Uniform Uncertainty Principle and Signal Recovery via Regularized Orthogonal Matching Pursuit." Foundations of Computational Mathematics 9 (3): 317–34.
Nowak, W., and A. Litvinenko. 2013. "Kriging and Spatial Design Accelerated by Orders of Magnitude: Combining Low-Rank Covariance Approximations with FFT-Techniques." Mathematical Geosciences 45 (4): 411–35.
Okajima, Koki, and Yoshiyuki Kabashima. 2021. "Matrix Completion Based on Gaussian Parameterized Belief Propagation," May.
Oymak, Samet, and Joel A. Tropp. 2015. "Universality Laws for Randomized Dimension Reduction, with Applications." arXiv:1511.09433 [Cs, Math, Stat], November.
Paatero, Pentti, and Unto Tapper. 1994. "Positive Matrix Factorization: A Non-Negative Factor Model with Optimal Utilization of Error Estimates of Data Values." Environmetrics 5 (2): 111–26.
Pan, Gang, Wangsheng Zhang, Zhaohui Wu, and Shijian Li. 2014. "Online Community Detection for Large Complex Networks." PLoS ONE 9 (7): e102799.
Petersen, Kaare Brandt, and Michael Syskind Pedersen. 2012. "The Matrix Cookbook."
Pleiss, Geoff, Jacob R. Gardner, Kilian Q. Weinberger, and Andrew Gordon Wilson. 2018. "Constant-Time Predictive Distributions for Gaussian Processes." In. arXiv.
Pleiss, Geoff, Martin Jankowiak, David Eriksson, Anil Damle, and Jacob Gardner. 2020. "Fast Matrix Square Roots with Applications to Gaussian Processes and Bayesian Optimization." Advances in Neural Information Processing Systems 33.
Rabani, Eran, and Sivan Toledo. 2001. "Out-of-Core SVD and QR Decompositions." In PPSC.
Raiko, Tapani, Alexander Ilin, and Juha Karhunen. 2007. "Principal Component Analysis for Large Scale Problems with Lots of Missing Values." In Machine Learning: ECML 2007, edited by Joost N. Kok, Jacek Koronacki, Ramon Lopez de Mantaras, Stan Matwin, Dunja Mladenić, and Andrzej Skowron, 4701:691–98. Berlin, Heidelberg: Springer Berlin Heidelberg.
Rokhlin, Vladimir, Arthur Szlam, and Mark Tygert. 2009. "A Randomized Algorithm for Principal Component Analysis." SIAM J. Matrix Anal. Appl. 31 (3): 1100–1124.
Rokhlin, Vladimir, and Mark Tygert. 2008. "A Fast Randomized Algorithm for Overdetermined Linear Least-Squares Regression." Proceedings of the National Academy of Sciences 105 (36): 13212–17.
Ryabko, Daniil, and Boris Ryabko. 2010. "Nonparametric Statistical Inference for Ergodic Processes." IEEE Transactions on Information Theory 56 (3): 1430–35.
Saad, Yousef. 2003. Iterative Methods for Sparse Linear Systems: Second Edition. 2nd ed. SIAM.
Sachdeva, Noveen, Mehak Preet Dhaliwal, Carole-Jean Wu, and Julian McAuley. 2022. "Infinite Recommendation Networks: A Data-Centric Approach." arXiv.
Sachdeva, Sushant. 2013. "Faster Algorithms via Approximation Theory." Foundations and Trends® in Theoretical Computer Science 9 (2): 125–210.
Salakhutdinov, Ruslan, and Andriy Mnih. 2008. "Bayesian Probabilistic Matrix Factorization Using Markov Chain Monte Carlo." In Proceedings of the 25th International Conference on Machine Learning, 880–87. ICML '08. New York, NY, USA: ACM.
Sarwar, Badrul, George Karypis, Joseph Konstan, and John Riedl. 2002. "Incremental Singular Value Decomposition Algorithms for Highly Scalable Recommender Systems."
Saul, Lawrence K. 2023. "A Geometrical Connection Between Sparse and Low-Rank Matrices and Its Application to Manifold Learning." Transactions on Machine Learning Research, January.
Schmidt, M.N., J. Larsen, and Fu-Tien Hsiao. 2007. "Wind Noise Reduction Using Non-Negative Sparse Coding." In 2007 IEEE Workshop on Machine Learning for Signal Processing, 431–36.
Seeger, Matthias, ed. 2004. Low Rank Updates for the Cholesky Decomposition.
Seshadhri, C., Aneesh Sharma, Andrew Stolman, and Ashish Goel. 2020. "The Impossibility of Low-Rank Representations for Triangle-Rich Complex Networks." Proceedings of the National Academy of Sciences 117 (11): 5631–37.
Shi, Jiarong, Xiuyun Zheng, and Wei Yang. 2017. "Survey on Probabilistic Models of Low-Rank Matrix Factorizations." Entropy 19 (8): 424.
Singh, Ajit P., and Geoffrey J. Gordon. 2008. "A Unified View of Matrix Factorization Models." In Machine Learning and Knowledge Discovery in Databases, 358–73. Springer.
Smaragdis, Paris. 2004. "Non-Negative Matrix Factor Deconvolution; Extraction of Multiple Sound Sources from Monophonic Inputs." In Independent Component Analysis and Blind Signal Separation, edited by Carlos G. Puntonet and Alberto Prieto, 494–99. Lecture Notes in Computer Science. Granada, Spain: Springer Berlin Heidelberg.
Soh, Yong Sheng, and Venkat Chandrasekaran. 2017. "A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers." arXiv:1701.01207 [Cs, Math, Stat], January.
Song, Yue, Nicu Sebe, and Wei Wang. 2022. "Fast Differentiable Matrix Square Root." In.
Sorzano, C. O. S., J. Vargas, and A. Pascual Montano. 2014. "A Survey of Dimensionality Reduction Techniques." arXiv:1403.2877 [Cs, q-Bio, Stat], March.
Spielman, Daniel A., and Shang-Hua Teng. 2004. "Nearly-Linear Time Algorithms for Graph Partitioning, Graph Sparsification, and Solving Linear Systems." In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, 81–90. STOC '04. New York, NY, USA: ACM.
———. 2006. "Nearly-Linear Time Algorithms for Preconditioning and Solving Symmetric, Diagonally Dominant Linear Systems." arXiv:cs/0607105, July.
———. 2008a. "Spectral Sparsification of Graphs." arXiv:0808.4134 [Cs], August.
———. 2008b. "A Local Clustering Algorithm for Massive Graphs and Its Application to Nearly-Linear Time Graph Partitioning." arXiv:0809.3232 [Cs], September.
Spielman, D., and N. Srivastava. 2011. "Graph Sparsification by Effective Resistances." SIAM Journal on Computing 40 (6): 1913–26.
Sra, Suvrit, and Inderjit S. Dhillon. 2006. "Generalized Nonnegative Matrix Approximations with Bregman Divergences." In Advances in Neural Information Processing Systems 18, edited by Y. Weiss, B. Schölkopf, and J. C. Platt, 283–90. MIT Press.
Srebro, Nathan, Jason D. M. Rennie, and Tommi S. Jaakkola. 2004. "Maximum-Margin Matrix Factorization." In Advances in Neural Information Processing Systems, 17:1329–36. NIPS'04. Cambridge, MA, USA: MIT Press.
Stewart, G.W. 2000. "The Decompositional Approach to Matrix Computation." Computing in Science Engineering 2 (1): 50–59.
Sun, Ying, and Michael L. Stein. 2016. "Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets." Journal of Computational and Graphical Statistics 25 (1): 187–208.
Sundin, Martin. 2016. "Bayesian Methods for Sparse and Low-Rank Matrix Problems." PhD Thesis, KTH Royal Institute of Technology.
Tropp, Joel A., Alp Yurtsever, Madeleine Udell, and Volkan Cevher. 2016. "Randomized Single-View Algorithms for Low-Rank Matrix Approximation." arXiv:1609.00048 [Cs, Math, Stat], August.
———. 2017. "Practical Sketching Algorithms for Low-Rank Matrix Approximation." SIAM Journal on Matrix Analysis and Applications 38 (4): 1454–85.
Tufts, D. W., and R. Kumaresan. 1982. "Estimation of Frequencies of Multiple Sinusoids: Making Linear Prediction Perform Like Maximum Likelihood." Proceedings of the IEEE 70 (9): 975–89.
Tung, Frederick, and James J. Little. n.d. "Factorized Binary Codes for Large-Scale Nearest Neighbor Search."
Türkmen, Ali Caner. 2015. "A Review of Nonnegative Matrix Factorization Methods for Clustering." arXiv:1507.03194 [Cs, Stat], July.
Turner, Richard E., and Maneesh Sahani. 2014. "Time-Frequency Analysis as Probabilistic Inference." IEEE Transactions on Signal Processing 62 (23): 6171–83.
Udell, M., and A. Townsend. 2019. "Why Are Big Data Matrices Approximately Low Rank?" SIAM Journal on Mathematics of Data Science 1 (1): 144–60.
Vaz, Colin, Asterios Toutios, and Shrikanth S. Narayanan. 2016. "Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data." In, 963–67.
Vincent, E., N. Bertin, and R. Badeau. 2008. "Harmonic and Inharmonic Nonnegative Matrix Factorization for Polyphonic Pitch Transcription." In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 109–12.
Virtanen, T. 2007. "Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria." IEEE Transactions on Audio, Speech, and Language Processing 15 (3): 1066–74.
Vishnoi, Nisheeth K. 2013. "Lx = b." Foundations and Trends® in Theoretical Computer Science 8 (1-2): 1–141.
Vo, Ba Ngu, Ba Tuong Vo, and Hung Gia Hoang. 2017. "An Efficient Implementation of the Generalized Labeled Multi-Bernoulli Filter." arXiv:1606.08350 [Stat], February.
Wager, S., L. Chen, M. Kim, and C. Raphael. 2017. "Towards Expressive Instrument Synthesis Through Smooth Frame-by-Frame Reconstruction: From String to Woodwind." In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 391–95.
Wang, Boyue, Yongli Hu, Junbin Gao, Yanfeng Sun, Haoran Chen, and Baocai Yin. 2017. "Locality Preserving Projections for Grassmann Manifold." In Proceedings of IJCAI, 2017.
Wang, Shusen, Alex Gittens, and Michael W. Mahoney. 2017. "Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging." arXiv:1702.04837 [Cs, Stat], February.
Wang, Sinong, Belinda Z. Li, Madian Khabsa, Han Fang, and Hao Ma. 2020. "Linformer: Self-Attention with Linear Complexity." arXiv.
Wang, Y. X., and Y. J. Zhang. 2013. "Nonnegative Matrix Factorization: A Comprehensive Review." IEEE Transactions on Knowledge and Data Engineering 25 (6): 1336–53.
Wang, Yuan, and Yunde Jia. 2004. "Fisher Non-Negative Matrix Factorization for Learning Local Features." In Proc. Asian Conf. on Comp. Vision, 27–30.
Wilkinson, William J., Michael Riis Andersen, Joshua D. Reiss, Dan Stowell, and Arno Solin. 2019. "End-to-End Probabilistic Inference for Nonstationary Audio Analysis." arXiv:1901.11436 [Cs, Eess, Stat], January.
Woodruff, David P. 2014. Sketching as a Tool for Numerical Linear Algebra. Foundations and Trends in Theoretical Computer Science 1.0. Now Publishers.
Woolfe, Franco, Edo Liberty, Vladimir Rokhlin, and Mark Tygert. 2008. "A Fast Randomized Algorithm for the Approximation of Matrices." Applied and Computational Harmonic Analysis 25 (3): 335–66.
Wright, John, and Yi Ma. 2022. High-dimensional data analysis with low-dimensional models: Principles, computation, and applications. S.l.: Cambridge University Press.
Xinghao Ding, Lihan He, and L. Carin. 2011. "Bayesian Robust Principal Component Analysis." IEEE Transactions on Image Processing 20 (12): 3419–30.
Yang, Jiyan, Xiangrui Meng, and Michael W. Mahoney. 2015. "Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments." arXiv:1502.03032 [Cs, Math, Stat], February.
Yang, Linxiao, Jun Fang, Huiping Duan, Hongbin Li, and Bing Zeng. 2018. "Fast Low-Rank Bayesian Matrix Completion with Hierarchical Gaussian Prior Models." IEEE Transactions on Signal Processing 66 (11): 2804–17.
Yang, Wenzhuo, and Huan Xu. 2015. "Streaming Sparse Principal Component Analysis." In Journal of Machine Learning Research, 494–503.
Ye, Ke, and Lek-Heng Lim. 2016. "Every Matrix Is a Product of Toeplitz Matrices." Foundations of Computational Mathematics 16 (3): 577–98.
Yin, M., J. Gao, and Z. Lin. 2016. "Laplacian Regularized Low-Rank Representation and Its Applications." IEEE Transactions on Pattern Analysis and Machine Intelligence 38 (3): 504–17.
Yoshii, Kazuyoshi. 2013. "Beyond NMF: Time-Domain Audio Source Separation Without Phase Reconstruction," 6.
Yu, Chenhan D., William B. March, and George Biros. 2017. "An \(N \log N\) Parallel Fast Direct Solver for Kernel Matrices." In arXiv:1701.02324 [Cs].
Yu, Hsiang-Fu, Cho-Jui Hsieh, Si Si, and Inderjit S. Dhillon. 2012. "Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems." In IEEE International Conference of Data Mining, 765–74.
———. 2014. "Parallel Matrix Factorization for Recommender Systems." Knowledge and Information Systems 41 (3): 793–819.
Yurtsever, Alp, Joel A. Tropp, Olivier Fercoq, Madeleine Udell, and Volkan Cevher. 2021. "Scalable Semidefinite Programming." SIAM Journal on Mathematics of Data Science 3 (1): 171–200.
Zass, Ron, and Amnon Shashua. 2005. "A Unifying Approach to Hard and Probabilistic Clustering." In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1 - Volume 01, 294–301. ICCV '05. Washington, DC, USA: IEEE Computer Society.
Zhang, Kai, Chuanren Liu, Jie Zhang, Hui Xiong, Eric Xing, and Jieping Ye. 2017. "Randomization or Condensation?: Linear-Cost Matrix Sketching Via Cascaded Compression Sampling." In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 615–23. KDD '17. New York, NY, USA: ACM.
Zhang, Xiao, Lingxiao Wang, and Quanquan Gu. 2017. "Stochastic Variance-Reduced Gradient Descent for Low-Rank Matrix Recovery from Linear Measurements." arXiv:1701.00481 [Stat], January.
Zhang, Yangwen. 2022. "An Answer to an Open Question in the Incremental SVD." arXiv.
Zhang, Zhongyuan, Chris Ding, Tao Li, and Xiangsun Zhang. 2007. "Binary Matrix Factorization with Applications." In Seventh IEEE International Conference on Data Mining, 2007. ICDM 2007, 391–400. IEEE.
Zhao, He, Lan Du, Wray Buntine, and Mingyuan Zhou. 2018. "Inter and Intra Topic Structure Learning with Word Embeddings." In Proceedings of the 35th International Conference on Machine Learning, 5892–5901. PMLR.
Zhao, He, Dinh Phung, Viet Huynh, Trung Le, and Wray Buntine. 2020. "Neural Topic Model via Optimal Transport." In.
Zhou, Mingyuan, Haojun Chen, John Paisley, Lu Ren, Guillermo Sapiro, and Lawrence Carin. 2009. "Non-Parametric Bayesian Dictionary Learning for Sparse Image Representations." In Proceedings of the 22nd International Conference on Neural Information Processing Systems, 22:2295–2303. NIPS'09. Red Hook, NY, USA: Curran Associates Inc.
Zhou, Mingyuan, Lauren Hannah, David Dunson, and Lawrence Carin. 2012. "Beta-Negative Binomial Process and Poisson Factor Analysis." In Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 1462–71. PMLR.
Zhou, Tianyi, and Dacheng Tao. 2011. "GoDec: Randomized Low-Rank & Sparse Matrix Decomposition in Noisy Case." In Proceedings of the 28th International Conference on International Conference on Machine Learning, 33–40. ICML'11. Madison, WI, USA: Omnipress.
———. 2012. "Multi-Label Subspace Ensemble." Journal of Machine Learning Research.
Zitnik, Marinka, and Blaz Zupan. 2018. "NIMFA: A Python Library for Nonnegative Matrix Factorization." arXiv:1808.01743 [Cs, q-Bio, Stat], August.
