# Non-negative matrix factorisation

A cute hack in the world of sparse matrix factorisation, where the goal is to decode an element-wise non-negative matrix into a product of two smaller matrices, which looks a lot like sparse coding if you squint at it.

David Austin gives a simple introduction, to the classic Non-negative matrix factorization for the American Mathematical Society.

This method is famous for decomposing things into parts in a sparse way using $$l_2$$ loss. It can be made even sparser by exploring other losses. 🏗

Aarabi, Hadrien Foroughmand, and Geoffroy Peeters. 2018. “Music Retiler: Using NMF2D Source Separation for Audio Mosaicing.” In Proceedings of the Audio Mostly 2018 on Sound in Immersion and Emotion, 27:1–27:7. AM’18. New York, NY, USA: ACM. https://doi.org/10.1145/3243274.3243299.

Abdallah, Samer A., and Mark D. Plumbley. 2004. “Polyphonic Music Transcription by Non-Negative Sparse Coding of Power Spectra.” In. http://ismir2004.ismir.net/proceedings/p058-page-318-paper216.pdf.

Arora, Sanjeev, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, and Michael Zhu. 2012. “A Practical Algorithm for Topic Modeling with Provable Guarantees,” December. http://arxiv.org/abs/1212.4777.

Baladi, Viviane. 2000. Positive Transfer Operators and Decay of Correlations. Advanced Series in Nonlinear Dynamics, v. 16. Singapore ; River Edge, NJ: World Scientific Pub. Co.

Berry, Michael W., Murray Browne, Amy N. Langville, V. Paul Pauca, and Robert J. Plemmons. 2007. “Algorithms and Applications for Approximate Nonnegative Matrix Factorization.” Computational Statistics & Data Analysis 52 (1): 155–73. https://doi.org/10.1016/j.csda.2006.11.006.

Bertin, N., R. Badeau, and E. Vincent. 2010. “Enforcing Harmonicity and Smoothness in Bayesian Non-Negative Matrix Factorization Applied to Polyphonic Music Transcription.” IEEE Transactions on Audio, Speech, and Language Processing 18 (3): 538–49. https://doi.org/10.1109/TASL.2010.2041381.

Bruckstein, A. M., Michael Elad, and M. Zibulevsky. 2008. “Sparse Non-Negative Solution of a Linear System of Equations Is Unique.” In 3rd International Symposium on Communications, Control and Signal Processing, 2008. ISCCSP 2008, 762–67. https://doi.org/10.1109/ISCCSP.2008.4537325.

Buch, Michael, Elio Quinton, and Bob L Sturm. 2017. “NichtnegativeMatrixFaktorisierungnutzendesKlangsynthesenSystem (NiMFKS): Extensions of NMF-Based Concatenative Sound Synthesis.” In Proceedings of the 20th International Conference on Digital Audio Effects, 7. Edinburgh.

Cao, Bin, Dou Shen, Jian-Tao Sun, Xuanhui Wang, Qiang Yang, and Zheng Chen. n.d. “Detect and Track Latent Factors with Online Nonnegative Matrix Factorization.” In. Accessed May 31, 2015. https://er2004.cse.ust.hk/~qyang/Docs/2007/ijcai07.Cao.pdf.

Carabias-Orti, J. J., T. Virtanen, P. Vera-Candeas, N. Ruiz-Reyes, and F. J. Canadas-Quesada. 2011. “Musical Instrument Sound Multi-Excitation Model for Non-Negative Spectrogram Factorization.” IEEE Journal of Selected Topics in Signal Processing 5 (6): 1144–58. https://doi.org/10.1109/JSTSP.2011.2159700.

Cemgil, Ali Taylan. 2009. “Bayesian Inference for Nonnegative Matrix Factorisation Models.” Computational Intelligence and Neuroscience. https://doi.org/10.1155/2009/785152.

Cichocki, A., R. Zdunek, and S. Amari. 2006. “New Algorithms for Non-Negative Matrix Factorization in Applications to Blind Source Separation.” In 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 5:V–V. https://doi.org/10.1109/ICASSP.2006.1661352.

Damle, Anil, and Yuekai Sun. 2014. “A Geometric Approach to Archetypal Analysis and Non-Negative Matrix Factorization,” May. http://arxiv.org/abs/1405.4275.

Devarajan, Karthik. 2008. “Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology.” PLoS Comput Biol 4 (7): e1000029. https://doi.org/10.1371/journal.pcbi.1000029.

Ding, C., X. He, and H. Simon. 2005. “On the Equivalence of Nonnegative Matrix Factorization and Spectral Clustering.” In Proceedings of the 2005 SIAM International Conference on Data Mining, 606–10. Proceedings. Society for Industrial and Applied Mathematics. http://ranger.uta.edu/~chqding/papers/NMF-SDM2005.pdf.

Ding, Chris, Tao Li, and Wei Peng. 2008. “On the Equivalence Between Non-Negative Matrix Factorization and Probabilistic Latent Semantic Indexing.” Computational Statistics & Data Analysis 52 (8): 3913–27. https://doi.org/10.1016/j.csda.2008.01.011.

Ding, C., Tao Li, and M. I. Jordan. 2010. “Convex and Semi-Nonnegative Matrix Factorizations.” IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (1): 45–55. https://doi.org/10.1109/TPAMI.2008.277.

Ding, Jiu, and Aihui Zhou. 2009. “Elementary Properties of Non-Negative Matrices.” In Nonnegative Matrices, Positive Operators, and Applications. Hackensack, N.J: World Scientific. https://doi.org/10.1142/7197.

Driedger, Jonathan, and Thomas Pratzlich. 2015. “Let It Bee – Towards NMF-Inspired Audio Mosaicing.” In Proceedings of ISMIR, 7. Malaga. http://ismir2015.uma.es/articles/13_Paper.pdf.

Drineas, Petros, and Michael W. Mahoney. 2005. “On the Nyström Method for Approximating a Gram Matrix for Improved Kernel-Based Learning.” Journal of Machine Learning Research 6 (December): 2153–75. http://jmlr.org/papers/volume6/drineas05a/drineas05a.pdf.

Dueck, Delbert, Quaid D. Morris, and Brendan J. Frey. 2005. “Multi-Way Clustering of Microarray Data Using Probabilistic Sparse Matrix Factorization.” Bioinformatics 21 (suppl 1): i144–i151. https://doi.org/10.1093/bioinformatics/bti1041.

Eggert, J., and E. Korner. 2004. “Sparse Coding and NMF.” In 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541), 4:2529–33 vol.4. https://doi.org/10.1109/IJCNN.2004.1381036.

Fairbanks, James P., Ramakrishnan Kannan, Haesun Park, and David A. Bader. 2015. “Behavioral Clusters in Dynamic Graphs.” Parallel Computing, Graph analysis for scientific discovery, 47 (August): 38–50. https://doi.org/10.1016/j.parco.2015.03.002.

Févotte, Cédric, Nancy Bertin, and Jean-Louis Durrieu. 2008. “Nonnegative Matrix Factorization with the Itakura-Saito Divergence: With Application to Music Analysis.” Neural Computation 21 (3): 793–830. https://doi.org/10.1162/neco.2008.04-08-771.

Févotte, Cédric, and Jérôme Idier. 2011. “Algorithms for Nonnegative Matrix Factorization with the β-Divergence.” Neural Computation 23 (9): 2421–56. https://doi.org/10.1162/NECO_a_00168.

Gemulla, Rainer, Erik Nijkamp, Peter J. Haas, and Yannis Sismanis. 2011. “Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent.” In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 69–77. KDD ’11. New York, NY, USA: ACM. https://doi.org/10.1145/2020408.2020426.

Guan, Naiyang, Dacheng Tao, Zhigang Luo, and Bo Yuan. 2012. “NeNMF: An Optimal Gradient Method for Nonnegative Matrix Factorization.” IEEE Transactions on Signal Processing 60 (6): 2882–98. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6166359.

Guan, N., D. Tao, Z. Luo, and B. Yuan. 2012. “Online Nonnegative Matrix Factorization with Robust Stochastic Approximation.” IEEE Transactions on Neural Networks and Learning Systems 23 (7): 1087–99. https://doi.org/10.1109/TNNLS.2012.2197827.

Halko, Nathan, Per-Gunnar Martinsson, and Joel A. Tropp. 2009. “Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions,” September. http://arxiv.org/abs/0909.4061.

Hoffman, Matthew D, David M Blei, and Perry R Cook. 2010. “Bayesian Nonparametric Matrix Factorization for Recorded Music.” In International Conference on Machine Learning, 8. http://soundlab.cs.princeton.edu/publications/2010_icml_gapnmf.pdf.

Hoyer, Patrik O. n.d. “Non-Negative Matrix Factorization with Sparseness Constraints.” Journal of Machine Learning Research 5 (9): 1457–69. Accessed October 10, 2014. http://arxiv.org/abs/cs/0408058.

Hoyer, P. O. 2002. “Non-Negative Sparse Coding.” In Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing, 2002, 557–65. https://doi.org/10.1109/NNSP.2002.1030067.

Hsieh, Cho-Jui, and Inderjit S. Dhillon. 2011. “Fast Coordinate Descent Methods with Variable Selection for Non-Negative Matrix Factorization.” In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1064–72. KDD ’11. New York, NY, USA: ACM. https://doi.org/10.1145/2020408.2020577.

Hu, Tao, Cengiz Pehlevan, and Dmitri B. Chklovskii. 2014. “A Hebbian/Anti-Hebbian Network for Online Sparse Dictionary Learning Derived from Symmetric Matrix Factorization.” In 2014 48th Asilomar Conference on Signals, Systems and Computers. https://doi.org/10.1109/ACSSC.2014.7094519.

Kannan, Ramakrishnan. 2016. “Scalable and Distributed Constrained Low Rank Approximations,” April. https://smartech.gatech.edu/handle/1853/54962.

Kannan, Ramakrishnan, Grey Ballard, and Haesun Park. 2016. “A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization.” In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 9:1–9:11. PPoPP ’16. New York, NY, USA: ACM. https://doi.org/10.1145/2851141.2851152.

Kim, H., and H. Park. 2008. “Nonnegative Matrix Factorization Based on Alternating Nonnegativity Constrained Least Squares and Active Set Method.” SIAM Journal on Matrix Analysis and Applications 30 (2): 713–30. https://doi.org/10.1137/07069239X.

Koren, Yehuda, Robert Bell, and Chris Volinsky. 2009. “Matrix Factorization Techniques for Recommender Systems.” Computer 42 (8): 30–37. https://doi.org/10.1109/MC.2009.263.

Lee, Daniel D., and H. Sebastian Seung. 2001. “Algorithms for Non-Negative Matrix Factorization.” In Advances in Neural Information Processing Systems 13, edited by T. K. Leen, T. G. Dietterich, and V. Tresp, 556–62. MIT Press. http://papers.nips.cc/paper/1861-algorithms-for-non-negative-matrix-factorization.pdf.

———. 1999. “Learning the Parts of Objects by Non-Negative Matrix Factorization.” Nature 401 (6755): 788–91. https://doi.org/10.1038/44565.

Li, S. Z., XinWen Hou, HongJiang Zhang, and Qiansheng Cheng. 2001. “Learning Spatially Localized, Parts-Based Representation.” In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001. CVPR 2001, 1:I–207–I–212vol.1. https://doi.org/10.1109/CVPR.2001.990477.

Li, Yuanzhi, Yingyu Liang, and Andrej Risteski. 2016. “Recovery Guarantee of Non-Negative Matrix Factorization via Alternating Updates.” In Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, 4988–96. Curran Associates, Inc. http://papers.nips.cc/paper/6417-recovery-guarantee-of-non-negative-matrix-factorization-via-alternating-updates.pdf.

Liberty, Edo, Franco Woolfe, Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert. 2007. “Randomized Algorithms for the Low-Rank Approximation of Matrices.” Proceedings of the National Academy of Sciences 104 (51): 20167–72. https://doi.org/10.1073/pnas.0709640104.

Lin, Chih-Jen. 2007. “Projected Gradient Methods for Nonnegative Matrix Factorization.” Neural Computation 19 (10): 2756–79. https://doi.org/10.1162/neco.2007.19.10.2756.

Liu, T., and D. Tao. 2015. “On the Performance of Manhattan Nonnegative Matrix Factorization.” IEEE Transactions on Neural Networks and Learning Systems PP (99): 1–1. https://doi.org/10.1109/TNNLS.2015.2458986.

Martinsson, Per-Gunnar. 2016. “Randomized Methods for Matrix Computations and Analysis of High Dimensional Data,” July. http://arxiv.org/abs/1607.01649.

Martinsson, Per-Gunnar, Vladimir Rockhlin, and Mark Tygert. 2006. “A Randomized Algorithm for the Approximation of Matrices.” DTIC Document. http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA458932.

Masood, M. Arjumand, and Finale Doshi-Velez. 2016. “Rapid Posterior Exploration in Bayesian Non-Negative Matrix Factorization,” October. http://arxiv.org/abs/1610.08928.

Nonnegative Matrices in the Mathematical Sciences. 1979. Elsevier. https://doi.org/10.1016/C2013-0-10361-3.

Paatero, Pentti, and Unto Tapper. 1994. “Positive Matrix Factorization: A Non-Negative Factor Model with Optimal Utilization of Error Estimates of Data Values.” Environmetrics 5 (2): 111–26. https://doi.org/10.1002/env.3170050203.

Rokhlin, Vladimir, Arthur Szlam, and Mark Tygert. 2009. “A Randomized Algorithm for Principal Component Analysis.” SIAM J. Matrix Anal. Appl. 31 (3): 1100–1124. https://doi.org/10.1137/080736417.

Rokhlin, Vladimir, and Mark Tygert. 2008. “A Fast Randomized Algorithm for Overdetermined Linear Least-Squares Regression.” Proceedings of the National Academy of Sciences 105 (36): 13212–7. https://doi.org/10.1073/pnas.0804869105.

Salakhutdinov, Ruslan, and Andriy Mnih. 2008. “Bayesian Probabilistic Matrix Factorization Using Markov Chain Monte Carlo.” In Proceedings of the 25th International Conference on Machine Learning, 880–87. ICML ’08. New York, NY, USA: ACM. https://doi.org/10.1145/1390156.1390267.

Schmidt, M. N., J. Larsen, and Fu-Tien Hsiao. 2007. “Wind Noise Reduction Using Non-Negative Sparse Coding.” In 2007 IEEE Workshop on Machine Learning for Signal Processing, 431–36. https://doi.org/10.1109/MLSP.2007.4414345.

Sra, Suvrit, and Inderjit S. Dhillon. 2006. “Generalized Nonnegative Matrix Approximations with Bregman Divergences.” In Advances in Neural Information Processing Systems 18, edited by Y. Weiss, B. Schölkopf, and J. C. Platt, 283–90. MIT Press. http://papers.nips.cc/paper/2757-generalized-nonnegative-matrix-approximations-with-bregman-divergences.pdf.

Tepper, Mariano, and Guillermo Sapiro. 2016. “Compressed Nonnegative Matrix Factorization Is Fast and Accurate.” IEEE Transactions on Signal Processing 64 (9): 2269–83. https://doi.org/10.1109/TSP.2016.2516971.

Turner, Richard E., and Maneesh Sahani. 2014. “Time-Frequency Analysis as Probabilistic Inference.” IEEE Transactions on Signal Processing 62 (23): 6171–83. https://doi.org/10.1109/TSP.2014.2362100.

Türkmen, Ali Caner. 2015. “A Review of Nonnegative Matrix Factorization Methods for Clustering,” July. http://arxiv.org/abs/1507.03194.

Vaz, Colin, Asterios Toutios, and Shrikanth S. Narayanan. 2016. “Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data.” In, 963–67. https://doi.org/10.21437/Interspeech.2016-571.

Vincent, E., N. Bertin, and R. Badeau. 2008. “Harmonic and Inharmonic Nonnegative Matrix Factorization for Polyphonic Pitch Transcription.” In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 109–12. https://doi.org/10.1109/ICASSP.2008.4517558.

Virtanen, T. 2007. “Monaural Sound Source Separation by Nonnegative Matrix Factorization with Temporal Continuity and Sparseness Criteria.” IEEE Transactions on Audio, Speech, and Language Processing 15 (3): 1066–74. https://doi.org/10.1109/TASL.2006.885253.

Virtanen, T., A. Taylan Cemgil, and S. Godsill. 2008. “Bayesian Extensions to Non-Negative Matrix Factorisation for Audio Signal Modelling.” In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 1825–8. https://doi.org/10.1109/ICASSP.2008.4517987.

Wang, Fei, and Ping Li. 2010. “Efficient Nonnegative Matrix Factorization with Random Projections.” In SDM, 281–92. SIAM. https://doi.org/10.1137/1.9781611972801.25.

Wang, Yuan, and Yunde Jia. 2004. “Fisher Non-Negative Matrix Factorization for Learning Local Features.” In In Proc. Asian Conf. On Comp. Vision, 27–30.

Wang, Y. X., and Y. J. Zhang. 2013. “Nonnegative Matrix Factorization: A Comprehensive Review.” IEEE Transactions on Knowledge and Data Engineering 25 (6): 1336–53. https://doi.org/10.1109/TKDE.2012.51.

Wenwu Wang. 2011. “Instantaneous Versus Convolutive Non-Negative Matrix Factorization: Models, Algorithms and Applications to Audio Pattern Separation.” In Machine Audition: Principles, Algorithms and Systems, 353–70. Hershey, PA, USA: IGI Global. http://services.igi-global.com/resolvedoi/resolve.aspx?doi=10.4018/978-1-61520-919-4.ch015.

Witten, Rafi, and Emmanuel Candès. 2015. “Randomized Algorithms for Low-Rank Matrix Factorizations: Sharp Performance Bounds.” Algorithmica 72 (1): 264–81. https://doi.org/10.1007/s00453-014-9891-7.

Woolfe, Franco, Edo Liberty, Vladimir Rokhlin, and Mark Tygert. 2008. “A Fast Randomized Algorithm for the Approximation of Matrices.” Applied and Computational Harmonic Analysis 25 (3): 335–66. https://doi.org/10.1016/j.acha.2007.12.002.

Yu, Hsiang-Fu, Cho-Jui Hsieh, Si Si, and Inderjit S. Dhillon. 2012. “Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems.” In IEEE International Conference of Data Mining, 765–74. https://doi.org/10.1109/ICDM.2012.168.

———. 2014. “Parallel Matrix Factorization for Recommender Systems.” Knowledge and Information Systems 41 (3): 793–819. https://doi.org/10.1007/s10115-013-0682-2.

Zass, Ron, and Amnon Shashua. 2005. “A Unifying Approach to Hard and Probabilistic Clustering.” In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05) Volume 1 - Volume 01, 294–301. ICCV ’05. Washington, DC, USA: IEEE Computer Society. https://doi.org/10.1109/ICCV.2005.27.

Zhang, Zhongyuan, Chris Ding, Tao Li, and Xiangsun Zhang. 2007. “Binary Matrix Factorization with Applications.” In Seventh IEEE International Conference on Data Mining, 2007. ICDM 2007, 391–400. IEEE. https://doi.org/10.1109/ICDM.2007.99.