Getting a bunch of data points and approximating them (in some sense) by their membership (possibly fuzzy) in some groups, or regions of feature space.

For certain definitions this can be the same thing as
non-negative and/or low-rank
matrix factorisation if you use mixture models,
and is only really different in *emphasis* from
dimensionality reduction.
If you start with a list of features and then think about “distances” between
observations, you have just implicitly induced a weighted graph from your hitherto
non-graphy data and are now looking at a
networks problem.

If you care about clustering as such, spectral clustering feels like a nice entry point. Here is Chris Ding’s tutorial on spectral clustering.
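
To remind future-me of the mechanics, here is a minimal numpy sketch (the toy data and the `spectral_embedding` helper are mine, not from any cited tutorial) of the pipeline: pairwise distances → Gaussian affinity → normalised Laplacian → bottom eigenvectors.

```python
import numpy as np

def spectral_embedding(W, k):
    """Coordinates from the k smallest eigenvectors of the symmetric
    normalised graph Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_isqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(W)) - d_isqrt[:, None] * W * d_isqrt[None, :]
    vals, vecs = np.linalg.eigh(L)  # eigh returns eigenvalues in ascending order
    return vecs[:, :k]

# Toy data: two tight blobs. The Gaussian kernel on pairwise distances is
# exactly the implicitly induced weighted graph mentioned above.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(3, 0.1, (10, 2))])
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-D2)
U = spectral_embedding(W, 2)
# Embedded rows for points in the same blob nearly coincide; assign each point
# to whichever of two reference rows (first and last point) it is nearer.
labels = (np.linalg.norm(U - U[0], axis=1)
          < np.linalg.norm(U - U[-1], axis=1)).astype(int)
```

In practice one would run k-means on the embedded rows rather than my two-reference-row shortcut, but the shortcut keeps the sketch short.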

`CONCOR`

: induces a cute similarity measure.

`MCL`

: Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs (also known as networks) based on simulation of (stochastic) flow in graphs.
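
The flow simulation above can be sketched in a few lines (a toy rendition, my own simplification: real MCL implementations add pruning and convergence checks, and the bridged-triangles graph and parameters here are purely illustrative). The loop alternates *expansion* (matrix squaring, which spreads flow) with *inflation* (entrywise powers plus column renormalisation, which sharpens flow).

```python
import numpy as np

def mcl(A, inflation=2.0, iters=30):
    """Toy Markov Cluster iteration on an adjacency matrix A."""
    A = A + np.eye(len(A))                 # self-loops stabilise the iteration
    M = A / A.sum(axis=0, keepdims=True)   # column-stochastic transition matrix
    for _ in range(iters):
        M = M @ M                          # expansion: flow spreads
        M = M ** inflation                 # inflation: strong flow is favoured
        M = M / M.sum(axis=0, keepdims=True)
    return M

# Two triangles joined by a single bridge edge: an obvious two-cluster graph.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
M = mcl(A)
# Flow across the bridge dies out, leaving one block per cluster.
cross_flow = max(M[:3, 3:].max(), M[3:, :3].max())
```

The clusters can then be read off the block structure (or, in the usual formulation, off the attractor rows of the converged matrix).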

There are many useful tricks in here. E.g. Belkin and Niyogi (2003) show how to use a graph Laplacian (possibly a contrived or arbitrary one) to construct “natural” Euclidean coordinates for your data, such that nodes that have much traffic between them in the Laplacian representation have a small Euclidean distance (the “Urban Traffic Planner Fantasy Transformation”). This quickly gives you a similarity measure on non-Euclidean data. Questions: under which metrics is this equivalent to multidimensional scaling? Is it worthwhile going the other way and constructing density estimates from induced flow graphs?
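
To make the traffic-to-distance idea concrete, here is a toy version of that construction (my example; Belkin and Niyogi solve the generalised eigenproblem \(Lf = \lambda Df\), which I do via the standard symmetric substitution): two cliques joined by a weak bridge edge get one-dimensional Euclidean coordinates in which heavy-traffic pairs sit close together.

```python
import numpy as np

# A hypothetical toy graph: two 5-node cliques joined by one weak bridge edge.
n = 10
W = np.zeros((n, n))
W[:5, :5] = 1.0
W[5:, 5:] = 1.0
np.fill_diagonal(W, 0.0)
W[4, 5] = W[5, 4] = 0.1  # light traffic between the two communities

d = W.sum(axis=1)
L = np.diag(d) - W  # unnormalised graph Laplacian
# Generalised eigenproblem L f = lam D f, via the substitution g = D^{1/2} f
d_isqrt = 1.0 / np.sqrt(d)
Lsym = d_isqrt[:, None] * L * d_isqrt[None, :]
vals, vecs = np.linalg.eigh(Lsym)
# Skip the trivial constant eigenvector; the next one (the Fiedler vector)
# supplies a one-dimensional Euclidean coordinate for each node.
coords = (d_isqrt[:, None] * vecs)[:, 1:2]

within = np.linalg.norm(coords[0] - coords[1])  # same clique: heavy traffic
across = np.linalg.norm(coords[0] - coords[9])  # different cliques: light traffic
```

Nodes with heavy traffic between them (same clique) land essentially on top of each other, while the two communities land far apart, which is the whole point of the transformation.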

## Clustering as matrix factorisation

If I know me, I might be looking at this page trying to remember which papers situate k-means-type clustering in the matrix factorisation literature.

The single-serve paper doing that is Bauckhage (2015), but there are broader versions (Singh and Gordon 2008; Türkmen 2015), some computer science connections in Mixon, Villar, and Ward (2016), and an older one in Zass and Shashua (2005).
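
The Bauckhage (2015) observation in code: Lloyd’s algorithm is alternating minimisation of \(\|X - ZM\|_F^2\) over a binary assignment matrix \(Z\) and a centroid matrix \(M\) (a minimal sketch; the variable names and toy data are mine).

```python
import numpy as np

def kmeans_factorize(X, k, iters=50, seed=0):
    """Lloyd's algorithm as alternating minimisation of ||X - Z M||_F^2,
    with Z a binary (n, k) assignment matrix and M a (k, d) centroid matrix."""
    rng = np.random.default_rng(seed)
    M = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Fix M, solve for Z: each row of X picks its nearest row of M.
        d2 = ((X[:, None, :] - M[None, :, :]) ** 2).sum(-1)
        Z = np.eye(k)[d2.argmin(axis=1)]
        # Fix Z, solve for M: the least-squares solution is the cluster means.
        counts = np.maximum(Z.sum(axis=0), 1.0)  # guard against empty clusters
        M = (Z.T @ X) / counts[:, None]
    return Z, M

# Two well-separated blobs: the factorisation recovers the obvious partition.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.2, (20, 2)), rng.normal(5, 0.2, (20, 2))])
Z, M = kmeans_factorize(X, 2)
recon_err = np.linalg.norm(X - Z @ M)
```

Seen this way, relaxing the binary constraint on \(Z\) (to non-negativity, stochasticity, etc.) is exactly how the broader NMF-for-clustering literature generalises k-means.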

Further things I might discuss here are the graph-flow/Laplacian notions of clustering and the density/centroid approach; for now I discuss those under mixture models.

## References

*arXiv:1507.05910 [Cs, Stat]*, July.

*Journal of Machine Learning Research* 7 (Oct): 1963–2001.

*arXiv:0808.0163 [Cs]*, August.

Bauckhage, Christian. 2015. “k-Means Clustering Is Matrix Factorization.” *arXiv:1512.07548 [Stat]*, December.

Belkin, Mikhail, and Partha Niyogi. 2003. “Laplacian Eigenmaps for Dimensionality Reduction and Data Representation.” *Neural Computation* 15 (6): 1373–96.

*Physical Review E* 70 (6): 066111.

*Proceedings of the 2005 SIAM International Conference on Data Mining*, 606–10. Proceedings. Society for Industrial and Applied Mathematics.

*Proceedings of the National Academy of Sciences* 100 (10): 5591–96.

*Bioinformatics* 21 (suppl 1): i144–51.

*IEEE Transactions on Pattern Analysis and Machine Intelligence* 35 (11): 2765–81.

*Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing*, 71–80. STOC ’11. New York, NY, USA: ACM.

*arXiv:1507.00280 [Cs, Math, Stat]*, July.

*Proceedings of the 16th International Conference on Neural Information Processing Systems*, 16:153–60. NIPS’03. Cambridge, MA, USA: MIT Press.

*2013 European Conference on Mobile Robots (ECMR)*, 150–57.

*Knowledge and Information Systems* 8 (2): 154–77.

*arXiv:1612.03281 [Cond-Mat, Physics:physics]*, December.

Mixon, Dustin G., Soledad Villar, and Rachel Ward. 2016. “Clustering Subgaussian Mixtures by Semidefinite Programming.” *arXiv:1602.06612 [Cs, Math, Stat]*, February.

*The Annals of Applied Statistics* 7 (3): 1525–39.

*The European Physical Journal B - Condensed Matter and Complex Systems* 38 (2): 321–30.

*SIAM Journal on Optimization* 18 (1): 186–205.

*arXiv:1608.07597 [Stat]*, August.

*arXiv:1612.06470 [Cs, Stat]*, December.

*Computer Science Review* 1 (1): 27–64.

*Mustererkennung 1998*, edited by Paul Levi, Michael Schanz, Rolf-Jürgen Ahlers, and Franz May, 125–32. Informatik Aktuell. Springer Berlin Heidelberg.

*Journal of Machine Learning Research*.

Singh, Ajit P., and Geoffrey J. Gordon. 2008. “A Unified View of Matrix Factorization Models.” *Machine Learning and Knowledge Discovery in Databases*, 358–73. Springer.

*Proceedings of the National Academy of Sciences of the United States of America* 102: 18297–302.

*Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing*, 81–90. STOC ’04. New York, NY, USA: ACM.

*arXiv:0809.3232 [Cs]*, September.

*SIAM Journal on Computing* 40 (6): 1913–26.

*Cognitive Science* 29 (1): 41–78.

*arXiv:1612.03450 [Cs, Math, Stat]*, December.

Türkmen, Ali Caner. 2015. “A Review of Nonnegative Matrix Factorization Methods for Clustering.” *arXiv:1507.03194 [Cs, Stat]*, July.

*Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*, 907–16. KDD ’09. New York, NY, USA: ACM.

Zass, Ron, and Amnon Shashua. 2005. “A Unifying Approach to Hard and Probabilistic Clustering.” *Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05) Volume 1 - Volume 01*, 294–301. ICCV ’05. Washington, DC, USA: IEEE Computer Society.

*Seventh IEEE International Conference on Data Mining, 2007. ICDM 2007*, 391–400. IEEE.
