
I will restructure learning on manifolds and dimensionality reduction into a more useful distinction.

As in: handling your high-dimensional, or graphical, data by trying to discover a low(er)-dimensional manifold that contains it. That is, inferring a hidden constraint that happens to have the form of a smooth surface of some low-ish dimension. Related: learning on manifolds, and, if you squint at it, learnable indexes.

There are a million different versions of this. Multidimensional scaling seems to be the oldest.

Tangential aside: in dynamical systems we talk about creating a high-dimensional Takens embedding for state-space reconstruction of arbitrary nonlinear dynamics. I imagine there are connections between learning the lower-dimensional manifold upon which your data lies and the higher-dimensional manifold in which your data's state space is naturally expressed. But I would not be the first person to notice this, so hopefully it's been done for me somewhere?
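For concreteness, a delay embedding is only a few lines. A numpy sketch (the series, lag and dimension here are illustrative; a real application would pick the lag and dimension by the usual heuristics):

```python
import numpy as np

def takens_embed(x, dim=3, tau=1):
    """Delay-embed a scalar series x into R^dim with lag tau.

    Row t is the delay vector (x[t], x[t+tau], ..., x[t+(dim-1)*tau]).
    """
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# A scalar observable of a 2-d system: one coordinate of circular motion.
t = np.linspace(0, 8 * np.pi, 1000)
obs = np.sin(t)                       # scalar measurements
emb = takens_embed(obs, dim=2, tau=62)  # lag of roughly a quarter period

# The embedded points trace out approximately (sin t, cos t) -- a circle --
# recovering the loop hidden in the one-dimensional measurement series.
```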

See also kernel methods, and functional regression, which connects via diffusion maps (Ronald R. Coifman and Lafon 2006; R. R. Coifman et al. 2005).

See also information geometry, which also uses manifolds, but ones implied by the parameterisation of a parametric model.

To look at: Isomap, locally linear embedding, spectral embeddings, diffusion maps, multidimensional scaling…

Bioinformatics is leading to some weird uses of data manifolds; see, for example, Berger, Daniels, and Yu (2016) for the performance implications of knowing the manifold shape for *-omics search, using compressive manifold storage based on both fractal dimension and metric entropy concepts. Also a suggestive connection with fitness landscapes in evolution.
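The fractal-dimension idea is easy to get a feel for: a Grassberger–Procaccia-style correlation dimension estimate is just the slope of log pair-counts against log radius. A rough numpy/scipy sketch (nothing like the engineered estimators in the paper):

```python
import numpy as np
from scipy.spatial.distance import pdist

def correlation_dimension(X, r_min, r_max, n_r=10):
    """Grassberger-Procaccia estimate of fractal dimension:
    the slope of log C(r) against log r, where C(r) is the
    fraction of point pairs closer than r."""
    d = pdist(X)
    rs = np.geomspace(r_min, r_max, n_r)
    C = np.array([(d < r).mean() for r in rs])
    slope, _ = np.polyfit(np.log(rs), np.log(C), 1)
    return slope

rng = np.random.default_rng(0)
t = rng.uniform(0, 1, 2000)
line3d = np.column_stack([t, 2 * t, -t])  # a 1-d manifold embedded in R^3
d_hat = correlation_dimension(line3d, 0.01, 0.1)  # close to 1 for a curve
```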

Neural networks have some implicit manifolds, if you squint right. See Christopher Olah's visual explanation of how, whose diagrams should be stolen by someone trying to explain VC dimension.

Berger, Daniels, and Yu (2016) argue:

Manifold learning algorithms have recently played a crucial role in unsupervised learning tasks such as clustering and nonlinear dimensionality reduction […] Many such algorithms have been shown to be equivalent to Kernel PCA (KPCA) with data dependent kernels, itself equivalent to performing classical multidimensional scaling (cMDS) in a high dimensional feature space (Schölkopf et al., 1998; Williams, 2002; Bengio et al., 2004). […] Recently, it has been observed that the majority of manifold learning algorithms can be expressed as a regularized loss minimization of a reconstruction matrix, followed by a singular value truncation (Neufeld et al., 2012)
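The cMDS half of that equivalence is easy to verify directly: double-centring the squared Euclidean distance matrix recovers the Gram matrix, so classical MDS coordinates coincide with PCA scores (up to sign). A numpy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Xc = X - X.mean(axis=0)

# Classical MDS: double-centre the squared-distance matrix ...
D2 = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(-1)
n = len(X)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ D2 @ J                # recovers the Gram matrix Xc @ Xc.T
vals, vecs = np.linalg.eigh(B)       # eigenvalues in ascending order
cmds = vecs[:, -2:][:, ::-1] * np.sqrt(vals[-2:][::-1])  # top-2 coordinates

# ... which match the top-2 PCA scores, column-wise up to sign:
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pca = U[:, :2] * s[:2]
np.allclose(np.abs(cmds), np.abs(pca))  # True
```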

## Implementations

### TTK

The Topology ToolKit (TTK) is an open-source library and software collection for topological data analysis in scientific visualization.

TTK can handle scalar data defined either on regular grids or triangulations, either in 2D or in 3D. It provides a substantial collection of generic, efficient and robust implementations of key algorithms in topological data analysis. It includes:

- For scalar data: critical points, integral lines, persistence diagrams, persistence curves, merge trees, contour trees, Morse–Smale complexes, topological simplification;
- For bivariate scalar data: fibers, fiber surfaces, continuous scatterplots, Jacobi sets, Reeb spaces;
- For uncertain scalar data: mandatory critical points.
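TTK itself lives in C++/ParaView, but the flavour of the scalar-data tools can be conveyed in miniature. Here is a toy 0-dimensional persistence computation for sublevel sets of a 1-d scalar field via the elder rule (a from-scratch sketch; this is not TTK's API, and real implementations handle ties and higher dimensions):

```python
def sublevel_persistence_1d(f):
    """0-dimensional persistence pairs (birth, death) for the sublevel
    sets of a 1-d scalar field f, by the elder rule: when two components
    merge, the one born at the higher minimum dies. The component of the
    global minimum never dies (death = +inf)."""
    order = sorted(range(len(f)), key=lambda i: f[i])
    comp = {}    # sample index -> representative (index of its minimum)
    pairs = []
    for i in order:            # sweep samples in increasing function value
        left, right = comp.get(i - 1), comp.get(i + 1)
        if left is None and right is None:
            comp[i] = i        # new component born at a local minimum
        elif left is not None and right is not None:
            # two components merge here; the younger minimum dies
            a, b = sorted((left, right), key=lambda j: f[j])
            pairs.append((f[b], f[i]))
            for j, r in comp.items():
                if r == b:
                    comp[j] = a
            comp[i] = a
        else:
            comp[i] = left if left is not None else right
    pairs.append((f[order[0]], float("inf")))  # the global minimum survives
    return sorted(pairs)

sublevel_persistence_1d([1, 3, 0, 2, 4])  # [(0, inf), (1, 3)]
```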

### scikit-learn

scikit-learn implements a grab-bag of these algorithms in its `sklearn.manifold` module, including Isomap, locally linear embedding, spectral embedding, MDS, and t-SNE.
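Conveniently, they all share the same estimator interface, so swapping methods is a one-line change (a minimal sketch; the S-curve data and neighbourhood sizes are arbitrary):

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import MDS, Isomap, LocallyLinearEmbedding, SpectralEmbedding

# 3-d points sampled from a 2-d manifold (the classic S-curve)
X, t = make_s_curve(n_samples=400, random_state=0)

embedders = {
    "isomap": Isomap(n_neighbors=10, n_components=2),
    "lle": LocallyLinearEmbedding(n_neighbors=10, n_components=2),
    "spectral": SpectralEmbedding(n_neighbors=10, n_components=2),
    "mds": MDS(n_components=2, random_state=0),
}
# Every estimator exposes the same fit_transform interface.
flat = {name: est.fit_transform(X) for name, est in embedders.items()}
```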

### tapkee

C++: Tapkee. Pro tip: even without coding, tapkee performs a long list of nice dimensionality reductions from the CLI, some of which are explicitly manifold learners (and the rest are matrix factorisations, which is not so different):

- Locally Linear Embedding and Kernel Locally Linear Embedding (LLE/KLLE)
- Neighborhood Preserving Embedding (NPE)
- Local Tangent Space Alignment (LTSA)
- Linear Local Tangent Space Alignment (LLTSA)
- Hessian Locally Linear Embedding (HLLE)
- Laplacian eigenmaps
- Locality Preserving Projections
- Diffusion map
- Isomap and landmark Isomap
- Multidimensional scaling and landmark Multidimensional scaling (MDS/lMDS)
- Stochastic Proximity Embedding (SPE)
- PCA and randomized PCA
- Kernel PCA (kPCA)
- t-SNE
- Barnes-Hut-SNE

## References

*International Conference on Machine Learning*, 214–23.

*Physical Review E* 86 (3): 036109.

*The Annals of Statistics* 39 (1): 48–81.

*Neural Computation* 15 (6): 1373–96.

*IEEE Transactions on Pattern Analysis and Machine Intelligence* 35: 1798–828.

*Communications of the ACM* 59 (8): 72–80.

*International Journal of Computer Vision* 76 (1): 1–12.

*Nature Computational Science* 2 (7): 433–42.

*IEEE Transactions on Signal Processing* 58 (12): 6140–55.

*IEEE Transactions on Pattern Analysis and Machine Intelligence* 42 (8): 1942–56.

*Proceedings of the National Academy of Sciences* 102 (21): 7426–31.

*Applied and Computational Harmonic Analysis*, Special Issue: Diffusion Maps and Wavelets, 21 (1): 5–30.

*Acta Numerica* 7 (January): 51–150.

*The Annals of Statistics* 12 (3): 793–815.

*The Annals of Statistics* 14 (1): 1–26.

*Proceedings of the National Academy of Sciences* 100 (10): 5591–96.

*Advances in Neural Information Processing Systems*, 473–80.

*Connection Science* 24 (1): 57–69.

*2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition*, 2:1735–42.

*The Annals of Statistics* 21 (2): 867–89.

*IEEE Transactions on Image Processing* 22 (6): 2138–50.

*Proceedings of the 16th International Conference on Neural Information Processing Systems*, 16:153–60. NIPS'03. Cambridge, MA, USA: MIT Press.

*The Annals of Statistics* 38 (4): 2465–98.

*Proceedings of the National Academy of Sciences* 105 (31): 10687–92.

*arXiv:1607.04331 [cs, q-bio, stat]*, July.

*Journal of Machine Learning Research* 9 (Nov): 2579–2605.

*Journal of Machine Learning Research*, 341–56.

*Bernoulli* 16 (1): 181–207.

*Science* 290 (5500): 2323–26.

*The Journal of Machine Learning Research* 4 (December): 119–55.

*Artificial Neural Networks – ICANN'97*, edited by Wulfram Gerstner, Alain Germond, Martin Hasler, and Jean-Daniel Nicoud, 583–88. Lecture Notes in Computer Science. Springer Berlin Heidelberg.

*Neural Computation* 10 (5): 1299–319.

*Proceedings of the 26th Annual International Conference on Machine Learning*, 937–44. ICML '09. New York, NY, USA: ACM.

*Proceedings of the National Academy of Sciences* 108 (41): 16916–21.

*Computational Learning Theory*, edited by Paul Fischer and Hans Ulrich Simon, 214–29. Lecture Notes in Computer Science 1572. Springer Berlin Heidelberg.

*IEEE Transactions on Image Processing* 19 (1): 174–84.

*Advances in Neural Information Processing Systems 21*, 1561–68. Curran Associates, Inc.

*Science*, 2000.

*Proceedings of IJCAI, 2017*.

*Proceedings of the Twenty-First International Conference on Machine Learning*, 106–6. ICML '04. New York, NY, USA: ACM.

*Advances in Neural Information Processing Systems 13*, edited by T. K. Leen, T. G. Dietterich, and V. Tresp, 46:675–81. MIT Press.

*The Journal of Machine Learning Research* 11: 2175–98.

*IEEE Transactions on Pattern Analysis and Machine Intelligence* 38 (3): 504–17.

*ICML 2012*.

*Data Mining and Knowledge Discovery* 22 (3): 340–71.

*Proceedings of European Conference on Computer Vision*.
