Statistics, computational complexity thereof



Statistical inference when computation isn’t free and memory isn’t infinite; what does this tell us about what is learnable?

Lots of statistics is already concerned with complexity scaling; in Gaussian process regression, for example, we care a lot about avoiding naïve methods that scale as \(\mathcal{O}(N^3)\). In optimisation we care greatly about the trade-off between how closely we approach the optimum and how many steps it takes to get there.
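
As a concrete illustration (my own sketch, not anything from the references below): the exact Gaussian-process posterior mean needs an \(N \times N\) solve, while a random-Fourier-feature approximation replaces it with a \(D \times D\) one, trading some accuracy for a much cheaper computation. The kernel, lengthscale and noise level here are arbitrary choices.

```python
# Sketch: O(N^3) exact GP regression vs. an O(N D^2 + D^3)
# random-Fourier-feature approximation (Rahimi & Recht style).
import numpy as np

rng = np.random.default_rng(0)
N, D = 2000, 200                         # N data points, D random features
X = rng.uniform(-3, 3, size=(N, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(N)
noise, lengthscale = 0.1, 0.5

# Exact GP posterior mean: solve (K + sigma^2 I) alpha = y, an O(N^3) solve.
sqdist = (X - X.T) ** 2
K = np.exp(-0.5 * sqdist / lengthscale**2)
alpha = np.linalg.solve(K + noise**2 * np.eye(N), y)

# Random Fourier features approximating the same RBF kernel:
# form a D x D ridge system instead, costing O(N D^2 + D^3).
W = rng.standard_normal((1, D)) / lengthscale
b = rng.uniform(0, 2 * np.pi, size=D)
Phi = np.sqrt(2.0 / D) * np.cos(X @ W + b)      # N x D feature matrix
w = np.linalg.solve(Phi.T @ Phi + noise**2 * np.eye(D), Phi.T @ y)

# Both give a posterior-mean prediction at the training inputs;
# the discrepancy is the price paid for the cheaper solve.
print(np.max(np.abs(K @ alpha - Phi @ w)))
```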

There is a converse question: how good a model can you fit while holding a computational/memory budget fixed? General results of that flavour are what I might collate here.
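
For a toy example of the fixed-budget framing (again my own, not taken from any of the papers below): with \(\mathcal{O}(1)\) memory and a single pass over the data, Welford's update recovers the mean and variance exactly, whereas something like the exact median is known to need more than constant memory in one pass.

```python
# Toy illustration of inference under a fixed memory budget:
# one-pass mean/variance with O(1) state, compared against the batch answer.
import numpy as np

def welford(stream):
    """One-pass mean and sample variance using constant memory."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, m2 / (n - 1)

rng = np.random.default_rng(1)
data = rng.normal(loc=3.0, scale=2.0, size=100_000)
print(welford(data))                   # streaming, bounded memory
print(data.mean(), data.var(ddof=1))   # batch, O(N) memory
```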

For now there are some fun ones about large models.

References

Bossaerts, Peter, Nitin Yadav, and Carsten Murawski. 2019. “Uncertainty and Computational Complexity.” Philosophical Transactions of the Royal Society B: Biological Sciences 374 (1766): 20180138. https://doi.org/10.1098/rstb.2018.0138.
Frey, B.J., and Nebojsa Jojic. 2005. “A Comparison of Algorithms for Inference and Learning in Probabilistic Graphical Models.” IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (9): 1392–1416. https://doi.org/10.1109/TPAMI.2005.169.
Jordan, Michael I., and Yair Weiss. 2002. “Probabilistic Inference in Graphical Models.” Handbook of Neural Networks and Brain Theory. http://mlg.eng.cam.ac.uk/zoubin/course03/hbtnn2e-I.pdf.
Moshkovitz, Dana, and Michal Moshkovitz. 2017. “Mixing Implies Lower Bounds for Space Bounded Learning.” In Conference on Learning Theory, 1516–66. http://proceedings.mlr.press/v65/moshkovitz17a.html.
