Minimum description length


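A formalisation of Occam’s razor: among candidate models, prefer the one that gives the shortest total description of the data, counting the bits needed to describe the model itself as well as the bits needed to encode the data given that model. I see it invoked in Bayesian model selection.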
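To make the idea concrete, here is a minimal sketch of a crude two-part code comparison for a binary sequence. It is an illustration under stated assumptions, not the method of any of the references below: the function names are hypothetical, and the parameter cost of (1/2) log2(n) bits per free parameter is the usual asymptotic approximation rather than an exact code length.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def description_lengths(bits):
    """Approximate two-part code lengths, in bits, for two candidate models."""
    n = len(bits)
    p_hat = sum(bits) / n
    # Model "fair": a fair coin with no free parameters, so each symbol
    # costs exactly one bit.
    fair = float(n)
    # Model "fitted": Bernoulli(p_hat). The data cost is n * H(p_hat); the
    # model cost is ASSUMED to be (1/2) * log2(n) bits for the one parameter.
    fitted = 0.5 * math.log2(n) + n * binary_entropy(p_hat)
    return {"fair": fair, "fitted": fitted}


def mdl_choice(bits):
    """Return the name of the model with the shorter total description."""
    lengths = description_lengths(bits)
    return min(lengths, key=lengths.get)


skewed = [1] * 90 + [0] * 10   # mostly ones: worth paying for a parameter
balanced = [1, 0] * 50         # half ones: the extra parameter buys nothing

print(mdl_choice(skewed), description_lengths(skewed))
print(mdl_choice(balanced), description_lengths(balanced))
```

On the skewed sequence the fitted model wins, because the saving in data bits outweighs the cost of stating the parameter; on the balanced sequence the parameter is pure overhead, so the simpler model is preferred.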

Barron, Andrew R. 1991. “Complexity Regularization with Application to Artificial Neural Networks.” In Nonparametric Functional Estimation and Related Topics, edited by George Roussas, 561–76. NATO ASI Series 335. Springer Netherlands. https://doi.org/10.1007/978-94-011-3222-0_42.

Barron, A. R., and T. M. Cover. 1991. “Minimum Complexity Density Estimation.” IEEE Transactions on Information Theory 37 (4): 1034–54. https://doi.org/10.1109/18.86996.

Barron, A., J. Rissanen, and Bin Yu. 1998. “The Minimum Description Length Principle in Coding and Modeling.” IEEE Transactions on Information Theory 44 (6): 2743–60. https://doi.org/10.1109/18.720554.

Grünwald, Peter. 1996. “A Minimum Description Length Approach to Grammar Inference.” In Connectionist, Statistical, and Symbolic Approaches to Learning for Natural Language Processing, 1040:203–16. Lecture Notes in Computer Science. London, UK: Springer-Verlag. http://dl.acm.org/citation.cfm?id=646314.688520.

Grünwald, Peter D. 2004. “A Tutorial Introduction to the Minimum Description Length Principle.” Advances in Minimum Description Length: Theory and Applications, June, 23–81. http://arxiv.org/abs/math/0406077.

———. 2007. The Minimum Description Length Principle. Cambridge, Mass.: MIT Press.

Hansen, Mark H., and Bin Yu. 2001. “Model Selection and the Principle of Minimum Description Length.” Journal of the American Statistical Association 96 (454): 746–74. http://www.jstor.org/stable/2670311.

Legg, Shane. 2006. “Is There an Elegant Universal Theory of Prediction?” In Algorithmic Learning Theory, edited by José L. Balcázar, Philip M. Long, and Frank Stephan, 274–87. Lecture Notes in Computer Science 4264. Springer Berlin Heidelberg. http://link.springer.com/chapter/10.1007/11894841_23.

Mavromatis, Panayotis. 2009. “Minimum Description Length Modelling of Musical Structure.” Journal of Mathematics and Music 3 (3): 117–36. https://doi.org/10.1080/17459730903313122.

Rissanen, J. 1984. “Universal Coding, Information, Prediction, and Estimation.” IEEE Transactions on Information Theory 30 (4): 629–36. https://doi.org/10.1109/TIT.1984.1056936.

Solomonoff, R. J. 1964a. “A Formal Theory of Inductive Inference. Part I.” Information and Control 7 (1): 1–22. https://doi.org/10.1016/S0019-9958(64)90223-2.

———. 1964b. “A Formal Theory of Inductive Inference. Part II.” Information and Control 7 (2): 224–54. https://doi.org/10.1016/S0019-9958(64)90131-7.

Sterkenburg, Tom F. 2016. “Solomonoff Prediction and Occam’s Razor.” Philosophy of Science 83 (4): 459–79. https://doi.org/10.1086/687257.

Vitányi, Paul M. 2006. “Meaningful Information.” IEEE Transactions on Information Theory 52 (10): 4617–26. https://doi.org/10.1109/TIT.2006.881729.