Probabilistic graphical models



Judea Pearl performing graph surgery

The term graphical model, in the context of statistics, means a particular thing: a family of ways of relating inference in multivariate models (calculating marginal and conditional probabilities) to graphs that describe how probabilities factorise. Graph here is in the sense of network theory, i.e. a collection of nodes connected by edges. Here are some of those:

Loeliger’s 2004 zoo of the predominant graphical models

It turns out that switching back and forth between these different formalisms makes some things easier to do and, if you are lucky, also easier to understand. Within this area, there are several specialties and a lot of material. This is a landing page pointing to actual content.

Content on this theme is scattered across graphical models in inference, learning graphs from data, diagramming graphical models, learning causation from data plus graphs, quantum graphical models, and yet more pages.

Barber (2012)’s taxonomy of graphical models

Introductory texts

Barber (2012) and Steffen L. Lauritzen (1996) are rigorous introductions. Murphy (2012) gives a minimal introduction intermixed with particular related methods, so it takes you straight to applications, although personally I found that confusing. For use in causality, Pearl (2009) and Spirtes, Glymour, and Scheines (2001) are readable.

People recommend Koller and Friedman (2009) to me. It is probably the most comprehensive text, but I found it too comprehensive, to the point that it was hard to see the forest for the trees. It may be better as a way to deepen your understanding once you already know what is going on.

What are plates?

Invented in Buntine (1994), plate notation is how we introduce multiple variables with a regular, i.e. conditionally independent, relation to existing variables. Plates are extremely important if you want to observe more than one data point. AFAICT, really digging deep into data as just another node is what makes Koller and Friedman (2009) a classic textbook. But the problem of estimating parameters from data is glossed over in lots of papers, especially early ones.
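In the simplest case (a sketch in notation of my own, not necessarily Buntine’s), a plate labelled \(N\) drawn around a node \(x_n\) whose single parent \(\theta\) sits outside the plate is shorthand for \(N\) copies of that node, all conditionally independent given \(\theta\):
\[
p(\theta, x_1, \dots, x_N) = p(\theta) \prod_{n=1}^{N} p(x_n \mid \theta).
\]
Nested plates give the doubly indexed products that show up in hierarchical models.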

Concretely, consider Dustin Tran’s example of the kind of dimensionality we are working with:

Dustin Tran: A hierarchical model, with latent variables \(\alpha_k\) defined locally per group and latent variables \(\phi\) defined globally to be shared across groups.

We are motivated by hierarchical models[…]. Formally, let \(y_{nk}\) be the \(n^{\text{th}}\) data point in group \(k\), with a total of \(N_{k}\) data points in group \(k\) and \(K\) many groups. We model the data using local latent variables \(\alpha_{k}\) associated to a group \(k\), and using global latent variables \(\phi\) which are shared across groups.[…] (Figure) The posterior distribution of local variables \(\alpha_{k}\) and global variables \(\phi\) is
\[
p(\alpha, \phi \mid \mathbf{y}) \propto p(\phi \mid \mathbf{y}) \prod_{k=1}^{K}\left[p\left(\alpha_{k} \mid \beta\right) \prod_{n=1}^{N_{k}} p\left(y_{nk} \mid \alpha_{k}, \phi\right)\right]
\]
The benefit of distributed updates over the independent factors is immediate. For example, suppose the data consists of 1,000 data points per group (with 5,000 groups); we model it with 2 latent variables per group and 20 global latent variables.

How many dimensions are we integrating over now? 10,020. The number of intermediate dimensions in our data grows very rapidly as we add observations, even if the final marginal is of low dimension. However, some of those dimensions are independent of others, and so may be factored away.
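The count follows directly from the numbers in the quoted example. A quick check, with variable names that are my own and purely illustrative:

```python
# Numbers taken from the quoted example; the variable names are illustrative.
n_groups = 5_000           # K
n_per_group = 1_000        # N_k, the same for every group in this example
local_dims_per_group = 2   # dimension of each alpha_k
global_dims = 20           # dimension of phi

# Every local latent variable plus every global one is a dimension to integrate over.
latent_dims = n_groups * local_dims_per_group + global_dims
n_observations = n_groups * n_per_group

print(latent_dims)      # 10020
print(n_observations)   # 5000000
```

In the factorisation quoted above, each bracketed factor involves only one \(\alpha_k\) together with \(\phi\), so for fixed \(\phi\) the 10,000 local dimensions split into 5,000 separate 2-dimensional problems. That is the sense in which dimensions can be factored away.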

Directed graphs

Graphs of directed conditional independence are a convenient formalism for many models. These are also called Bayes nets, presumably because the relationships encoded in these graphs have utility in the automatic application of Bayes’ rule. These models have a natural interpretation in terms of causation and structural models. See directed graphical models.
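As a minimal sketch of what the graph buys us, here is a three-node chain A → B → C over binary variables. The probability tables are made up for illustration. The graph says the joint factorises as p(a) p(b | a) p(c | b), from which Bayes’ rule and the implied conditional independence can be checked by brute-force enumeration:

```python
from itertools import product

# Hypothetical conditional probability tables for a binary chain A -> B -> C.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # p_c_given_b[b][c]

# The graph says the joint factorises as p(a) p(b | a) p(c | b).
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product((0, 1), repeat=3)
}


def marginal(keep):
    """Sum the joint over every variable whose position is not in `keep`."""
    out = {}
    for assignment, prob in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out


# Bayes' rule: p(a | c=1) = p(a, c=1) / p(c=1).
p_ac = marginal((0, 2))
p_c = marginal((2,))
posterior_a_given_c1 = {a: p_ac[(a, 1)] / p_c[(1,)] for a in (0, 1)}

# Conditional independence implied by the chain: p(a, c | b) = p(a | b) p(c | b).
p_ab = marginal((0, 1))
p_bc = marginal((1, 2))
p_b = marginal((1,))
max_gap = max(
    abs(
        joint[(a, b, c)] / p_b[(b,)]
        - (p_ab[(a, b)] / p_b[(b,)]) * (p_bc[(b, c)] / p_b[(b,)])
    )
    for a, b, c in product((0, 1), repeat=3)
)

print(posterior_a_given_c1)
print(max_gap)  # zero up to floating-point error
```

Enumeration like this costs exponentially many terms in the number of variables, which is why the inference algorithms that exploit the graph structure matter.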

Undirected, a.k.a. Markov graphs

a.k.a. Markov random fields, Markov random networks… These have a natural interpretation in terms of energy-based models. See undirected graphical models.
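To make the energy-based reading concrete, here is a toy sketch: a three-spin chain with a pairwise energy, normalised by brute force. The coupling value and the chain itself are assumptions chosen for illustration. Nodes that are not joined by an edge come out conditionally independent given the node that separates them:

```python
from itertools import product
from math import exp

# A hypothetical 3-node chain x0 - x1 - x2 with spins in {-1, +1}.
edges = [(0, 1), (1, 2)]
coupling = 0.8  # illustrative interaction strength


def energy(x):
    """Pairwise energy: aligned neighbours lower the energy."""
    return -coupling * sum(x[i] * x[j] for i, j in edges)


states = list(product((-1, 1), repeat=3))
unnormalised = {x: exp(-energy(x)) for x in states}
Z = sum(unnormalised.values())  # partition function, by brute force
p = {x: w / Z for x, w in unnormalised.items()}


def cond_gap(x1):
    """Largest deviation of p(x0, x2 | x1) from p(x0 | x1) p(x2 | x1)."""
    p_x1 = sum(p[x] for x in states if x[1] == x1)
    gap = 0.0
    for x0, x2 in product((-1, 1), repeat=2):
        p_x0 = sum(p[x] for x in states if x[1] == x1 and x[0] == x0) / p_x1
        p_x2 = sum(p[x] for x in states if x[1] == x1 and x[2] == x2) / p_x1
        gap = max(gap, abs(p[(x0, x1, x2)] / p_x1 - p_x0 * p_x2))
    return gap


# x0 and x2 are not neighbours, so they are independent given x1.
max_gap = max(cond_gap(-1), cond_gap(1))
print(max_gap)  # zero up to floating-point error
```

The brute-force partition function is only feasible for toy models; for anything larger, computing or avoiding Z is most of the difficulty.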

Factor graphs

A unifying formalism for the directed and undirected graphical models. Simpler in some ways, harder in others. See factor graphs.
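In symbols (the notation here is my own choice), a factor graph draws one node per variable and one node per factor, and asserts
\[
p(x_1, \dots, x_n) = \frac{1}{Z} \prod_{a} f_a\left(x_{\partial a}\right),
\]
where \(x_{\partial a}\) denotes the variables adjacent to factor node \(a\). For a directed model the factors can be taken to be the conditionals \(p(x_i \mid \operatorname{pa}(x_i))\) with \(Z = 1\); for an undirected model they are the clique potentials.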

Inference on

A key use of this graphical structure is that it can make inference local, in that you can have different compute nodes which each examine part of the data or part of the model and pass messages back and forth to do inference over the entire thing. It is easy to say this, but making practical and performant algorithms this way is… well, it is a whole field. See graphical models in inference.
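The simplest instance of this idea is sum-product message passing on a chain. The following is a sketch under assumed, made-up potentials, not a general-purpose implementation. Each node only ever combines a message from its left neighbour and one from its right, yet the resulting marginals match brute-force enumeration over all joint states:

```python
from itertools import product

# Hypothetical potentials for a chain x0 - x1 - x2 - x3 of binary variables.
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0], [1.0, 1.5]]
pair = [[[2.0, 1.0], [1.0, 2.0]]] * 3  # pair[i][x_i][x_{i+1}], favours agreement
n = len(unary)


def normalise(v):
    total = sum(v)
    return [x / total for x in v]


# Forward messages: fwd[i] summarises everything to the left of node i.
fwd = [[1.0, 1.0] for _ in range(n)]
for i in range(1, n):
    fwd[i] = [
        sum(fwd[i - 1][a] * unary[i - 1][a] * pair[i - 1][a][b] for a in (0, 1))
        for b in (0, 1)
    ]

# Backward messages: bwd[i] summarises everything to the right of node i.
bwd = [[1.0, 1.0] for _ in range(n)]
for i in range(n - 2, -1, -1):
    bwd[i] = [
        sum(pair[i][a][b] * unary[i + 1][b] * bwd[i + 1][b] for b in (0, 1))
        for a in (0, 1)
    ]

# Each marginal needs only the local potential and the two incoming messages.
marginals = [
    normalise([fwd[i][a] * unary[i][a] * bwd[i][a] for a in (0, 1)])
    for i in range(n)
]


def weight(x):
    """Unnormalised probability of a full joint state."""
    w = 1.0
    for i in range(n):
        w *= unary[i][x[i]]
    for i in range(n - 1):
        w *= pair[i][x[i]][x[i + 1]]
    return w


# Brute-force check: enumerate all 2**n joint states.
states = list(product((0, 1), repeat=n))
Z = sum(weight(x) for x in states)
brute = [
    [sum(weight(x) for x in states if x[i] == a) / Z for a in (0, 1)]
    for i in range(n)
]
```

The message-passing cost grows linearly with the length of the chain, while the enumeration grows exponentially. Graphs with loops are where it stops being this easy.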

Implementations

All of the probabilistic programming languages end up needing to account for graphical model structure in practice, so maybe start there.

References

Altun, Yasemin, Alex J. Smola, and Thomas Hofmann. 2004. “Exponential Families for Conditional Random Fields.” In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, 2–9. UAI ’04. Arlington, Virginia, United States: AUAI Press.
Aragam, Bryon, Jiaying Gu, and Qing Zhou. 2017. “Learning Large-Scale Bayesian Networks with the Sparsebn Package.” arXiv:1703.04025 [Cs, Stat], March.
Aragam, Bryon, and Qing Zhou. 2015. “Concave Penalized Estimation of Sparse Gaussian Bayesian Networks.” Journal of Machine Learning Research 16: 2273–2328.
Aral, Sinan, Lev Muchnik, and Arun Sundararajan. 2009. “Distinguishing Influence-Based Contagion from Homophily-Driven Diffusion in Dynamic Networks.” Proceedings of the National Academy of Sciences 106 (51): 21544–49.
Arnold, Barry C., Enrique Castillo, and Jose M. Sarabia. 1999. Conditional Specification of Statistical Models. Springer Science & Business Media.
Baddeley, A. J., and Marie-Colette NM Van Lieshout. 1995. “Area-Interaction Point Processes.” Annals of the Institute of Statistical Mathematics 47 (4): 601–19.
Baddeley, A. J., Marie-Colette NM Van Lieshout, and J. Møller. 1996. “Markov Properties of Cluster Processes.” Advances in Applied Probability 28 (2): 346–55.
Baddeley, Adrian J, Jesper Møller, and Rasmus Plenge Waagepetersen. 2000. “Non- and Semi-Parametric Estimation of Interaction in Inhomogeneous Point Patterns.” Statistica Neerlandica 54 (3): 329–50.
Baddeley, Adrian, and Jesper Møller. 1989. “Nearest-Neighbour Markov Point Processes and Random Sets.” International Statistical Review / Revue Internationale de Statistique 57 (2): 89–121.
Barber, David. 2012. Bayesian Reasoning and Machine Learning. Cambridge; New York: Cambridge University Press.
Bareinboim, Elias, Jin Tian, and Judea Pearl. 2014. “Recovering from Selection Bias in Causal and Statistical Inference.” In AAAI, 2410–16.
Bartolucci, Francesco, and Julian Besag. 2002. “A Recursive Algorithm for Markov Random Fields.” Biometrika 89 (3): 724–30.
Besag, Julian. 1974. “Spatial Interaction and the Statistical Analysis of Lattice Systems.” Journal of the Royal Statistical Society. Series B (Methodological) 36 (2): 192–236.
———. 1975. “Statistical Analysis of Non-Lattice Data.” Journal of the Royal Statistical Society. Series D (The Statistician) 24 (3): 179–95.
———. 1986. “On the Statistical Analysis of Dirty Pictures.” Journal of the Royal Statistical Society. Series B (Methodological) 48 (3): 259–302.
Bishop, Christopher M. 2006. Pattern Recognition and Machine Learning. Information Science and Statistics. New York: Springer.
Blake, Andrew, Pushmeet Kohli, and Carsten Rother, eds. 2011. Markov Random Fields for Vision and Image Processing. Cambridge, Mass: MIT Press.
Bloniarz, Adam, Hanzhong Liu, Cun-Hui Zhang, Jasjeet Sekhon, and Bin Yu. 2015. “Lasso Adjustments of Treatment Effect Estimates in Randomized Experiments.” arXiv:1507.03652 [Math, Stat], July.
Boyd, Stephen. 2010. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers. Vol. 3. Now Publishers Inc.
Brodersen, Kay H., Fabian Gallusser, Jim Koehler, Nicolas Remy, and Steven L. Scott. 2015. “Inferring Causal Impact Using Bayesian Structural Time-Series Models.” The Annals of Applied Statistics 9 (1): 247–74.
Bu, Yunqi, and Johannes Lederer. 2017. “Integrating Additional Knowledge Into Estimation of Graphical Models.” arXiv:1704.02739 [Stat], April.
Bühlmann, Peter, Markus Kalisch, and Lukas Meier. 2014. “High-Dimensional Statistics with a View Toward Applications in Biology.” Annual Review of Statistics and Its Application 1 (1): 255–78.
Bühlmann, Peter, Philipp Rütimann, and Markus Kalisch. 2013. “Controlling False Positive Selections in High-Dimensional Regression and Causal Inference.” Statistical Methods in Medical Research 22 (5): 466–92.
Buntine, W. L. 1994. “Operations for Learning with Graphical Models.” Journal of Artificial Intelligence Research 2 (1): 159–225.
Celeux, Gilles, Florence Forbes, and Nathalie Peyrard. 2003. “EM Procedures Using Mean Field-Like Approximations for Markov Model-Based Image Segmentation.” Pattern Recognition 36 (1): 131–44.
Cevher, Volkan, Marco F. Duarte, Chinmay Hegde, and Richard Baraniuk. 2009. “Sparse Signal Recovery Using Markov Random Fields.” In Advances in Neural Information Processing Systems, 257–64. Curran Associates, Inc.
Charniak, Eugene. 1991. “Bayesian Networks Without Tears.” AI Magazine 12 (4): 50.
Christakis, Nicholas A., and James H. Fowler. 2007. “The Spread of Obesity in a Large Social Network over 32 Years.” New England Journal of Medicine 357 (4): 370–79.
Clifford, P. 1990. “Markov random fields in statistics.” In Disorder in Physical Systems: A Volume in Honour of John Hammersley, edited by G. R. Grimmett and D. J. A. Welsh. Oxford, England; New York: Oxford University Press.
Crisan, Dan, and Joaquín Míguez. 2014. “Particle-Kernel Estimation of the Filter Density in State-Space Models.” Bernoulli 20 (4): 1879–929.
Da Costa, Lancelot, Karl Friston, Conor Heins, and Grigorios A. Pavliotis. 2021. “Bayesian Mechanics for Stationary Processes.” arXiv:2106.13830 [Math-Ph, Physics:nlin, q-Bio], June.
Dawid, A. P. 2001. “Separoids: A Mathematical Framework for Conditional Independence and Irrelevance.” Annals of Mathematics and Artificial Intelligence 32 (1-4): 335–72.
Dawid, A. Philip. 1979. “Conditional Independence in Statistical Theory.” Journal of the Royal Statistical Society. Series B (Methodological) 41 (1): 1–31.
———. 1980. “Conditional Independence for Statistical Operations.” The Annals of Statistics 8 (3): 598–617.
De Luna, Xavier, Ingeborg Waernbaum, and Thomas S. Richardson. 2011. “Covariate Selection for the Nonparametric Estimation of an Average Treatment Effect.” Biometrika, October, asr041.
Edwards, David, and Smitha Ankinakatte. 2015. “Context-Specific Graphical Models for Discrete Longitudinal Data.” Statistical Modelling 15 (4): 301–25.
Fixx, James F. 1977. Games for the superintelligent. London: Muller.
Forbes, F., and N. Peyrard. 2003. “Hidden Markov Random Field Model Selection Criteria Based on Mean Field-Like Approximations.” IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (9): 1089–1101.
Frey, B.J., and Nebojsa Jojic. 2005. “A Comparison of Algorithms for Inference and Learning in Probabilistic Graphical Models.” IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (9): 1392–1416.
Frey, Brendan J. 2003. “Extending Factor Graphs so as to Unify Directed and Undirected Graphical Models.” In Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, 257–64. UAI’03. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Fridman, Arthur. 2003. “Mixed Markov Models.” Proceedings of the National Academy of Sciences 100 (14): 8092–96.
Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. 2008. “Sparse Inverse Covariance Estimation with the Graphical Lasso.” Biostatistics 9 (3): 432–41.
Friel, Nial, and Håvard Rue. 2007. “Recursive Computing and Simulation-Free Inference for General Factorizable Models.” Biometrika 94 (3): 661–72.
Geyer, Charles J. 1991. “Markov Chain Monte Carlo Maximum Likelihood.”
Geyer, Charles J., and Jesper Møller. 1994. “Simulation Procedures and Likelihood Inference for Spatial Point Processes.” Scandinavian Journal of Statistics, 359–73.
Goldberg, David A. 2013. “Higher Order Markov Random Fields for Independent Sets.” arXiv:1301.1762 [Math-Ph], January.
Grenander, Ulf. 1989. “Advances in Pattern Theory.” The Annals of Statistics 17 (1): 1–30.
Griffeath, David. 1976. “Introduction to Random Fields.” In Denumerable Markov Chains, 425–58. Graduate Texts in Mathematics 40. Springer New York.
Gu, Jiaying, Fei Fu, and Qing Zhou. 2014. “Adaptive Penalized Estimation of Directed Acyclic Graphs From Categorical Data.” arXiv:1403.2310 [Stat], March.
Häggström, Olle, Marie-Colette N. M. van Lieshout, and Jesper Møller. 1999. “Characterization Results and Markov Chain Monte Carlo Algorithms Including Exact Simulation for Some Spatial Point Processes.” Bernoulli 5 (4): 641–58.
Heckerman, David, David Maxwell Chickering, Christopher Meek, Robert Rounthwaite, and Carl Kadie. 2000. “Dependency Networks for Inference, Collaborative Filtering, and Data Visualization.” Journal of Machine Learning Research 1 (Oct): 49–75.
Jensen, Jens Ledet, and Jesper Møller. 1991. “Pseudolikelihood for Exponential Family Models of Spatial Point Processes.” The Annals of Applied Probability 1 (3): 445–61.
Jordan, Michael I. 2004. “Graphical Models.” Statistical Science 19 (1): 140–55.
Jordan, Michael I., Zoubin Ghahramani, Tommi S. Jaakkola, and Lawrence K. Saul. 1999. “An Introduction to Variational Methods for Graphical Models.” Machine Learning 37 (2): 183–233.
Jordan, Michael Irwin. 1999. Learning in Graphical Models. Cambridge, Mass.: MIT Press.
Jordan, Michael I., and Yair Weiss. 2002a. “Graphical Models: Probabilistic Inference.” The Handbook of Brain Theory and Neural Networks, 490–96.
———. 2002b. “Probabilistic Inference in Graphical Models.” Handbook of Neural Networks and Brain Theory.
Kalisch, Markus, and Peter Bühlmann. 2007. “Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm.” Journal of Machine Learning Research 8 (May): 613–36.
Kindermann, Ross P., and J. Laurie Snell. 1980. “On the Relation Between Markov Random Fields and Social Networks.” The Journal of Mathematical Sociology 7 (1): 1–13.
Kindermann, Ross, and J. Laurie Snell. 1980. Markov Random Fields and Their Applications. Vol. 1. Contemporary Mathematics. Providence, Rhode Island: American Mathematical Society.
Kjærulff, Uffe B., and Anders L. Madsen. 2008. Bayesian Networks and Influence Diagrams. Information Science and Statistics. New York, NY: Springer New York.
Koller, Daphne, and Nir Friedman. 2009. Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA: MIT Press.
Krämer, Nicole, Juliane Schäfer, and Anne-Laure Boulesteix. 2009. “Regularized Estimation of Large-Scale Gene Association Networks Using Graphical Gaussian Models.” BMC Bioinformatics 10 (1): 384.
Krause, Andreas, and Carlos Guestrin. 2009. “Optimal Value of Information in Graphical Models.” J. Artif. Int. Res. 35 (1): 557–91.
Kschischang, F.R., B.J. Frey, and H.-A. Loeliger. 2001. “Factor Graphs and the Sum-Product Algorithm.” IEEE Transactions on Information Theory 47 (2): 498–519.
Lauritzen, S. L., and D. J. Spiegelhalter. 1988. “Local Computations with Probabilities on Graphical Structures and Their Application to Expert Systems.” Journal of the Royal Statistical Society. Series B (Methodological) 50 (2): 157–224.
Lauritzen, Steffen L. 1996. Graphical Models. Oxford Statistical Science Series. Clarendon Press.
Lavrenko, Victor, and Jeremy Pickens. 2003a. “Music Modeling with Random Fields.” In Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 389. ACM Press.
———. 2003b. “Polyphonic Music Modeling with Random Fields.” In Proceedings of the Eleventh ACM International Conference on Multimedia, 120. ACM Press.
LeCun, Yann, Sumit Chopra, Raia Hadsell, M. Ranzato, and F. Huang. 2006. “A Tutorial on Energy-Based Learning.” In Predicting Structured Data.
Lederer, Johannes. 2016. “Graphical Models for Discrete and Continuous Data.” arXiv:1609.05551 [Math, Stat], September.
Levine, Sergey. 2018. “Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review.” arXiv:1805.00909 [Cs, Stat], May.
Liu, Han, Fang Han, Ming Yuan, John Lafferty, and Larry Wasserman. 2012a. “The Nonparanormal SKEPTIC.” arXiv:1206.6488 [Cs, Stat], June.
———. 2012b. “High-Dimensional Semiparametric Gaussian Copula Graphical Models.” The Annals of Statistics 40 (4): 2293–2326.
Liu, Han, Kathryn Roeder, and Larry Wasserman. 2010. “Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models.” In Advances in Neural Information Processing Systems 23, edited by J. D. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, and A. Culotta, 1432–40. Curran Associates, Inc.
Loeliger, H.-A. 2004. “An Introduction to Factor Graphs.” IEEE Signal Processing Magazine 21 (1): 28–41.
Maathuis, Marloes H., and Diego Colombo. 2013. “A Generalized Backdoor Criterion.” arXiv Preprint arXiv:1307.5636.
Maddage, Namunu C., Haizhou Li, and Mohan S. Kankanhalli. 2006. “Music Structure Based Vector Space Retrieval.” In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 67. ACM Press.
Malioutov, Dmitry M., Jason K. Johnson, and Alan S. Willsky. 2006. “Walk-Sums and Belief Propagation in Gaussian Graphical Models.” Journal of Machine Learning Research 7 (October): 2031–64.
Mao, Yongyi, Frank R. Kschischang, and Brendan J. Frey. 2004. “Convolutional Factor Graphs As Probabilistic Models.” In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, 374–81. UAI ’04. Arlington, Virginia, United States: AUAI Press.
Marbach, Daniel, Robert J. Prill, Thomas Schaffter, Claudio Mattiussi, Dario Floreano, and Gustavo Stolovitzky. 2010. “Revealing Strengths and Weaknesses of Methods for Gene Network Inference.” Proceedings of the National Academy of Sciences 107 (14): 6286–91.
McCallum, Andrew. 2012. “Efficiently Inducing Features of Conditional Random Fields.” arXiv:1212.2504 [Cs, Stat], October.
Meinshausen, Nicolai, and Peter Bühlmann. 2006. “High-Dimensional Graphs and Variable Selection with the Lasso.” The Annals of Statistics 34 (3): 1436–62.
Mihalkova, Lilyana, and Raymond J. Mooney. 2007. “Bottom-up Learning of Markov Logic Network Structure.” In Proceedings of the 24th International Conference on Machine Learning, 625–32. ACM.
Mohan, Karthika, and Judea Pearl. 2018. “Consistent Estimation Given Missing Data.” In International Conference on Probabilistic Graphical Models, 284–95.
Montanari, Andrea. 2011. “Lecture Notes for Stat 375 Inference in Graphical Models.”
Morgan, Jonathan Scott, Iman Barjasteh, Cliff Lampe, and Hayder Radha. 2014. “The Entropy of Attention and Popularity in Youtube Videos.” arXiv:1412.1185 [Physics], December.
Murphy, Kevin P. 2012. Machine learning: a probabilistic perspective. 1 edition. Adaptive computation and machine learning series. Cambridge, MA: MIT Press.
Obermeyer, Fritz, Eli Bingham, Martin Jankowiak, Du Phan, and Jonathan P. Chen. 2020. “Functional Tensors for Probabilistic Programming.” arXiv:1910.10775 [Cs, Stat], March.
Osokin, A., D. Vetrov, and V. Kolmogorov. 2011. “Submodular Decomposition Framework for Inference in Associative Markov Networks with Global Constraints.” In 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1889–96.
Pearl, Judea. 1982. “Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach.” In Proceedings of the Second AAAI Conference on Artificial Intelligence, 133–36. AAAI’82. Pittsburgh, Pennsylvania: AAAI Press.
———. 1986. “Fusion, Propagation, and Structuring in Belief Networks.” Artificial Intelligence 29 (3): 241–88.
———. 2008. Probabilistic reasoning in intelligent systems: networks of plausible inference. Rev. 2. print., 12. [Dr.]. The Morgan Kaufmann series in representation and reasoning. San Francisco, Calif: Kaufmann.
———. 2009. Causality: Models, Reasoning and Inference. Cambridge University Press.
Pearl, Judea, Dan Geiger, and Thomas Verma. 1989. “Conditional Independence and Its Representations.” Kybernetika 25 (7): 33–44.
Pereda, E, R Q Quiroga, and J Bhattacharya. 2005. “Nonlinear Multivariate Analysis of Neurophysiological Signals.” Progress in Neurobiology 77 (1-2): 1–37.
Pickens, Jeremy, and Costas S. Iliopoulos. 2005. “Markov Random Fields and Maximum Entropy Modeling for Music Information Retrieval.” In ISMIR, 207–14. Citeseer.
Pollard, Dave. 2004. “Hammersley-Clifford Theorem for Markov Random Fields.”
Rabbat, Michael G., Mário A. T. Figueiredo, and Robert D. Nowak. 2008. “Network Inference from Co-Occurrences.” IEEE Transactions on Information Theory 54 (9): 4053–68.
Ranzato, M. 2013. “Modeling Natural Images Using Gated MRFs.” IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (9): 2206–22.
Ravikumar, Pradeep, Martin J. Wainwright, and John D. Lafferty. 2010. “High-Dimensional Ising Model Selection Using ℓ1-Regularized Logistic Regression.” The Annals of Statistics 38 (3): 1287–1319.
Reeves, R., and A. N. Pettitt. 2004. “Efficient Recursions for General Factorisable Models.” Biometrika 91 (3): 751–57.
Richardson, Matthew, and Pedro Domingos. 2006. “Markov Logic Networks.” Machine Learning 62 (1-2): 107–36.
Ripley, B. D., and F. P. Kelly. 1977. “Markov Point Processes.” Journal of the London Mathematical Society s2-15 (1): 188–92.
Sadeghi, Kayvan. 2020. “On Finite Exchangeability and Conditional Independence.” Electronic Journal of Statistics 14 (2): 2773–97.
Schmidt, Mark W., and Kevin P. Murphy. 2010. “Convex Structure Learning in Log-Linear Models: Beyond Pairwise Potentials.” In International Conference on Artificial Intelligence and Statistics, 709–16.
Shachter, Ross D. 1998. “Bayes-Ball: Rational Pastime (for Determining Irrelevance and Requisite Information in Belief Networks and Influence Diagrams).” In Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, 480–87. UAI’98. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.
Shalizi, Cosma Rohilla, and Edward McFowland III. 2016. “Controlling for Latent Homophily in Social Networks Through Inferring Latent Locations.” arXiv:1607.06565 [Physics, Stat], July.
Smith, David A., and Jason Eisner. 2008. “Dependency Parsing by Belief Propagation.” In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 145–56. Association for Computational Linguistics.
Spirtes, Peter, Clark Glymour, and Richard Scheines. 2001. Causation, Prediction, and Search. Second Edition. Adaptive Computation and Machine Learning. The MIT Press.
Studený, Milan. 1997. “A Recovery Algorithm for Chain Graphs.” International Journal of Approximate Reasoning, Uncertainty in AI (UAI’96) Conference, 17 (2–3): 265–93.
———. 2005. Probabilistic Conditional Independence Structures. Information Science and Statistics. London: Springer.
Studený, Milan, and Jiřina Vejnarová. 1998. “On Multiinformation Function as a Tool for Measuring Stochastic Dependence.” In Learning in Graphical Models, 261–97. Cambridge, Mass.: MIT Press.
Su, Ri-Qi, Wen-Xu Wang, and Ying-Cheng Lai. 2012. “Detecting Hidden Nodes in Complex Networks from Time Series.” Phys. Rev. E 85 (6): 065201.
Sutton, Charles, and Andrew McCallum. 2010. “An Introduction to Conditional Random Fields.” arXiv:1011.4088, November.
Tansey, Wesley, Oscar Hernan Madrid Padilla, Arun Sai Suggala, and Pradeep Ravikumar. 2015. “Vector-Space Markov Random Fields via Exponential Families.” In Journal of Machine Learning Research, 684–92.
Vetrov, Dmitry, and Anton Osokin. 2011. “Graph Preserving Label Decomposition in Discrete MRFs with Selfish Potentials.” In NIPS Workshop on Discrete Optimization in Machine Learning (DISCML NIPS).
Visweswaran, Shyam, and Gregory F. Cooper. 2014. “Counting Markov Blanket Structures.” arXiv:1407.2483 [Cs, Stat], July.
Wainwright, Martin J., and Michael I. Jordan. 2008. Graphical Models, Exponential Families, and Variational Inference. Vol. 1. Foundations and Trends® in Machine Learning. Now Publishers.
Wainwright, Martin, and Michael I Jordan. 2005. “A Variational Principle for Graphical Models.” In New Directions in Statistical Signal Processing. Vol. 155. MIT Press.
Wang, Chaohui, Nikos Komodakis, and Nikos Paragios. 2013. “Markov Random Field Modeling, Inference & Learning in Computer Vision & Image Understanding: A Survey.” Computer Vision and Image Understanding 117 (11): 1610–27.
Wasserman, Larry, Mladen Kolar, and Alessandro Rinaldo. 2013. “Estimating Undirected Graphs Under Weak Assumptions.” arXiv:1309.6933 [Cs, Math, Stat], September.
Weiss, Yair. 2000. “Correctness of Local Probability Propagation in Graphical Models with Loops.” Neural Computation 12 (1): 1–41.
Weiss, Yair, and William T. Freeman. 2001. “Correctness of Belief Propagation in Gaussian Graphical Models of Arbitrary Topology.” Neural Computation 13 (10): 2173–2200.
Winn, John M., and Christopher M. Bishop. 2005. “Variational Message Passing.” In Journal of Machine Learning Research, 661–94.
Wright, Sewall. 1934. “The Method of Path Coefficients.” The Annals of Mathematical Statistics 5 (3): 161–215.
Wu, Rui, R. Srikant, and Jian Ni. 2013. “Learning Loosely Connected Markov Random Fields.” Stochastic Systems 3 (2): 362–404.
Yedidia, Jonathan S., W.T. Freeman, and Y. Weiss. 2005. “Constructing Free-Energy Approximations and Generalized Belief Propagation Algorithms.” IEEE Transactions on Information Theory 51 (7): 2282–312.
Yedidia, J.S., W.T. Freeman, and Y. Weiss. 2003. “Understanding Belief Propagation and Its Generalizations.” In Exploring Artificial Intelligence in the New Millennium, edited by G. Lakemeyer and B. Nebel, 239–36. Morgan Kaufmann Publishers.
Zhang, Kun, Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. 2012. “Kernel-Based Conditional Independence Test and Application in Causal Discovery.” arXiv:1202.3775 [Cs, Stat], February.
Zhou, Mingyuan, Yulai Cong, and Bo Chen. 2017. “Augmentable Gamma Belief Networks,” 44.
