Diagramming and visualising graphical models

My need for this is conditionally dependent upon my deadline, given the subject matter.


On the art and science of algorithmic line drawings for representing graphical models, a routine need in statistics. The diagrams we need here are nearly flowchart-like, so I can sketch them with a flowchart tool if need be; but they are closely integrated with the equations of a particular statistical model, so I would like to keep both in the same system to avoid tedious and error-prone manual syncing. Further, there are a couple of things I would like handled which flowchart programs are bad at, such as plate notation and inline mathematical markup.

As always, I would like to export the resulting diagrams to a modern, widely supported vector format, which means SVG or PDF or, as a fallback, a format that can be converted to one of those, such as Adobe Illustrator, EPS or xfig.

Dagitty

dagitty (Textor et al. 2017) is an option for the browser and also for R. I’m not 100% sure how these two parts relate to each other. Documentation for the R version is hard to find. It is here.

ggdag extends dagitty to the ggplot ecosystem. The ggdag bias structure vignette shows off the useful explanatory diagrams available in ggdag and is also a good introduction to selection bias and causal DAGs themselves.

Shinydag provides a web interface.

Shinydag is tedious to install natively, especially given the poor documentation. However, it runs OK in Docker.

docker run -d -p 3838:3838 --name shinydag gerkelab/shinydag:latest

Now Shinydag is waiting for you at 127.0.0.1:3838/shinyDAG. It is a bit crashy and clunky. I’m not sure I prefer it to plain ggdag.

dagR

dagR (Breitling 2010) is another option for R:

After initializing a new DAG using a command line, the researcher can evaluate what associations are introduced by adjusting for covariables. Potentially biasing paths from exposure to outcome can be identified (see eFig. 1, demonstrating harmful adjustment using an example DAG from Fleischer and Diez Roux). Functions to conveniently add or remove nodes and arcs are included, as is a function checking introduced associations and biasing paths for all possible adjustment sets […] The graphics capabilities of R allow fairly straightforward programming of basic DAG drawing routines, while also supporting the interactive repositioning of nodes and arcs.

It additionally calculates adjustment sets and other useful functions of the graph.
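To make "biasing paths" and "harmful adjustment" concrete, here is a minimal sketch in plain Python. It is not dagR's API; the function names (`descendants`, `simple_paths`, `is_open`, `biasing_paths`) and the toy graphs are hypothetical, and the brute-force path enumeration is only sensible for small hand-drawn DAGs. It treats a biasing path as any open path between exposure and outcome that is not a directed causal path, which is a simplification.

```python
def descendants(edges, node):
    """Return every node reachable from `node` by following arrows."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(edges, start, goal):
    """Yield each simple path from start to goal, ignoring arrow direction."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    def walk(path):
        if path[-1] == goal:
            yield path
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([start])


def is_open(edges, path, adjusted):
    """Apply the d-separation blocking rules to a single path."""
    edge_set = set(edges)
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = (prev, node) in edge_set and (nxt, node) in edge_set
        if is_collider:
            # A collider blocks unless it, or one of its descendants, is adjusted for.
            if node not in adjusted and not (descendants(edges, node) & adjusted):
                return False
        elif node in adjusted:
            # Adjusting for a non-collider (chain or fork) blocks the path.
            return False
    return True


def biasing_paths(edges, exposure, outcome, adjusted=()):
    """List open paths that are not directed exposure-to-outcome paths."""
    adjusted, edge_set = set(adjusted), set(edges)
    found = []
    for path in simple_paths(edges, exposure, outcome):
        causal = all(step in edge_set for step in zip(path, path[1:]))
        if not causal and is_open(edges, path, adjusted):
            found.append(path)
    return found


# Confounding: Z -> X, Z -> Y, X -> Y. Adjusting for Z closes the back-door path.
confounded = [("Z", "X"), ("Z", "Y"), ("X", "Y")]
print(biasing_paths(confounded, "X", "Y"))         # [['X', 'Z', 'Y']]
print(biasing_paths(confounded, "X", "Y", {"Z"}))  # []

# Harmful adjustment: C is a common effect of X and Y.
collider = [("X", "Y"), ("X", "C"), ("Y", "C")]
print(biasing_paths(collider, "X", "Y"))           # []
print(biasing_paths(collider, "X", "Y", {"C"}))    # [['X', 'C', 'Y']]
```

The second example is the point: adjusting for the common effect C opens a path that was closed before, which is the kind of introduced association the quote describes.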

yEd

yEd is a low-key nerdview diagrammer.

yEd supports a wide variety of diagram types. In addition to the illustrated types, (BPMN Diagrams, Flowcharts, Family Trees, Semantic Networks, Social Networks, UML Class Diagrams) yEd also supports organization charts, mind maps, swimlane diagrams, Entity Relationship diagrams, and many more.

Found via a blog post by Jonas Kristoffer Lindeløv, which talks through the pluses and minuses:

yEd is purely graphical editing which is fast to work with and great for tweaking small details. A very handy yEd feature is its intelligent snapping to alignments and equal distances when positioning objects. Actually, I don’t understand why yEd almost never makes it to the “top 10 diagram/flowchart programs” lists.

A few things I learned: To make subscripts, you have to use HTML code.… it is not possible to do double-subscripts. Also, the double-edges node is made completely manually by placing an empty ellipse above another. I did not manage to align the 𝑊𝑀𝐶𝑖 label a bit lower in the node. A final limitation is that arrowhead sizes cannot be changed. You can, however, zoom. Therefore, your very first decision has to be the arrowhead size. Zoom so that it is appropriate and make your graphical model.

DiagrammeR

DiagrammeR is a generic graph visualisation package for R which can incidentally do graphical models. It is probably the most visually attractive option here, but it is blind to the specific needs of graphical models.

Mermaid

Another browser option. Mermaid is a flowcharting tool which can be pressed into service for DAGs. USP: code-driven diagrams with a syntax that aspires to be so basic that it is easier than point-and-click. It integrates with many Markdown editors, and has an online editor and a CLI.
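Because Mermaid diagrams are just text, they are easy to generate from whatever edge list the model code already holds. A minimal sketch, assuming the DAG is stored as (parent, child) pairs; the helper name `to_mermaid` is hypothetical, and it uses only basic Mermaid flowchart syntax (`graph LR`, `-->`, and double parentheses for circular nodes).

```python
def to_mermaid(edges, labels=None, direction="LR"):
    """Return Mermaid flowchart source for a DAG given as (parent, child) pairs."""
    labels = labels or {}
    # Collect nodes in order of first appearance so the output is stable.
    nodes = []
    for edge in edges:
        for node in edge:
            if node not in nodes:
                nodes.append(node)
    lines = [f"graph {direction}"]
    for node in nodes:
        # ((...)) draws a circle, the usual shape for a random variable.
        lines.append(f'    {node}(("{labels.get(node, node)}"))')
    for parent, child in edges:
        lines.append(f"    {parent} --> {child}")
    return "\n".join(lines)


# Hypothetical confounding triangle with a relabelled exposure node.
print(to_mermaid([("Z", "X"), ("Z", "Y"), ("X", "Y")], labels={"X": "exposure"}))
```

The printed text can be pasted into the online editor or a Markdown code block that Mermaid renders. As far as I can tell there is nothing here for plates, so this only covers plain DAGs.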

TETRAD

TETRAD is less for sketching idealised DAGs than for visualising and calculating with giant empirical DAGs. It’s written by eminent causal inference people.

Tetrad is a program which creates, simulates data from, estimates, tests, predicts with, and searches for causal and statistical models. The aim of the program is to provide sophisticated methods in a friendly interface requiring very little statistical sophistication of the user and no programming knowledge. It is not intended to replace flexible statistical programming systems such as Matlab, Splus or R. Tetrad is freeware that performs many of the functions in commercial programs such as Netica, Hugin, LISREL, EQS and other programs, and many discovery functions these commercial programs do not perform.

Tetrad is limited to models of categorical data (which can also be used for ordinal data) and to linear [sic] models (“structural equation models”) with a Normal probability distribution, and to a very limited class of time series models. The Tetrad programs describe causal models in three distinct parts or stages: a picture, representing a directed graph specifying hypothetical causal relations among the variables; a specification of the family of probability distributions and kinds of parameters associated with the graphical model; and a specification of the numerical values of those parameters.

NB: structural equation models are not required to be linear. Weird phrasing.
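The three-part description in the quote (a graph, a parametric family, and numerical parameter values) can be illustrated in a few lines of plain Python. This is not Tetrad's interface; it is a sketch of a linear-Gaussian structural equation model, and the names `EDGES`, `COEFS`, `NOISE_SD` and `simulate` as well as all the numbers are made up for illustration. It needs Python 3.9 or later for `graphlib`.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Part 1: the picture, a directed graph as (parent, child) pairs.
EDGES = [("Z", "X"), ("Z", "Y"), ("X", "Y")]

# Part 2: the family. Here each variable is a linear function of its
# parents plus independent Normal noise.

# Part 3: numerical values for the parameters of that family.
COEFS = {("Z", "X"): 0.8, ("Z", "Y"): -0.5, ("X", "Y"): 1.5}
NOISE_SD = {"Z": 1.0, "X": 0.5, "Y": 0.5}


def simulate(edges, coefs, noise_sd, n, seed=0):
    """Draw n samples from the linear-Gaussian model, parents before children."""
    rng = random.Random(seed)
    parents = {v: [] for v in noise_sd}
    for parent, child in edges:
        parents[child].append(parent)
    # TopologicalSorter takes a node -> predecessors mapping.
    order = list(TopologicalSorter(parents).static_order())
    rows = []
    for _ in range(n):
        row = {}
        for v in order:
            mean = sum(coefs[(p, v)] * row[p] for p in parents[v])
            row[v] = mean + rng.gauss(0.0, noise_sd[v])
        rows.append(row)
    return rows


samples = simulate(EDGES, COEFS, NOISE_SD, n=1000)
print(samples[0])
```

Swapping the line that computes `row[v]` for something non-linear would still give a structural equation model, which is the point of the NB above.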

Matplotlib

HOWTO: diagramming convnets using matplotlib in Python. Or: daft-pgm:

Daft is a Python package that uses matplotlib to render pixel-perfect probabilistic graphical models for publication in a journal or on the internet. With a short Python script and an intuitive model-building syntax you can design directed (Bayesian Networks, directed acyclic graphs) and undirected (Markov random fields) models and save them in any formats that matplotlib supports (including PDF, PNG, EPS and SVG).
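A sketch of what such a short Python script might look like for a tiny plate model. The model (a Normal mean and scale with N observations) is a made-up example, and the calls (`daft.PGM`, `add_node`, `add_edge`, `add_plate`, `render`, `savefig`) follow the daft-pgm interface as I remember it; the argument order in particular is an assumption, so check it against the installed version. The library is imported inside `draw` so that the model description can be inspected without it.

```python
# Hypothetical model spec: (name, label, x, y, observed)
NODES = [
    ("mu", r"$\mu$", 1.0, 2.0, False),
    ("sigma", r"$\sigma$", 2.0, 2.0, False),
    ("x", r"$x_n$", 1.5, 1.0, True),
]
EDGES = [("mu", "x"), ("sigma", "x")]
# Plate rectangle as [left, bottom, width, height], plus its caption.
PLATE = ([1.0, 0.5, 1.0, 1.0], r"$n = 1, \ldots, N$")


def draw(path="model.pdf"):
    """Render the spec with daft and save it in the format implied by `path`."""
    import daft  # provided by the daft-pgm package; assumed API, see lead-in

    pgm = daft.PGM()
    for name, label, x, y, observed in NODES:
        pgm.add_node(name, label, x, y, observed=observed)
    for parent, child in EDGES:
        pgm.add_edge(parent, child)
    rect, caption = PLATE
    pgm.add_plate(rect, label=caption)
    pgm.render()
    pgm.savefig(path)
```

Calling `draw("model.svg")` should give an SVG instead, since the file type is delegated to matplotlib. Node labels are ordinary matplotlib mathtext, which covers the inline mathematical markup requirement.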

Graphviz

Graphviz is nearly good, in that it supports graphing arbitrary networks, including DAGs. Inevitably, none of its fancy algorithms ever lays things out quite like I want. There is a macOS GUI and a cross-platform (WX) GUI called doteditor.

However, I would recommend dagitty over this; the syntax is nearly the same, but dagitty additionally accepts specification by a structural equation model.
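To show how close the two syntaxes are, here is a small sketch that writes the same edge list out both ways. The helper names `to_dot` and `to_dagitty` are hypothetical; the output uses only the most basic forms of each language (`digraph { A -> B; }` for Graphviz, `dag { A -> B }` with optional `[exposure]` and `[outcome]` tags for dagitty) and assumes node names are plain alphanumeric identifiers.

```python
def to_dot(edges):
    """Return Graphviz DOT source for a list of (parent, child) pairs."""
    body = "\n".join(f"    {a} -> {b};" for a, b in edges)
    return "digraph {\n" + body + "\n}"


def to_dagitty(edges, exposure=None, outcome=None):
    """Return dagitty model source, optionally tagging exposure and outcome."""
    lines = []
    if exposure is not None:
        lines.append(f"    {exposure} [exposure]")
    if outcome is not None:
        lines.append(f"    {outcome} [outcome]")
    lines += [f"    {a} -> {b}" for a, b in edges]
    return "dag {\n" + "\n".join(lines) + "\n}"


edges = [("Z", "X"), ("Z", "Y"), ("X", "Y")]
print(to_dot(edges))
print(to_dagitty(edges, exposure="X", outcome="Y"))
```

The DOT text can be fed to the `dot` command line tool, and the dagitty text can be pasted into the browser version's model code box.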

tikz

Laura Dietz and Jaakko Luttinen made tikz macros for drawing Bayes nets in LaTeX, tikz-bayesnet.
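One way to keep such a figure in sync with model code is to generate the tikz source. A minimal sketch, assuming the document preamble already loads the library with `\usetikzlibrary{bayesnet}`; the helper name `to_tikz_bayesnet` and the example model are hypothetical, and it emits only the `latent` and `obs` node styles and the `\edge` and `\plate` macros from tikz-bayesnet, with explicit coordinates rather than relative positioning.

```python
def to_tikz_bayesnet(nodes, edges, plate=None):
    """Return a tikzpicture using tikz-bayesnet node styles and macros.

    nodes: list of (name, style, x, y, label), style e.g. "latent" or "obs"
    edges: list of (parent, child) names
    plate: optional (list of member node names, caption)
    """
    lines = [r"\begin{tikzpicture}"]
    for name, style, x, y, label in nodes:
        lines.append(rf"  \node[{style}] ({name}) at ({x}, {y}) {{{label}}};")
    for parent, child in edges:
        lines.append(rf"  \edge {{{parent}}} {{{child}}};")
    if plate is not None:
        members, caption = plate
        fit = " ".join(f"({m})" for m in members)
        lines.append(rf"  \plate {{plate}} {{{fit}}} {{{caption}}};")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)


# Hypothetical model: a latent mean with N observed draws in a plate.
print(to_tikz_bayesnet(
    nodes=[("mu", "latent", 0, 1.5, r"$\mu$"), ("x", "obs", 0, 0, r"$x_n$")],
    edges=[("mu", "x")],
    plate=(["x"], "$N$"),
))
```

Since the labels are ordinary LaTeX, the diagram uses the same notation and fonts as the equations around it.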

References

Breitling, Lutz Philipp. 2010. “dagR: A Suite of R Functions for Directed Acyclic Graphs.” Epidemiology 21 (4): 586. https://doi.org/10.1097/EDE.0b013e3181e09112.

Creed, Jordan, and Travis Gerke. 2018. Gerkelab/Shinydag: Initial Release. Zenodo. https://doi.org/10.5281/ZENODO.1288712.

Greenland, Sander, Judea Pearl, and James M Robins. 1999. “Causal Diagrams for Epidemiologic Research.” Epidemiology 10 (1): 37. https://journals.lww.com/epidem/Abstract/1999/01000/Causal_Diagrams_for_Epidemiologic_Research.8.aspx.

Textor, Johannes, Juliane Hardt, and Sven Knüppel. 2011. “DAGitty: A Graphical Tool for Analyzing Causal Diagrams.” Epidemiology 22 (5): 745. https://doi.org/10.1097/EDE.0b013e318225c2be.

Textor, Johannes, Benito van der Zander, Mark S. Gilthorpe, Maciej Liśkiewicz, and George T. H. Ellison. 2017. “Robust Causal Inference Using Directed Acyclic Graphs: The R Package ‘Dagitty’.” International Journal of Epidemiology, January, dyw341. https://doi.org/10.1093/ije/dyw341.