On the art and science of algorithmic line drawings for representing graphical models, which is a very important part of statistics. The diagrams we need here are nearly flowchart-like, so I can sketch them with a flowchart tool if need be; but they are closely integrated with the equations of a particular statistical model, so I would like to incorporate them into the same system to avoid tedious and error-prone manual syncing. Further, there are a couple of things I would like handled which flowchart programs are bad at, such as plate notation and inline mathematical markup.
As always, I would like to export the resulting diagrams to a modern, compatible vector format, which means SVG or PDF or, as a fallback, one of the other formats that can be converted to those, such as Adobe Illustrator, EPS or xfig.
daggity to the
The ggdag bias structure vignette shows off the useful explanation diagrams available in ggdag, and is also a good introduction to selection bias and causal DAGs themselves.
Shinydag provides a web interface.
Shinydag is tedious to install natively, especially given the poor documentation. However, it runs OK in Docker.
docker run -d -p 3838:3838 --name shinydag gerkelab/shinydag:latest
Now Shinydag is waiting for you at
It is a bit crashy and clunky. I’m not sure I prefer it to plain ggdag.
After initializing a new DAG using a command line, the researcher can evaluate what associations are introduced by adjusting for covariables. Potentially biasing paths from exposure to outcome can be identified (see eFig. 1, demonstrating harmful adjustment using an example DAG from Fleischer and Diez Roux). Functions to conveniently add or remove nodes and arcs are included, as is a function checking introduced associations and biasing paths for all possible adjustment sets […] The graphics capabilities of R allow fairly straightforward programming of basic DAG drawing routines, while also supporting the interactive repositioning of nodes and arcs.
It additionally calculates adjustment sets and other useful functions of the graph.
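To make the idea of biasing paths concrete, here is a minimal plain-Python sketch. It is not the algorithm of any of the packages mentioned here; it is just the textbook d-separation rule applied path by path. The function names and the example graph (an M-bias structure with hypothetical nodes `U1`, `U2`, `M`) are my own assumptions for illustration.

```python
def descendants(edges, node):
    """All nodes reachable from `node` by following arrows."""
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def backdoor_paths(edges, exposure, outcome):
    """Simple paths in the skeleton that start with an arrow into `exposure`."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    paths = []

    def walk(path):
        if path[-1] == outcome:
            paths.append(path)
            return
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in path:
                walk(path + [nxt])

    for first in sorted(neighbours.get(exposure, ())):
        if (first, exposure) in edges:  # the path must begin "exposure <- first"
            walk([exposure, first])
    return paths


def is_open(path, edges, adjusted):
    """True if `path` transmits association given the adjustment set."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in edges and (nxt, mid) in edges
        if collider:
            # A collider blocks unless it, or one of its descendants, is adjusted for.
            if not ({mid} | descendants(edges, mid)) & adjusted:
                return False
        elif mid in adjusted:
            # Adjusting for a non-collider blocks the path.
            return False
    return True


# M-bias: adjusting for M opens a path that was closed to begin with.
edges = {("U1", "X"), ("U1", "M"), ("U2", "M"), ("U2", "Y"), ("X", "Y")}
paths = backdoor_paths(edges, "X", "Y")
for path in paths:
    print(path, "open unadjusted:", is_open(path, edges, set()),
          "| open given M:", is_open(path, edges, {"M"}))
```

Enumerating simple paths is exponential in the worst case, so this is only sensible for the small hand-drawn DAGs discussed here.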
yEd is a low-key nerdview diagrammer.
yEd supports a wide variety of diagram types. In addition to the illustrated types, (BPMN Diagrams, Flowcharts, Family Trees, Semantic Networks, Social Networks, UML Class Diagrams) yEd also supports organization charts, mind maps, swimlane diagrams, Entity Relationship diagrams, and many more.
Found via a blog post by Jonas Kristoffer Lindeløv, which talks through the pluses and minuses:
yEd is purely graphical editing which is fast to work with and great for tweaking small details. A very handy yEd feature is its intelligent snapping to alignments and equal distances when positioning objects. Actually, I don’t understand why yEd almost never makes it to the “top 10 diagram/flowchart programs” lists.
A few things I learned: To make subscripts, you have to use HTML code.… it is not possible to do double-subscripts. Also, the double-edges node is made completely manually by placing an empty ellipse above another. I did not manage to align the 𝑊𝑀𝐶𝑖 label a bit lower in the node. A final limitation is that arrowhead sizes cannot be changed. You can, however, zoom. Therefore, your very first decision has to be the arrowhead size. Zoom so that it is appropriate and make your graphical model.
DiagrammeR is a generic graph visualisation package for R which can incidentally do graphical models. It is probably the most visually attractive option here, but it is blind to the specific needs of graphical models.
Another browser option. mermaid is a flowcharting tool which can be pressed into service for DAGs. USP: code-driven diagrams with a syntax that aspires to be so basic that it is easier than point and click. Integrates with many markdown editors. Has an online editor, and a CLI.
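Since the mermaid syntax is so small, keeping the diagram in sync with a model can be as simple as generating the text from an edge list. A minimal sketch, assuming only the basic `graph LR` header and `A --> B` arrow syntax; the helper name `to_mermaid` and the example edges are hypothetical.

```python
def to_mermaid(edges, direction="LR"):
    """Return mermaid flowchart source for a list of (parent, child) edges."""
    lines = [f"graph {direction}"]
    for parent, child in edges:
        lines.append(f"    {parent} --> {child}")
    return "\n".join(lines)


# A confounder Z of the effect of X on Y.
edges = [("Z", "X"), ("Z", "Y"), ("X", "Y")]
mermaid_source = to_mermaid(edges)
print(mermaid_source)
```

The output can be pasted into the online editor or into a mermaid code block in a markdown editor that supports it. Node names here must be plain identifiers; anything fancier would need mermaid's label syntax, which this sketch does not attempt.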
TETRAD is less for sketching idealised DAGs than for visualising and calculating giant empirical DAGs. It’s written by eminent causal inference people.
Tetrad is a program which creates, simulates data from, estimates, tests, predicts with, and searches for causal and statistical models. The aim of the program is to provide sophisticated methods in a friendly interface requiring very little statistical sophistication of the user and no programming knowledge. It is not intended to replace flexible statistical programming systems such as Matlab, Splus or R. Tetrad is freeware that performs many of the functions in commercial programs such as Netica, Hugin, LISREL, EQS and other programs, and many discovery functions these commercial programs do not perform.
Tetrad is limited to models of categorical data (which can also be used for ordinal data) and to linear [sic] models (“structural equation models”) with a Normal probability distribution, and to a very limited class of time series models. The Tetrad programs describe causal models in three distinct parts or stages: a picture, representing a directed graph specifying hypothetical causal relations among the variables; a specification of the family of probability distributions and kinds of parameters associated with the graphical model; and a specification of the numerical values of those parameters.
NB: structural equation models are not required to be linear, so that is weird phrasing.
Daft is a Python package that uses matplotlib to render pixel-perfect probabilistic graphical models for publication in a journal or on the internet. With a short Python script and an intuitive model-building syntax you can design directed (Bayesian Networks, directed acyclic graphs) and undirected (Markov random fields) models and save them in any formats that matplotlib supports (including PDF, PNG, EPS and SVG).
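As a rough illustration of that model-building syntax, here is a small sketch of a model with one plate. It assumes a reasonably recent Daft release in which `add_node`, `add_edge` and `add_plate` accept plain arguments; the node names, coordinates and the `build_pgm` helper are made up for the example. The model is kept as plain data so that it could, in principle, be generated from the same source as the equations.

```python
# Model description as plain data: (name, label, x, y, keyword arguments).
NODES = [
    ("mu", r"$\mu$", 1.0, 2.0, {}),
    ("x", r"$x_n$", 1.0, 1.0, {"observed": True}),
]
EDGES = [("mu", "x")]
# Plate rectangle is [x, y, width, height] in the same units as node positions.
PLATE = ([0.5, 0.5, 1.0, 1.0], r"$n = 1, \dots, N$")


def build_pgm():
    """Build a daft.PGM from the data above (needs daft and matplotlib installed)."""
    import daft  # imported here so the model data can be used without daft

    pgm = daft.PGM()
    for name, label, x, y, kwargs in NODES:
        pgm.add_node(name, label, x, y, **kwargs)
    for parent, child in EDGES:
        pgm.add_edge(parent, child)
    rect, label = PLATE
    pgm.add_plate(rect, label=label)
    return pgm
```

To draw it, call `pgm = build_pgm()`, then `pgm.render()` and `pgm.savefig("model.pdf")`; the file extension picks the output format.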
graphviz is nearly good, in that it supports graphing arbitrary networks, including DAGs. Inevitably, none of its fancy algorithms ever lays the graph out quite like I want. There is a macOS GUI and a cross-platform (WX) GUI called doteditor.
- there is a python graphviz wrapper
- Aaaaaand it renders in jupyter. (see also other jupyter options)
- and traditional style using Rgraphviz
- You probably want mathematical labels typeset properly; there is a TeX backend called dot2tex.
However, I would recommend dagitty over this; the syntax is nearly the same, but dagitty additionally accepts specification by a structural equation model.
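For the graphviz route, the DOT language is simple enough to write out directly, and a `cluster` subgraph gives a passable imitation of plate notation. A minimal sketch using only the standard library; the `to_dot` helper and the example model are hypothetical, and the python graphviz wrapper would be an alternative way to build the same source.

```python
def to_dot(edges, plates=None, name="G"):
    """Return DOT source for a DAG.

    `plates` maps a plate label to the list of nodes it encloses; each plate
    becomes a subgraph whose name starts with "cluster", which the dot layout
    draws as a box.
    """
    lines = [f"digraph {name} {{"]
    for i, (label, nodes) in enumerate((plates or {}).items()):
        lines.append(f"  subgraph cluster_{i} {{")
        lines.append(f'    label="{label}";')
        lines.extend(f'    "{node}";' for node in nodes)
        lines.append("  }")
    lines.extend(f'  "{parent}" -> "{child}";' for parent, child in edges)
    lines.append("}")
    return "\n".join(lines)


# theta and sigma are shared parameters; x is repeated N times.
dot_source = to_dot([("theta", "x"), ("sigma", "x")], plates={"N": ["x"]})
print(dot_source)
```

Saving the output as `model.dot` and running `dot -Tsvg model.dot -o model.svg` should produce an SVG.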
Laura Dietz and Jaakko Luttinen made tikz macros for drawing Bayes nets in LaTeX, tikz-bayesnet.
Breitling, Lutz Philipp. 2010. “dagR: A Suite of R Functions for Directed Acyclic Graphs.” Epidemiology 21 (4): 586. https://doi.org/10.1097/EDE.0b013e3181e09112.
Creed, Jordan, and Travis Gerke. 2018. Gerkelab/Shinydag: Initial Release. Zenodo. https://doi.org/10.5281/ZENODO.1288712.
Greenland, Sander, Judea Pearl, and James M Robins. 1999. “Causal Diagrams for Epidemiologic Research.” Epidemiology 10 (1): 37. https://journals.lww.com/epidem/Abstract/1999/01000/Causal_Diagrams_for_Epidemiologic_Research.8.aspx.
Textor, Johannes, Juliane Hardt, and Sven Knüppel. 2011. “DAGitty: A Graphical Tool for Analyzing Causal Diagrams.” Epidemiology 22 (5): 745. https://doi.org/10.1097/EDE.0b013e318225c2be.
Textor, Johannes, Benito van der Zander, Mark S. Gilthorpe, Maciej Liśkiewicz, and George T. H. Ellison. 2017. “Robust Causal Inference Using Directed Acyclic Graphs: The R Package ‘Dagitty’.” International Journal of Epidemiology, January, dyw341. https://doi.org/10.1093/ije/dyw341.