Learning graphical models from time series

Also, causal discovery, structure discovery in time series

2017-09-19 — 2023-11-28

Wherein causal links among multivariate time series are sought by statistical discovery, attention being given to linear Granger tests, PCMCI variants, and practical implementation via the Tigramite software.

algebra

graphical models

machine learning

networks

probability

statistics

Suppose I measure a time series of vectors, and I think that some of the components of the vector cause other components of the vector, but I have no idea which causes which. Then I might want to do graphical model discovery from time series and find out which components cause which other components.

Does this setup feel contrived to you? It does to me; I have never found a use for it. The setup looks provocative (“learn the world from observations!”) but in practice, AFAICT, we do not usually wish to do it this way. We never have a phenomenon about which we can collect a huge amount of data and yet have no ideas regarding its causal structure. If a system is so complicated that its causal structure is muddy (educational attainment in children, weather systems), the problem is not so much that we have no idea what causes what. Rather, we cannot measure the causal factors of interest, because of questions of scale or difficulty, and we hope that somehow some aggregate quantities of interest are measurable instead. I guess that is why we might think this way, but it seems to lead to bizarre and contrived answers to ill-posed questions about convenience samples (“Does education cause wealth or does wealth cause education? Does moisture cause rain or does rain cause moisture?”).

It seems popular though, and some smart people have spent time on it, so maybe there is something interesting there.

If we are really convinced that there are some latent parameters of interest but that they are not measurable, we might want to think about causality among the observables themselves, which sounds very much like a Koopman operator setup. Probably someone has made that work, but I have not seen it yet.

1 Granger causality

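A.k.a. linear-Gaussian “causality”: one series Granger-causes another if its past improves a linear prediction of the other beyond what the other's own past already achieves. It requires mentioning but is usually not what is required. A generalised form that is probably also not what is required: transfer entropy, which coincides with Granger causality (up to a constant factor) when the variables are jointly Gaussian.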
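To make the linear Granger test concrete, here is a minimal sketch using only NumPy. It compares a restricted autoregression (the effect's own lags) against a full one (adding the candidate cause's lags) through the usual F statistic. The function names, the simulated system, and its coefficients are all illustrative assumptions, not taken from any particular library.

```python
import numpy as np


def lagged_design(series, max_lag):
    """Columns series[t-1], ..., series[t-max_lag] for t = max_lag, ..., T-1."""
    T = len(series)
    return np.column_stack(
        [series[max_lag - k : T - k] for k in range(1, max_lag + 1)]
    )


def granger_f_stat(cause, effect, max_lag=2):
    """F statistic for 'lags of `cause` add nothing to a linear AR model of `effect`'."""
    target = effect[max_lag:]
    intercept = np.ones((len(target), 1))
    restricted = np.hstack([intercept, lagged_design(effect, max_lag)])
    full = np.hstack([restricted, lagged_design(cause, max_lag)])

    def rss(design):
        # Residual sum of squares of the least-squares fit.
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_restricted, rss_full = rss(restricted), rss(full)
    dof = len(target) - full.shape[1]
    return ((rss_restricted - rss_full) / max_lag) / (rss_full / dof)


# Hypothetical toy system: x drives y at lag 1, and nothing drives x.
rng = np.random.default_rng(0)
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.standard_normal()

f_xy = granger_f_stat(x, y)  # should be large: x helps predict y
f_yx = granger_f_stat(y, x)  # should be near 1: y does not help predict x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

A large statistic says only that the lagged series has linear predictive value; an unobserved common driver would produce the same signature, which is part of why this is usually not what is required.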

2 PCMCI

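I have not used this yet, but it looks popular. As I understand it, PCMCI runs in two stages: a PC-style condition-selection step that prunes each variable's candidate lagged parents, then a momentary conditional independence (MCI) test on the links that survive.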
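PCMCI-style methods are built on conditional independence tests, and in the linear case that test is typically a partial correlation. The sketch below is not the PCMCI algorithm, only that building block: a hypothetical hidden-in-plain-sight driver `z` feeds `x` at lag 1 and `y` at lag 2, so `x` at lag 1 looks associated with `y` until we condition on `z` at lag 2. All names and coefficients are assumptions chosen for illustration.

```python
import numpy as np


def partial_corr(a, b, conditions):
    """Correlation of a and b after linearly regressing out `conditions` from both."""
    design = np.column_stack([np.ones(len(a)), conditions])
    resid_a = a - design @ np.linalg.lstsq(design, a, rcond=None)[0]
    resid_b = b - design @ np.linalg.lstsq(design, b, rcond=None)[0]
    return float(np.corrcoef(resid_a, resid_b)[0, 1])


# Hypothetical system: z drives x at lag 1 and y at lag 2; x never touches y.
rng = np.random.default_rng(1)
T = 2000
z = rng.standard_normal(T)
x = np.zeros(T)
y = np.zeros(T)
x[1:] = 0.8 * z[:-1] + rng.standard_normal(T - 1)
y[2:] = 0.8 * z[:-2] + rng.standard_normal(T - 2)

# Align everything on t = 2, ..., T-1.
y_now = y[2:]     # y_t
x_lag1 = x[1:-1]  # x_{t-1}
z_lag2 = z[:-2]   # z_{t-2}

marginal = float(np.corrcoef(x_lag1, y_now)[0, 1])
conditional = partial_corr(x_lag1, y_now, z_lag2)
print(f"corr(x[t-1], y[t]) = {marginal:.2f}")
print(f"partial corr given z[t-2] = {conditional:.2f}")
```

The marginal correlation is clearly nonzero while the partial correlation is close to zero, so a test that conditions on the right lagged variables would drop the spurious link. Choosing those conditioning sets well is the hard part that the condition-selection stage addresses.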

3 Tooling

3.1 Tigramite

Tigramite / source

From the project's own description: “Tigramite is a causal time series analysis python package. It allows to efficiently estimate causal graphs from high-dimensional time series datasets (causal discovery) and to use these graphs for robust forecasting and the estimation and prediction of direct, total, and mediated effects. Causal discovery is based on linear as well as non-parametric conditional independence tests applicable to discrete or continuously-valued time series.”
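A minimal usage sketch follows, assuming a recent Tigramite release (5.x), where the linear partial-correlation test lives at `tigramite.independence_tests.parcorr.ParCorr`; older releases lay out the modules differently, so check the installed version. The helper names `simulate_pair` and `pcmci_p_values`, the toy system, and the parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_pair(T=500, seed=0):
    """Hypothetical two-variable system in which column 0 drives column 1 at lag 1."""
    rng = np.random.default_rng(seed)
    data = np.zeros((T, 2))
    for t in range(1, T):
        data[t, 0] = 0.5 * data[t - 1, 0] + rng.standard_normal()
        data[t, 1] = (
            0.5 * data[t - 1, 1] + 0.8 * data[t - 1, 0] + rng.standard_normal()
        )
    return data


def pcmci_p_values(data, var_names, tau_max=2, pc_alpha=0.05):
    """Run PCMCI with a linear partial-correlation test and return its p-value array."""
    # Imported here so the simulation above works without Tigramite installed.
    # Module paths assume Tigramite 5.x.
    from tigramite import data_processing as pp
    from tigramite.independence_tests.parcorr import ParCorr
    from tigramite.pcmci import PCMCI

    dataframe = pp.DataFrame(data, var_names=var_names)
    pcmci = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr(), verbosity=0)
    results = pcmci.run_pcmci(tau_max=tau_max, pc_alpha=pc_alpha)
    # Entry [i, j, tau] is the p-value for the link from variable i at lag tau
    # to variable j at the current time step.
    return results["p_matrix"]
```

With Tigramite installed, `pcmci_p_values(simulate_pair(), ["X", "Y"])` returns an array of shape `(2, 2, tau_max + 1)`; in this toy system a small value at index `[0, 1, 1]` would correspond to the simulated link from X at lag 1 to Y.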
