Hierarchical models

DAGs, multilevel models, random coefficient models, mixed effect models, structural equation models…

June 7, 2015 — April 21, 2022

Figure 1

The classical regression setup: my process of interest generates observations conditional on certain predictors. The observations (but not the predictors) are corrupted by noise.

Hierarchical setup: there is a directed graph of interacting random processes generating the data, and I would like to reconstruct the parameters, possibly even the conditional distributions of the parameters, accounting for interactions, when only some of these processes are observed.
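To make the contrast concrete, here is a minimal sketch of the simplest hierarchical case: a two-level Gaussian model in which group-level means are hidden and only noisy observations are seen. Everything named here (`simulate_groups`, `partial_pooling`, the parameters `mu`, `tau`, `sigma`) is a hypothetical choice for illustration, and the variances are assumed known. The population mean is replaced by the grand mean, which is an empirical-Bayes style shortcut rather than full posterior inference.

```python
import random
import statistics


def simulate_groups(n_groups=8, n_per_group=5, mu=0.0, tau=1.0, sigma=2.0, seed=0):
    """Draw from a two-level model: theta_j ~ N(mu, tau^2), y_ij ~ N(theta_j, sigma^2).

    theta is the hidden layer; only y would be available in practice.
    """
    rng = random.Random(seed)
    theta = [rng.gauss(mu, tau) for _ in range(n_groups)]
    y = [[rng.gauss(t, sigma) for _ in range(n_per_group)] for t in theta]
    return theta, y


def partial_pooling(y, tau, sigma):
    """Shrink each group mean towards the grand mean.

    Assumes tau and sigma are known (an illustrative simplification).
    The weight w is the usual precision-based weight for a Gaussian
    prior combined with a Gaussian likelihood.
    """
    group_means = [statistics.fmean(g) for g in y]
    grand_mean = statistics.fmean([v for g in y for v in g])
    shrunk = []
    for g, ybar in zip(y, group_means):
        w = tau**2 / (tau**2 + sigma**2 / len(g))
        shrunk.append(w * ybar + (1.0 - w) * grand_mean)
    return group_means, grand_mean, shrunk


theta, y = simulate_groups()
group_means, grand_mean, shrunk = partial_pooling(y, tau=1.0, sigma=2.0)
for j, (raw, est) in enumerate(zip(group_means, shrunk)):
    print(f"group {j}: unpooled {raw:+.2f}, partially pooled {est:+.2f}")
```

Each partially pooled estimate lies between the group's own mean and the grand mean. This is the sense in which the hierarchy lets groups borrow strength from one another. With unknown variances or deeper graphs, this closed form is no longer available, and the toolkits listed under Implementations take over.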

These are studied as mixed effects models, hierarchical models, nested models (careful: that term has many definitions), random coefficient models, errors-in-variables models, and structural equation models.

Directed graphical models provide the formalism. When we mention graphical models, frequently the emphasis is on the independence graph itself. In a hierarchical model context we more frequently wish to estimate parameters, or sample from posteriors or what-have-you.

If you have many layers of hidden variables and don’t expect any of them to correspond to a “real” state so much as simply to approximate the unknown function better, you have just discovered a deep neural network, possibly even a probabilistic neural network. Ranzato (2013), for example, explicitly discusses them in this way.

Thomas Wiecki wrote:

Some of Andrew Gelman’s blog posts on hierarchical models provide helpful context (1, 2, 3).

1 Interesting special cases

In certain cute cases (i.e. linear, homoskedastic) these problems become deconvolution. (🏗 explain what I mean here and why I bothered to say it.) See ANOVA for an important special case. More generally, we sometimes find it convenient to use hierarchical generalised linear models, which have all manner of nice properties for inference.
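One plausible reading of the deconvolution remark, offered as a sketch and not necessarily the explanation the note intends: suppose a latent variable $x$ with density $p_x$ is observed only through $y = x + \varepsilon$, where the noise $\varepsilon$ is independent of $x$ and has the same density $p_\varepsilon$ for every observation (the homoskedastic assumption). The symbols $p_x$, $p_\varepsilon$ and $\varphi$ are notation introduced here for illustration.

```latex
\begin{aligned}
y &= x + \varepsilon, \\
p_y(y) &= \int p_x(u)\, p_\varepsilon(y - u)\, \mathrm{d}u
        = (p_x * p_\varepsilon)(y), \\
\varphi_y(t) &= \varphi_x(t)\, \varphi_\varepsilon(t).
\end{aligned}
```

Here $\varphi$ denotes a characteristic function. The observed density is the convolution of the latent density with the noise density, so recovering $p_x$ from samples of $y$ amounts to undoing that convolution.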

2 Cluster randomized trials

Melanie Bell, Cluster Randomized Trials

Cluster randomized trials (CRTs) are studies where groups of people, rather than individuals, are randomly allocated to intervention or control. While these type of designs can be appropriate and useful for many research settings, care must be taken to correctly design and analyze them. This talk will give an overview of cluster trials, and various methodological research projects on cluster trials that I’ve been undertaken: designing CRTs, the use of GEE with small number of clusters, handling missing data in CRTs, and analysis using mixed models.
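One reason the design needs care is that outcomes within a cluster are correlated, so a cluster trial carries less information than an individually randomized trial of the same size. A minimal sketch of the standard design-effect calculation follows, assuming equal cluster sizes and a known intracluster correlation coefficient. The function names are hypothetical and are not taken from the talk.

```python
def design_effect(cluster_size, icc):
    """Variance inflation factor 1 + (m - 1) * rho for equal-sized clusters.

    cluster_size: number of individuals per cluster (m)
    icc: intracluster correlation coefficient (rho), assumed in [0, 1]
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent individuals carrying the same information."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


# Illustrative inputs only: 10 clusters of 20 people, ICC of 0.05
print(design_effect(20, 0.05))
print(effective_sample_size(10, 20, 0.05))
```

At the extremes, an ICC of 0 leaves the sample size unchanged, while an ICC of 1 reduces the effective sample size to the number of clusters.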

3 Teaching

See this nice animated demonstration.

4 Implementations

4.1 Lavaan

The lavaan Project:

The lavaan package is developed to provide useRs, researchers and teachers a free open-source, but commercial-quality package for latent variable modeling. You can use lavaan to estimate a large variety of multivariate statistical models, including path analysis, confirmatory factor analysis, structural equation modeling and growth curve models.

The official reference to the lavaan package is Rosseel (2012)

5 Semopy

semopy: Structural Equation Modeling in Python

6 Any modern Bayes toolkit

Essentially all of modern Bayesian statistics implements these methods; for more on that, see probabilistic programming. As a default, though, start with Stan.

7 References

Blackwell, Honaker, and King. 2015. “A Unified Approach to Measurement Error and Missing Data: Details and Extensions.” Sociological Methods & Research.
Bolker, Brooks, Clark, et al. 2009. “Generalized Linear Mixed Models: A Practical Guide for Ecology and Evolution.” Trends in Ecology & Evolution.
Breslow, and Clayton. 1993. “Approximate Inference in Generalized Linear Mixed Models.” Journal of the American Statistical Association.
Bürkner. 2018. “Advanced Bayesian Multilevel Modeling with the R Package Brms.” The R Journal.
Chan, Lu, and Yau. 2016. “Factor Modelling for High-Dimensional Time Series: Inference and Model Selection.” Journal of Time Series Analysis.
DiTraglia, Garcia-Jimeno, O’Keeffe-O’Donovan, et al. 2020. “Identifying Causal Effects in Experiments with Social Interactions and Non-Compliance.” arXiv:2011.07051 [Econ, Stat].
Efron. 2009. “Empirical Bayes Estimates for Large-Scale Prediction Problems.” Journal of the American Statistical Association.
Gelman. 2006. “Multilevel (Hierarchical) Modeling: What It Can and Cannot Do.” Technometrics.
Gelman, Hill, and Vehtari. 2021. Regression and Other Stories.
Gelman, Lee, and Guo. 2015. “Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization.” Journal of Educational and Behavioral Statistics.
Hansen. 2007. “Generalized Least Squares Inference in Panel and Multilevel Models with Serial Correlation and Fixed Effects.” Journal of Econometrics.
Koren, Bell, and Volinsky. 2009. “Matrix Factorization Techniques for Recommender Systems.” Computer.
Lee, and Nelder. 2001. “Hierarchical Generalised Linear Models: A Synthesis of Generalised Linear Models, Random-Effect Models and Structured Dispersions.” Biometrika.
———. 2006. “Double Hierarchical Generalized Linear Models (with Discussion).” Journal of the Royal Statistical Society: Series C (Applied Statistics).
Li, and Mykland. 2007. “Are Volatility Estimators Robust with Respect to Modeling Assumptions?” Bernoulli.
Mallet. 1986. “A Maximum Likelihood Estimation Method for Random Coefficient Regression Models.” Biometrika.
McElreath, and Boyd. 2007. Mathematical Models of Social Evolution: A Guide for the Perplexed.
Miller. 2013. The Chicago Guide to Writing about Multivariate Analysis. Chicago Guides to Writing, Editing, and Publishing.
Ranzato. 2013. “Modeling Natural Images Using Gated MRFs.” IEEE Transactions on Pattern Analysis and Machine Intelligence.
Reiersol. 1950. “Identifiability of a Linear Relation Between Variables Which Are Subject to Error.” Econometrica.
Rosseel. 2012. “Lavaan: An R Package for Structural Equation Modeling.” Journal of Statistical Software.
Saefken, Kneib, Waveren, et al. 2014. “A Unifying Approach to the Estimation of the Conditional Akaike Information in Generalized Linear Mixed Models.” Electronic Journal of Statistics.
Valpine. 2011. “Frequentist Analysis of Hierarchical Models for Population Dynamics and Demographic Data.” Journal of Ornithology.
Venables, and Dichmont. 2004. “GLMs, GAMs and GLMMs: An Overview of Theory for Applications in Fisheries Research.” Fisheries Research, Models in Fisheries Research: GLMs, GAMS and GLMMs.