The Matrix-Gaussian distribution
August 19, 2022 — February 16, 2023
Gupta and Nagar (1999):
The random matrix \(X(p \times n)\) is said to have a matrix variate normal distribution with mean matrix \(M(p \times n)\) and covariance matrix \(\Sigma \otimes \Psi\) where \(\Sigma(p \times p)>0\) and \(\Psi(n \times n)>0\), if \(\operatorname{vec}\left(X^{\prime}\right) \sim N_{p n}\left(\operatorname{vec}\left(M^{\prime}\right), \Sigma \otimes \Psi\right)\).
We shall use the notation \(X \sim N_{p, n}(M, \Sigma \otimes \Psi)\).
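One way to make the definition concrete is the standard affine construction \(X = M + A Z B^{\prime}\), where \(\Sigma = A A^{\prime}\), \(\Psi = B B^{\prime}\) and \(Z\) has i.i.d. standard normal entries. This is a minimal sketch rather than anything taken from Gupta and Nagar: the function name `matrix_normal_from_noise` and the example matrices are made up for illustration, and the Cholesky factorisation is just one convenient choice of square root.

```python
import numpy as np


def matrix_normal_from_noise(M, Sigma, Psi, Z):
    """Map i.i.d. N(0, 1) noise Z (p x n) to X = M + A Z B'.

    A and B are the Cholesky factors of Sigma and Psi, so that
    Sigma = A A' and Psi = B B'. (Hypothetical helper, for illustration.)
    """
    A = np.linalg.cholesky(Sigma)
    B = np.linalg.cholesky(Psi)
    return M + A @ Z @ B.T, A, B


rng = np.random.default_rng(0)
p, n = 2, 3
M = np.zeros((p, n))
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])            # p x p, positive definite
Psi = np.array([[1.0, 0.3, 0.0],
                [0.3, 1.5, 0.2],
                [0.0, 0.2, 0.8]])         # n x n, positive definite

Z = rng.standard_normal((p, n))
X, A, B = matrix_normal_from_noise(M, Sigma, Psi, Z)

# vec(X') stacks the rows of X, which is X.ravel() in NumPy's row-major order.
# Then vec((X - M)') = (A kron B) vec(Z'), so its covariance is
# (A kron B)(A kron B)' = Sigma kron Psi, matching the definition.
K = np.kron(A, B)
vec_rows_centered = (X - M).ravel()
vec_rows_from_noise = K @ Z.ravel()
implied_cov = K @ K.T
```

Here `vec_rows_centered` and `vec_rows_from_noise` agree up to floating point error, and `implied_cov` equals `np.kron(Sigma, Psi)`, which is exactly the covariance the definition asks for.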
They prove the following theorem:
If \(X \sim N_{p, n}(M, \Sigma \otimes \Psi)\), then the p.d.f. of \(X\) is given by \[ \begin{array}{r} (2 \pi)^{-\frac{1}{2} n p} \operatorname{det}(\Sigma)^{-\frac{1}{2} n} \operatorname{det}(\Psi)^{-\frac{1}{2} p} \operatorname{etr}\left\{-\frac{1}{2} \Sigma^{-1}(X-M) \Psi^{-1}(X-M)^{\prime}\right\} \\ X \in \mathbb{R}^{p \times n}, M \in \mathbb{R}^{p \times n} \end{array} \]
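The theorem can be checked numerically by evaluating the log of this density directly and comparing it with the multivariate normal log-density of \(\operatorname{vec}(X^{\prime})\) under covariance \(\Sigma \otimes \Psi\). The sketch below assumes NumPy and SciPy are available; `matrix_normal_logpdf` is a hypothetical helper name and the test matrices are arbitrary.

```python
import numpy as np
from scipy.stats import multivariate_normal


def matrix_normal_logpdf(X, M, Sigma, Psi):
    """Log of the density in the theorem, for X, M of shape (p, n)."""
    p, n = X.shape
    R = X - M
    # tr(Sigma^{-1} R Psi^{-1} R'), using linear solves instead of inverses
    quad = np.trace(np.linalg.solve(Sigma, R) @ np.linalg.solve(Psi, R.T))
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    _, logdet_Psi = np.linalg.slogdet(Psi)
    return -0.5 * (n * p * np.log(2 * np.pi)
                   + n * logdet_Sigma
                   + p * logdet_Psi
                   + quad)


rng = np.random.default_rng(0)
p, n = 2, 3
M = rng.standard_normal((p, n))
X = rng.standard_normal((p, n))
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
Psi = np.array([[1.0, 0.3, 0.0],
                [0.3, 1.5, 0.2],
                [0.0, 0.2, 0.8]])

lp_direct = matrix_normal_logpdf(X, M, Sigma, Psi)

# vec(X') stacks the rows of X, i.e. X.ravel() in row-major order.
lp_vec = multivariate_normal.logpdf(
    X.ravel(), mean=M.ravel(), cov=np.kron(Sigma, Psi)
)
```

The two values `lp_direct` and `lp_vec` should agree to floating point precision. The direct form only factorises a \(p \times p\) and an \(n \times n\) matrix, whereas the vectorised form works with the full \(pn \times pn\) covariance.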
Is this the same matrix normal as the one in scipy.stats.matrix_normal? Its documentation says:
The probability density function for matrix_normal is \[ f(X)=(2 \pi)^{-\frac{m n}{2}}|U|^{-\frac{n}{2}}|V|^{-\frac{m}{2}} \exp \left(-\frac{1}{2} \operatorname{Tr}\left[U^{-1}(X-M) V^{-1}(X-M)^T\right]\right) \] where \(M\) is the mean, \(U\) the among-row covariance matrix, \(V\) the among-column covariance matrix.
The allow_singular behaviour of the multivariate_normal distribution is not currently supported. Covariance matrices must be full rank.
The matrix_normal distribution is closely related to the multivariate_normal distribution. Specifically, \(\operatorname{Vec}(X)\) (the vector formed by concatenating the columns of \(X\)) has a multivariate normal distribution with mean \(\operatorname{Vec}(M)\) and covariance \(V \otimes U\) (where \(\otimes\) is the Kronecker product). Sampling and pdf evaluation are \(\mathcal{O}\left(m^3+n^3+m^2 n+m n^2\right)\) for the matrix normal, but \(\mathcal{O}\left(m^3 n^3\right)\) for the equivalent multivariate normal, making this equivalent form algorithmically inefficient.
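The correspondence can also be checked numerically. The sketch below compares `scipy.stats.matrix_normal.logpdf` with the two vectorised forms: scipy's column-stacked \(\operatorname{Vec}(X)\) with covariance \(V \otimes U\), and Gupta and Nagar's row-stacked \(\operatorname{vec}(X^{\prime})\) with covariance \(U \otimes V\), taking \(U\) in the role of \(\Sigma\) and \(V\) in the role of \(\Psi\). The example matrices are arbitrary.

```python
import numpy as np
from scipy.stats import matrix_normal, multivariate_normal

rng = np.random.default_rng(1)
m, n = 2, 3
M = rng.standard_normal((m, n))
X = rng.standard_normal((m, n))
U = np.array([[2.0, 0.5],
              [0.5, 1.0]])               # among-row covariance (m x m)
V = np.array([[1.0, 0.3, 0.0],
              [0.3, 1.5, 0.2],
              [0.0, 0.2, 0.8]])          # among-column covariance (n x n)

lp_matrix = matrix_normal.logpdf(X, mean=M, rowcov=U, colcov=V)

# scipy's convention: Vec(X) stacks columns (Fortran order), covariance V kron U.
lp_vec_cols = multivariate_normal.logpdf(
    X.ravel(order="F"), mean=M.ravel(order="F"), cov=np.kron(V, U)
)

# Gupta and Nagar's convention: vec(X') stacks rows (C order), covariance U kron V.
lp_vec_rows = multivariate_normal.logpdf(
    X.ravel(), mean=M.ravel(), cov=np.kron(U, V)
)
```

All three log-densities should agree to floating point precision, so the differing Kronecker orders reflect only the vectorisation convention.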
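Looks right: taking \(U=\Sigma\), \(V=\Psi\) and \(m=p\) makes the two densities identical. The Kronecker factors appear in opposite orders only because Gupta and Nagar vectorise \(X^{\prime}\) (stacking the rows of \(X\)) while scipy vectorises \(X\) (stacking its columns). For an actual introduction, the section in The Book of Statistical Proofs proves some useful theorems in a consistent notation.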