Matrix- and vector-valued generalizations of Gamma processes



\[\renewcommand{\var}{\operatorname{Var}} \renewcommand{\corr}{\operatorname{Corr}} \renewcommand{\dd}{\mathrm{d}} \renewcommand{\bb}[1]{\mathbb{#1}} \renewcommand{\vv}[1]{\boldsymbol{#1}} \renewcommand{\rv}[1]{\mathsf{#1}} \renewcommand{\vrv}[1]{\vv{\rv{#1}}} \renewcommand{\disteq}{\stackrel{d}{=}} \renewcommand{\gvn}{\mid} \renewcommand{\mm}[1]{\mathrm{#1}} \renewcommand{\Ex}{\mathbb{E}} \renewcommand{\Pr}{\mathbb{P}}\]

Processes that resemble Gamma processes but which take vector or matrix values.

We start by considering trivial processes, whose index set is a single point, i.e. multivariate Gamma distributions. Here is the simplest multivariate case:

Bivariate thinned Gamma RVs

There are many possible ways of generating two correlated gamma variates, but as far as I am concerned the default is the Beta thinning method.

For this and throughout, we use the fact that \[\begin{aligned} \rv{g}&\sim\operatorname{Gamma}(a,\lambda)\\ &\Rightarrow\\ \Ex[\rv{g}]&=\frac{a}{\lambda}\\ \var[\rv{g}]&=\frac{a}{\lambda^2}\\ \Ex[\rv{g}^2]&=\frac{a^2 +a}{\lambda^2}. \end{aligned}\]

Suppose we have \(\rv{g}_1\sim\operatorname{Gamma}(a_1+a_2,\lambda)\), \(\rv{b}\sim\operatorname{Beta}(a_1, a_2)\) and \(\rv{g}'\sim\operatorname{Gamma}(a_2,\lambda)\), with all rvs jointly independent. The Beta thinning rule tells us that \(\rv{b}\rv{g}_1\sim\operatorname{Gamma}(a_1,\lambda)\). Now we generate \[\rv{g}_2=\rv{b}\rv{g}_1+\rv{g}',\] so that \(\rv{g}_2\sim\operatorname{Gamma}(a_1+a_2,\lambda)\) also. Then \[\begin{aligned}\Ex[\rv{g}_2] &=\Ex[\rv{b}]\Ex[\rv{g}_1]+\Ex[\rv{g}']\\ &=\frac{a_1}{\lambda}+\frac{a_2}{\lambda}=\frac{a_1+a_2}{\lambda} \end{aligned}\] and \[\begin{aligned}\Ex[\rv{g}_1\rv{g}_2] &=\Ex[\rv{g}_1(\rv{b}\rv{g}_1+\rv{g}')]\\ &=\Ex[\rv{g}_1^2\rv{b}]+\Ex[\rv{g}_1\rv{g}']\\ &=\Ex[\rv{g}_1^2]\Ex[\rv{b}]+\Ex[\rv{g}_1]\Ex[\rv{g}']\\ &=\frac{(a_1+a_2)^2 + (a_1+a_2)}{\lambda^2}\frac{a_1}{a_1+a_2}+\frac{a_1+a_2}{\lambda}\frac{a_2}{\lambda}\\ &=\frac{a_1(a_1+a_2) + a_1}{\lambda^2}+\frac{a_2(a_1+a_2)}{\lambda^2}\\ &=\frac{a_1^2+2a_1a_2+a_2^2 + a_1}{\lambda^2}, \end{aligned}\] and thus \[\begin{aligned}\operatorname{Cov}[\rv{g}_1,\rv{g}_2] &=\Ex[\rv{g}_1\rv{g}_2]-\Ex[\rv{g}_1]\Ex[\rv{g}_2]\\ &=\frac{a_1^2+2a_1a_2+a_2^2 + a_1-(a_1+a_2)^2}{\lambda^2}\\ &=\frac{a_1}{\lambda^2}. \end{aligned}\] Thence \[\begin{aligned}\corr[\rv{g}_1,\rv{g}_2] &=\frac{\operatorname{Cov}[\rv{g}_1,\rv{g}_2]}{\sqrt{\var[\rv{g}_1]\var[\rv{g}_2]}}\\ &=\frac{a_1/\lambda^2}{(a_1+a_2)/\lambda^2}\\ &=\frac{a_1}{a_1+a_2}. \end{aligned}\]
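This is easy to check by simulation. A minimal numpy sketch; the parameter values and sample size are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(42)
a1, a2, lam = 2.0, 3.0, 1.5          # illustrative parameters
n = 1_000_000

g1 = rng.gamma(shape=a1 + a2, scale=1 / lam, size=n)
b = rng.beta(a1, a2, size=n)
g_extra = rng.gamma(shape=a2, scale=1 / lam, size=n)

# Beta thinning: b * g1 ~ Gamma(a1, lam), so g2 ~ Gamma(a1 + a2, lam).
g2 = b * g1 + g_extra

print(np.corrcoef(g1, g2)[0, 1])     # should be close to a1/(a1+a2) = 0.4
```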

We can use this to generate more than two correlated Gamma variates by successively thinning them, as long as we remember that, in general, only independent Gamma variates with a common rate can be added together to produce a new Gamma variate.
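For instance, here is a sketch of a stationary chain built by repeated thinning. The decomposition \(a=\rho a+(1-\rho)a\) is my own illustrative choice: each \(\rv{g}_t\) is marginally \(\operatorname{Gamma}(a,\lambda)\), and by the bivariate calculation above \(\corr[\rv{g}_t,\rv{g}_{t+1}]=\rho\).

```python
import numpy as np

rng = np.random.default_rng(0)
a, lam, rho, T = 4.0, 2.0, 0.7, 100_000   # illustrative parameters

g = np.empty(T)
g[0] = rng.gamma(a, 1 / lam)
for t in range(1, T):
    # Thin the previous value down to Gamma(rho*a, lam), then top it up
    # with independent Gamma((1-rho)*a, lam) mass.
    g[t] = (rng.beta(rho * a, (1 - rho) * a) * g[t - 1]
            + rng.gamma((1 - rho) * a, 1 / lam))

print(np.corrcoef(g[:-1], g[1:])[0, 1])   # should be close to rho = 0.7
```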

Multivariate latent thinned Gamma RVs

A different construction.

Suppose we have \(K\) independent latent gamma variates \[ \rv{\zeta}_k \sim \operatorname{Gamma}(A, \lambda), \quad k=1,2,\dots,K. \] Let us also suppose we have a latent mixing matrix \(\mm{R}=[r_{d,k}]\in\mathbb{R}^{D\times K}\) whose entries are all non-negative and whose rows each sum to \(A.\) Now let us define new random variates \[ \rv{g}_d =\sum_{k=1}^K \rv{b}_{d,k}\rv{\zeta}_k, \quad d=1,2,\dots,D, \] where \(\rv{b}_{d,k}\sim\operatorname{Beta}(r_{d,k},A-r_{d,k})\), all independent. By the Beta thinning rule, \(\rv{b}_{d,k}\rv{\zeta}_k\sim\operatorname{Gamma}(r_{d,k},\lambda)\), and so by construction \[ \rv{g}_d \sim \operatorname{Gamma}(A, \lambda), \quad d=1,2,\dots,D. \] What can we say about the correlation between them?

For this, we need the fact that \[\begin{aligned} \rv{b}&\sim\operatorname{Beta}(\alpha,\beta)\\ &\Rightarrow\\ \Ex[\rv{b}]&=\frac{\alpha}{\alpha+\beta}\\ \var[\rv{b}]&=\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}\\ \Ex[\rv{b}^2]&=\frac{\alpha^2(\alpha+\beta+1)+\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} \end{aligned}\]

Expanding out the \(\rv{g}\)s, \[\begin{aligned}\Ex[\rv{g}_1\rv{g}_2] &=\Ex\left[\left(\sum_{k=1}^K \rv{b}_{1,k}\rv{\zeta}_k\right)\left( \sum_{j=1}^K \rv{b}_{2,j}\rv{\zeta}_j\right)\right]\\ &=\sum_{k=1}^K \sum_{j=1}^K \Ex\left[\rv{b}_{1,k}\rv{\zeta}_k\rv{b}_{2,j}\rv{\zeta}_j\right]\\ &=\sum_{k=1}^K \Ex[\rv{b}_{1,k}]\Ex[\rv{b}_{2,k}]\Ex[\rv{\zeta}_k^2] + \sum_{k=1}^K \sum_{j\neq k} \Ex[\rv{b}_{1,k}]\Ex[\rv{\zeta}_k]\Ex[\rv{b}_{2,j}]\Ex[\rv{\zeta}_j]\\ &=\frac{A^2+A}{\lambda^2 A^2}\sum_{k=1}^K r_{1,k}r_{2,k} + \sum_{k=1}^K \sum_{j\neq k} \frac{r_{1,k}}{A} \frac{A}{\lambda} \frac{r_{2,j}}{A} \frac{A}{\lambda} \\ &=\frac{1}{\lambda^2}\left(\frac{A+1}{A}\sum_{k=1}^K r_{1,k}r_{2,k} + \sum_{k=1}^K \sum_{j\neq k} r_{1,k}r_{2,j} \right)\\ &=\frac{1}{\lambda^2}\left(\frac{1}{A}\sum_{k=1}^K r_{1,k}r_{2,k} + \sum_{k=1}^K \sum_{j=1}^K r_{1,k}r_{2,j} \right)\\ &=\frac{\frac1A \vv{r}_1\cdot\vv{r}_2+A^2}{\lambda^2}, \end{aligned}\] using the joint independence of all the \(\rv{b}\)s and \(\rv{\zeta}\)s and the fact that each row of \(\mm{R}\) sums to \(A\).

So \[\begin{aligned}\operatorname{Cov}[\rv{g}_1,\rv{g}_2] &=\Ex[\rv{g}_1\rv{g}_2]-\Ex[\rv{g}_1]\Ex[\rv{g}_2]\\ &=\frac{ \vv{r}_1\cdot\vv{r}_2}{A\lambda^2} \end{aligned}\] and \[\begin{aligned}\corr[\rv{g}_1,\rv{g}_2] &=\frac{\operatorname{Cov}[\rv{g}_1,\rv{g}_2]}{\sqrt{\var[\rv{g}_1]\var[\rv{g}_2]}}\\ &=\frac{\vv{r}_1\cdot\vv{r}_2/(A\lambda^2)}{A/\lambda^2}\\ &=\frac{ \vv{r}_1\cdot\vv{r}_2}{A^2}. \end{aligned}\]
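Again, this is easy to check numerically. A sketch; \(A\), \(\lambda\), and the mixing matrix \(\mm{R}\) below are arbitrary, subject to each row of \(\mm{R}\) summing to \(A\) with entries in \((0, A)\):

```python
import numpy as np

rng = np.random.default_rng(1)
A, lam = 3.0, 1.0
R = np.array([[1.0, 1.5, 0.5],       # each row sums to A = 3
              [0.5, 0.5, 2.0]])
D, K = R.shape
n = 500_000

zeta = rng.gamma(A, 1 / lam, size=(n, K))   # latent Gamma(A, lam) factors
b = rng.beta(R, A - R, size=(n, D, K))      # b[:, d, k] ~ Beta(r_dk, A - r_dk)
g = np.einsum('ndk,nk->nd', b, zeta)        # g_d = sum_k b_dk * zeta_k

print(np.corrcoef(g[:, 0], g[:, 1])[0, 1])  # ~ R[0] @ R[1] / A**2 = 0.25
```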

After all that work, we find pretty much what we would have expected by analogy with rvs constructed by linearly weighting latent Gaussian factors. Was there a shortcut that could have gotten us there quicker?

(For one, we could have equivalently constructed this in terms of samples from a \(K\)-dimensional Lévy-Gamma process, but that is another story.)

Wolpert’s spatial construction

By choosing different intersections of spatial support, different correlation structures are available in the spatial Gamma processes of Wolpert and Ickstadt (1998).

Random correlated weights

I haven’t actually checked that this works for Gamma marginals, but it looks similar at least. Ranganath and Blei (2018) create random measures that, I think, can induce Gamma weights by imposing a Gaussian correlation structure on the random weights of a random selection of Poisson atoms.

Kingman constructions

Possibly a unifying framework for the previous two.

General multivariate Gamma distribution

How general can these be? Much effort has been spent on this question (Barndorff-Nielsen, Maejima, and Sato 2006; Barndorff-Nielsen, Pedersen, and Sato 2001; Buchmann et al. 2015; Mathai and Moschopoulos 1991, 1992; Pérez-Abreu and Stelzer 2014; Semeraro 2008; Singpurwalla and Youngren 1993).

We use the convention that the Fourier transform \(\widehat{\mu}\) of a measure \(\mu\) on \(\mathbb{M}=\mathbb{R}^{d}\) is given by \[ \widehat{\mu}(z)=\int_{\mathbb{M}} e^{i(z, x)} \mu(\mathrm{d} x), \quad z \in \mathbb{M}. \]

With this, Pérez-Abreu and Stelzer (2014) give a (maximally?) general multivariate Gamma distribution:

Let \(\mu\) be an infinitely divisible probability distribution on \(\mathbb{R}^{d}.\) If there exists a finite measure \(\alpha\) on the unit sphere \(\mathbf{S}_{\mathbb{R}^{d},\|\cdot\|}\) with respect to the norm \(\|\cdot\|\), equipped with the Borel \(\sigma\)-algebra, and a Borel-measurable function \(\lambda: \mathbf{S}_{\mathbb{R}^{d},\|\cdot\|} \rightarrow \mathbb{R}_{+}\) such that the Fourier transform of the measure is given by \[ \widehat{\mu}(z)=\exp \left(\int_{\mathbf{S}_{\mathbb{R}^{d},\|\cdot\|}} \int_{\mathbb{R}_{+}}\left(e^{i r v^{\top} z}-1\right) \frac{e^{-\lambda(v) r}}{r} \mathrm{~d} r\, \alpha(\mathrm{d} v)\right) \] for all \(z \in \mathbb{R}^{d}\), then \(\mu\) is called a \(d\)-dimensional Gamma distribution with parameters \(\alpha\) and \(\lambda\), abbreviated \(\Gamma_{d}(\alpha, \lambda)\) distribution. If \(\lambda\) is constant, we call \(\mu\) a \(\|\cdot\|\)-homogeneous \(\Gamma_{d}(\alpha, \lambda)\)-distribution.
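As a sanity check that this extends the scalar case: for \(d=1\) the unit sphere is \(\{-1,+1\}\). Take \(\alpha=a\delta_{+1}\) and constant \(\lambda(v)\equiv\lambda\). Then, by the Frullani integral \(\int_{0}^{\infty}\left(e^{irz}-1\right)e^{-\lambda r}r^{-1}\,\dd r=-\ln\left(1-iz/\lambda\right),\) \[ \widehat{\mu}(z)=\exp \left(a\int_{0}^{\infty}\left(e^{i r z}-1\right) \frac{e^{-\lambda r}}{r}\, \dd r\right)=\left(1-\frac{iz}{\lambda}\right)^{-a}, \] which is the characteristic function of the usual \(\operatorname{Gamma}(a,\lambda)\) distribution.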

Intriguingly, the choice of norm \(\|\cdot\|\) doesn’t matter much; changing the norm merely perturbs \(\lambda\) in a simple fashion.

It is not clear to me from that description which families of \(\lambda\)s and \(\alpha\)s are interesting.

We do know which ones are admissible, at least:

Let \(\alpha\) be a finite measure on \(\mathbf{S}_{\mathbb{R}^{d},\|\cdot\|}\) and \(\beta: \mathbf{S}_{\mathbb{R}^{d},\|\cdot\|} \rightarrow \mathbb{R}_{+}\) a measurable function. Then [the previous construction] defines a Lévy measure \(\nu_{\mu}\), and thus there exists a \(\Gamma_{d}(\alpha, \beta)\) probability distribution \(\mu\), if and only if \[ \int_{\mathbf{S}_{\mathbb{R}^{d},\|\cdot\|}} \ln \left(1+\frac{1}{\beta(v)}\right) \alpha(\mathrm{d} v)<\infty . \] Moreover, \(\int_{\mathbb{R}^{d}}(\|x\| \wedge 1) \nu_{\mu}(\mathrm{d} x)<\infty\) holds true. The condition is trivially satisfied if \(\beta\) is bounded away from zero \(\alpha\)-almost everywhere.

(Something is typeset oddly in that integral in the source; I read the domain as the unit sphere \(\mathbf{S}_{\mathbb{R}^{d},\|\cdot\|}\), as before. Note also that the rate function is here called \(\beta\) rather than \(\lambda\).)

Vector Gamma process

How can we turn a multivariate Gamma distribution into a vector-valued Gamma process?

An associated Lévy process is easy, since these distributions are infinitely divisible. Are there any Ornstein–Uhlenbeck-type processes?

Ornstein-Uhlenbeck Dirichlet process

TBD

Wishart processes

Wishart distributions are commonly claimed to generalise Gamma distributions, although AFAICT they are not so similar. “Wishart processes” are indeed a thing (Pfaffel 2012; Wilson and Ghahramani 2011), although the Wishart distribution is not a special case of, nor AFAICT the same as, the matrix Gamma distribution.

The common definition is not necessarily monotonic (element-wise, or in any other sense); I am presuming the marginals are stationary? It generalises the squared Bessel process, which is marginally \(\chi^2\)-distributed.

Inverse Wishart

Does the Inverse Wishart Process relate? (Shah, Wilson, and Ghahramani 2014; Tracey and Wolpert 2018) TODO

HDP Matrix Gamma Process

A matrix-valued analogue of the Lévy-Gamma process. See Meier, Kirch, and Meyer (2020, sec. 2), which uses the multivariate construction of Pérez-Abreu and Stelzer (2014). That construction is extremely general and somewhat abstract, and is usually tractable only through its Lévy measure.

AΓ Process

Meier, Kirch, and Meyer (2020) mention a construction, less general than the HDP Matrix Gamma, which is nonetheless useful:

A special case of the \(\operatorname{Gamma}_{d \times d}(\alpha, \lambda)\) distribution is the so-called \(A \Gamma\) distribution, which was considered in Pérez-Abreu and Stelzer (2014) and generalized to the Hpd setting in Meier (2018, sec. 2.4). To elaborate, the \(A \Gamma(\eta, \omega, \Sigma)\) distribution is defined, with parameters \(\eta>d-1,\) \(\omega>0\) and \(\Sigma \in \mathcal{S}_{d}^{+},\) as the \(\operatorname{Gamma}_{d \times d}\left(\alpha_{\eta, \Sigma}, \lambda_{\Sigma}\right)\) distribution, with \[ \alpha_{\eta, \boldsymbol{\Sigma}}(d \boldsymbol{U})=|\boldsymbol{\Sigma}|^{-\eta} \operatorname{tr}\left(\boldsymbol{\Sigma}^{-1} \boldsymbol{U}\right)^{-d \eta} \Gamma(d \eta) \tilde{\Gamma}_{d}(\eta)^{-1}|\boldsymbol{U}|^{\eta-d} d \boldsymbol{U}, \] where \(\Gamma\) denotes the Gamma function and \(\tilde{\Gamma}_{d}\) the complex multivariate Gamma function (see Mathai and Provost 2005), and \(\lambda_{\boldsymbol{\Sigma}}(\boldsymbol{U})=\operatorname{tr}\left(\boldsymbol{\Sigma}^{-1} \boldsymbol{U}\right)\). It has the advantage that for \(\boldsymbol{X} \sim A \Gamma(\eta, \omega, \Sigma)\), the formulas for mean and covariance structure are explicitly known: \[ \mathrm{E} \boldsymbol{X}=\frac{\omega}{d} \boldsymbol{\Sigma}, \quad \operatorname{Cov} \boldsymbol{X}=\frac{\omega}{d(\eta d+1)}\left(\eta \boldsymbol{I}_{d^{2}}+\boldsymbol{H}\right)(\boldsymbol{\Sigma} \otimes \boldsymbol{\Sigma}), \] where \(\boldsymbol{H}=\sum_{i, j=1}^{d} \boldsymbol{H}_{i, j} \otimes \boldsymbol{H}_{j, i}\) and \(\boldsymbol{H}_{i, j}\) is the matrix having a one at \((i, j)\) and zeros elsewhere; see Meier (2018, Lemma 2.8). Thus the \(A\Gamma\) distribution is particularly well suited for Bayesian prior modeling if the prior knowledge is given in terms of mean and covariance structure.
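Those moment formulas are concrete enough to evaluate directly. A numpy sketch; the parameter values are arbitrary, and I plug in a real symmetric positive-definite \(\Sigma\) for illustration even though the construction is stated for Hermitian matrices:

```python
import numpy as np

d, eta, omega = 3, 4.0, 2.0                 # illustrative parameters (eta > d - 1)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])         # SPD, standing in for a Hermitian Sigma

# H = sum_{i,j} H_ij (x) H_ji is the commutation matrix on d x d matrices.
H = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        E_ij = np.zeros((d, d)); E_ij[i, j] = 1.0
        E_ji = np.zeros((d, d)); E_ji[j, i] = 1.0
        H += np.kron(E_ij, E_ji)

mean_X = (omega / d) * Sigma
cov_X = omega / (d * (eta * d + 1)) * (eta * np.eye(d * d) + H) @ np.kron(Sigma, Sigma)

print(mean_X)                        # E X = (omega / d) * Sigma
print(np.allclose(cov_X, cov_X.T))   # Cov of vec(X) is symmetric, as it should be
```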

References

Barndorff-Nielsen, Ole E., Makoto Maejima, and Ken-Iti Sato. 2006. Some Classes of Multivariate Infinitely Divisible Distributions Admitting Stochastic Integral Representations.” Bernoulli 12 (1): 1–33.
Barndorff-Nielsen, Ole E., Jan Pedersen, and Ken-Iti Sato. 2001. Multivariate Subordination, Self-Decomposability and Stability.” Advances in Applied Probability 33 (1): 160–87.
Bladt, Mogens, and Bo Friis Nielsen. 2010. Multivariate Matrix-Exponential Distributions.” Stochastic Models 26 (1): 1–26.
Buchmann, Boris, Benjamin Kaehler, Ross Maller, and Alexander Szimayer. 2015. Multivariate Subordination Using Generalised Gamma Convolutions with Applications to V.G. Processes and Option Pricing.” arXiv:1502.03901 [Math, q-Fin], February.
Das, Sourish, and Dipak K. Dey. 2010. On Bayesian Inference for Generalized Multivariate Gamma Distribution.” Statistics & Probability Letters 80 (19-20): 1492–99.
Foti, Nicholas, Joseph Futoma, Daniel Rockmore, and Sinead Williamson. 2013. A Unifying Representation for a Class of Dependent Random Measures.” In Artificial Intelligence and Statistics, 20–28.
Griffin, J.E. 2011. The Ornstein–Uhlenbeck Dirichlet Process and Other Time-Varying Processes for Bayesian Nonparametric Inference.” Journal of Statistical Planning and Inference 141 (11): 3648–64.
Grigelionis, Bronius. 2013. Student’s t-Distribution and Related Stochastic Processes. SpringerBriefs in Statistics. Berlin, Heidelberg: Springer Berlin Heidelberg.
Grunwald, G K, R J Hyndman, and L M Tedesco. n.d. “A Unified View of Linear AR(1) Models,” 26.
Kirch, Claudia, Matthew C. Edwards, Alexander Meier, and Renate Meyer. 2019. Beyond Whittle: Nonparametric Correction of a Parametric Likelihood with a Focus on Bayesian Time Series Analysis.” Bayesian Analysis 14 (4): 1037–73.
Laverny, Oskar, Esterina Masiello, Véronique Maume-Deschamps, and Didier Rullière. 2021. Estimation of Multivariate Generalized Gamma Convolutions Through Laguerre Expansions.” arXiv:2103.03200 [Math, Stat], July.
Lawrence, Neil D., and Raquel Urtasun. 2009. Non-Linear Matrix Factorization with Gaussian Processes.” In Proceedings of the 26th Annual International Conference on Machine Learning, 601–8. ICML ’09. New York, NY, USA: ACM.
Liou, Jun-Jih, Yuan-Fong Su, Jie-Lun Chiang, and Ke-Sheng Cheng. 2011. Gamma Random Field Simulation by a Covariance Matrix Transformation Method.” Stochastic Environmental Research and Risk Assessment 25 (2): 235–51.
Mathai, A. M., and P. G. Moschopoulos. 1991. On a Multivariate Gamma.” Journal of Multivariate Analysis 39 (1): 135–53.
Mathai, A. M., and Serge B. Provost. 2005. Some Complex Matrix-Variate Statistical Distributions on Rectangular Matrices.” Linear Algebra and Its Applications, Tenth Special Issue (Part 2) on Linear Algebra and Statistics, 410 (November): 198–216.
Mathai, A. M., and P. G. Moschopoulos. 1992. A Form of Multivariate Gamma Distribution.” Annals of the Institute of Statistical Mathematics 44 (1): 97–106.
Meier, Alexander. 2018. A matrix Gamma process and applications to Bayesian analysis of multivariate time series.”
Meier, Alexander, Claudia Kirch, Matthew C. Edwards, and Renate Meyer. 2019. beyondWhittle: Bayesian Spectral Inference for Stationary Time Series (version 1.1.1).
Meier, Alexander, Claudia Kirch, and Renate Meyer. 2020. Bayesian Nonparametric Analysis of Multivariate Time Series: A Matrix Gamma Process Approach.” Journal of Multivariate Analysis 175 (January): 104560.
Pérez-Abreu, Victor, and Robert Stelzer. 2014. Infinitely Divisible Multivariate and Matrix Gamma Distributions.” Journal of Multivariate Analysis 130 (September): 155–75.
Pfaffel, Oliver. 2012. Wishart Processes.” arXiv:1201.3256 [Math], January.
Ranganath, Rajesh, and David M. Blei. 2018. Correlated Random Measures.” Journal of the American Statistical Association 113 (521): 417–30.
Rao, Vinayak, and Yee Whye Teh. 2009. “Spatial Normalized Gamma Processes.” In Proceedings of the 22nd International Conference on Neural Information Processing Systems, 1554–62. NIPS’09. Red Hook, NY, USA: Curran Associates Inc.
Sato, Ken-iti. 1999. Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press.
Semeraro, Patrizia. 2008. A Multivariate Variance Gamma Model for Financial Applications.” International Journal of Theoretical and Applied Finance 11 (01): 1–18.
Shah, Amar, Andrew Wilson, and Zoubin Ghahramani. 2014. Student-t Processes as Alternatives to Gaussian Processes.” In Artificial Intelligence and Statistics, 877–85. PMLR.
Sim, C.H. 1993. Generation of Poisson and Gamma Random Vectors with Given Marginals and Covariance Matrix.” Journal of Statistical Computation and Simulation 47 (1-2): 1–10.
Singpurwalla, Nozer D., and Mark A. Youngren. 1993. Multivariate Distributions Induced by Dynamic Environments.” Scandinavian Journal of Statistics 20 (3): 251–61.
Thibaux, Romain, and Michael I. Jordan. 2007. Hierarchical Beta Processes and the Indian Buffet Process.” In Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 564–71. PMLR.
Tracey, Brendan D., and David H. Wolpert. 2018. Upgrading from Gaussian Processes to Student’s-T Processes.” 2018 AIAA Non-Deterministic Approaches Conference, January.
Walker, Stephen G. 2021. On Infinitely Divisible Multivariate Gamma Distributions.” Communications in Statistics - Theory and Methods 0 (0): 1–7.
Warren, D. 1992. A Multivariate Gamma Distribution Arising from a Markov Model.” Stochastic Hydrology and Hydraulics 6 (3): 183–90.
Warren, David. 1986. Outflow Skewness in Non-Seasonal Linear Reservoirs with Gamma-Distributed Markovian Inflows.” Journal of Hydrology 85 (1): 127–37.
Wilson, Andrew Gordon, and Zoubin Ghahramani. 2011. Generalised Wishart Processes.” In Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, 736–44. UAI’11. Arlington, Virginia, United States: AUAI Press.
Wolpert, R., and Katja Ickstadt. 1998. Poisson/Gamma Random Field Models for Spatial Statistics.” Biometrika 85 (2): 251–67.
Wolpert, Robert L. 2021. Lecture Notes on Stationary Gamma Processes.” arXiv:2106.00087 [Math], May.
Xuan, Junyu, Jie Lu, Guangquan Zhang, Richard Yi Da Xu, and Xiangfeng Luo. 2015. Nonparametric Relational Topic Models Through Dependent Gamma Processes.” arXiv:1503.08542 [Cs, Stat], March.
