Extreme value theory

On the decay of awfulness with oftenness

January 13, 2020 — June 30, 2021


Pleasingly, there are only so many shapes that probability densities can assume as they head off toward infinity. Extreme value theory makes this notion precise and gives us some tools for working with those tails. An important application is understanding the kinds of heavy-tailed variables we can observe in nature.

See also densities and intensities, survival analysis.


Figure 1

\[\renewcommand{\var}{\operatorname{Var}} \renewcommand{\dd}{\mathrm{d}} \renewcommand{\bb}[1]{\mathbb{#1}} \renewcommand{\vv}[1]{\boldsymbol{#1}} \renewcommand{\rv}[1]{\mathsf{#1}} \renewcommand{\gvn}{\mid} \renewcommand{\Ex}{\mathbb{E}} \renewcommand{\Pr}{\mathbb{P}}\]

1 Tail limit theorems

The EVT result most useful for our purposes is the Pickands-Balkema-de Haan theorem (Balkema and de Haan 1974; Pickands III 1975).

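Write \(t_{T}\) for the right endpoint of a distribution \(T\), and \(T_{u}(t):=\Pr[\rv{t}-u\leq t\gvn \rv{t}>u]\) for the distribution of the excess of \(\rv{t}\sim T\) over a threshold \(u\). The theorem tells us that we can find a function \(\beta(u)\) such that \[\lim _{u \rightarrow t_{T}} \sup _{0 \leq t<t_{T}-u}\left|T_{u}(t)-G_{\nu, \beta(u),0}(t)\right|=0\] if (and only if) \(T\) is in the maximal domain of attraction of the extreme value distribution with parameter \(\nu\) for some \(\nu\in\bb{R}\). Here \(G_{\nu,\beta,\mu}\) is the generalized Pareto distribution (GPD) function defined below.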

This maximal domain of attraction (MDA) was introduced in the Fisher-Tippett theorem (Fisher and Tippett 1928), and is analysed in the EVT literature (e.g. Embrechts, Klüppelberg, and Mikosch 1997). It is pretty hard to find a distribution that does not belong to some MDA. I should try.
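To get a feel for the theorem, here is a minimal numerical sketch, not a proof. It takes a Student-t distribution with 3 degrees of freedom as a stand-in heavy-tailed \(T\), for which the tail index gives \(\nu=1/3\), and compares its excess distribution \(T_u(t)=\Pr[\rv{t}-u\leq t\gvn \rv{t}>u]\) with a GPD on a finite grid. The scale choice \(\beta(u)=u/3\), the thresholds, and the helper names `gpd_cdf`, `excess_cdf` and `sup_error` are all assumptions made for this illustration.

```python
import numpy as np
from scipy import stats


def gpd_cdf(t, nu, beta):
    """GPD distribution function G_{nu, beta, 0}(t), for nu != 0."""
    return 1.0 - (1.0 + nu * t / beta) ** (-1.0 / nu)


def excess_cdf(dist, u, t):
    """Excess distribution T_u(t) = P[T - u <= t | T > u] of a frozen scipy distribution."""
    return 1.0 - dist.sf(u + t) / dist.sf(u)


def sup_error(dist, u, nu, beta, n_grid=2001):
    """Approximate sup_t |T_u(t) - G_{nu, beta, 0}(t)| on a finite grid of t values."""
    t = np.linspace(0.0, 10.0 * u, n_grid)
    return float(np.max(np.abs(excess_cdf(dist, u, t) - gpd_cdf(t, nu, beta))))


df = 3.0                      # degrees of freedom of the example distribution
student = stats.t(df)
nu = 1.0 / df                 # tail index of a Student-t with df degrees of freedom
thresholds = [5.0, 20.0, 80.0]

# Assumed scale function for this example: beta(u) = u / df.
errors = [sup_error(student, u, nu, beta=u / df) for u in thresholds]

for u, err in zip(thresholds, errors):
    print(f"u = {u:5.1f}   approximate sup-error = {err:.2e}")
```

The printed errors should shrink as the threshold \(u\) grows, which is the behaviour the theorem describes. The grid maximum only approximates the supremum, so treat the numbers as indicative.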

Practically this means that for many purposes, the tails of a random variable \(\rv{t}\) may as well be assumed to be a GPD, \[\rv{t}\sim G_{\nu,\beta,\mu}(t) := 1-\left(1+\frac{\nu (t-\mu)}{\beta}\right)^{-1/\nu}.\] Write \(\bar{G}_{\nu,\beta,\mu}:=1-G_{\nu,\beta,\mu}\) for the corresponding survival function. Then for \(t> s\geq 0\) and assuming that \(\bar{G}_{\nu,\beta,\mu}(s)>0,\) the survival probability over an interval \((s,t]\) is \[\begin{aligned} \Pr[\rv{t}\geq t\gvn \rv{t}> s] &=\frac{\Pr[\rv{t}\geq t\cap \rv{t}> s]}{\Pr[\rv{t}> s]}\\ &=\frac{\Pr[\rv{t}\geq t]}{\Pr[\rv{t}> s]}\\ &=\frac{\bar{G}_{\nu,\beta,\mu}(t)}{\bar{G}_{\nu,\beta,\mu}(s)}\\ &=\frac{\left(1+\frac{\nu (t-\mu)}{\beta}\right)^{-1/\nu}}{\left(1+\frac{\nu (s-\mu)}{\beta}\right)^{-1/\nu}}\\ &=\left(\frac{\beta+\nu (s-\mu)}{\beta+\nu (t-\mu)}\right)^{1/\nu}.\label{eq:gpd-survival-prob} \end{aligned}\]
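As a sanity check on the closed form above, the following sketch evaluates it for one arbitrary parameter choice and compares it with the ratio of survival functions from `scipy.stats.genpareto`, whose shape parameter `c` plays the role of \(\nu\) here. The function name `gpd_conditional_survival` and the parameter values are illustrative assumptions, and the code assumes \(\nu\neq 0\) with both \(s\) and \(t\) inside the support.

```python
from scipy import stats


def gpd_conditional_survival(t, s, nu, beta, mu=0.0):
    """Closed form for P[T >= t | T > s] when T ~ GPD(nu, beta, mu), nu != 0."""
    return ((beta + nu * (s - mu)) / (beta + nu * (t - mu))) ** (1.0 / nu)


# Arbitrary illustrative parameters (assumptions, not taken from any data).
nu, beta, mu = 0.5, 2.0, 0.0
s, t = 1.0, 4.0

closed_form = gpd_conditional_survival(t, s, nu, beta, mu)

# Same quantity as a ratio of survival functions, using scipy's GPD.
gpd = stats.genpareto(c=nu, loc=mu, scale=beta)
ratio = gpd.sf(t) / gpd.sf(s)

print(closed_form, ratio)
```

For these values the closed form is \((2.5/4)^{2}=0.390625\), and the scipy ratio should agree up to floating-point error.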


2 Generalized Pareto Distribution

Figure 2

Best intro from Hosking and Wallis (1987):

The generalized Pareto distribution is the distribution of a random variable \(\rv{x}\) defined by \(\rv{x}= \alpha\left(1-e^{-k \rv{y}}\right) / k,\) where \(\rv{y}\) is a random variable with the standard exponential distribution. The generalized Pareto distribution has distribution function

\[ \begin{aligned} F(x) &=1-(1-k x / \alpha)^{1 / k}, & & k \neq 0 \\ &=1-\exp (-x / \alpha), & & k=0 \end{aligned} \] and density function

\[ \begin{aligned} f(x) &=\alpha^{-1}(1-k x / \alpha)^{1 / k-1}, & & k \neq 0 \\ &=\alpha^{-1} \exp (-x / \alpha), & & k=0 \end{aligned} \] the range of \(x\) is \(0 \leq x<\infty\) for \(k \leq 0\) and \(0 \leq x \leq \alpha / k\) for \(k>0 .\) The parameters of the distribution are \(\alpha,\) the scale parameter, and \(k,\) the shape parameter. The special cases \(k=0\) and \(k=1\) yield, respectively, the exponential distribution with mean \(\alpha\) and the uniform distribution on \([0, \alpha] ;\) Pareto distributions are obtained when \(k<0 .\)


  1. The failure rate \(r(x)=f(x) /\{1-F(x)\}\) is given by \(r(x)=1 /(\alpha-k x)\) and is monotonic in \(x,\) decreasing if \(k<0,\) constant if \(k=0,\) and increasing if \(k>0\)
  2. If the random variable \(X\) has a generalized Pareto distribution, then the conditional distribution of \(X-t\) given \(X \geq t\) is also generalized Pareto, with the same value of \(k\)
  3. Let \(Z=\max \left(0, X_{1}, \ldots, X_{N}\right),\) where the \(X_{i}\) are independent and identically distributed as (1) and \(N\) has a Poisson distribution. Then \(Z\) has, essentially, a generalized extreme value (GEV) distribution as defined by Jenkinson (1955); that is, there exist quantities \(\beta, \gamma,\) and \(\delta,\) independent of \(z,\) such that

\[ \begin{aligned} F_{Z}(z) &=\operatorname{Pr}(Z \leq z) \\ &=\exp \left[-\{1-\delta(z-\gamma) / \beta\}^{1 / \delta}\right], \quad z \geq 0 \end{aligned} \] furthermore, \(\delta=k ;\) that is, the shape parameters of the GEV and the GPD are equal.
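The quoted definition and properties can be checked numerically. Note the sign convention: the \(k\) of Hosking and Wallis corresponds to \(-\nu\) in the parameterisation of the previous section, with \(\alpha=\beta\) and \(\mu=0\). The sketch below is my own illustration rather than anything from the paper. The parameter values, the Poisson mean `lam`, and the closed forms \(\beta=\alpha\lambda^{-k}\) and \(\gamma=\alpha(1-\lambda^{-k})/k\) for property 3 are assumptions of this example, obtained by matching \(\exp\{-\lambda(1-F(z))\}\) to the GEV form.

```python
import numpy as np
from scipy import stats


def hw_cdf(x, alpha, k):
    """Hosking-Wallis GPD distribution function F(x), for k != 0."""
    return 1.0 - (1.0 - k * x / alpha) ** (1.0 / k)


def hw_pdf(x, alpha, k):
    """Hosking-Wallis GPD density f(x), for k != 0."""
    return (1.0 - k * x / alpha) ** (1.0 / k - 1.0) / alpha


alpha, k = 2.0, -0.25            # k < 0 is the heavy-tailed (Pareto) case
x = np.linspace(0.0, 20.0, 201)

# Definition: x = alpha (1 - exp(-k y)) / k maps a standard exponential y to a
# GPD variable, so F(x(y)) should equal the exponential CDF 1 - exp(-y).
y = np.linspace(0.0, 5.0, 101)
x_of_y = alpha * (1.0 - np.exp(-k * y)) / k
transform_gap = np.max(np.abs(hw_cdf(x_of_y, alpha, k) - (1.0 - np.exp(-y))))

# Property 1: the failure rate f / (1 - F) equals 1 / (alpha - k x).
rate = hw_pdf(x, alpha, k) / (1.0 - hw_cdf(x, alpha, k))
rate_gap = np.max(np.abs(rate - 1.0 / (alpha - k * x)))

# Property 2: X - t given X >= t is GPD with the same k and scale alpha - k t.
t = 3.0
excess = 1.0 - (1.0 - hw_cdf(t + x, alpha, k)) / (1.0 - hw_cdf(t, alpha, k))
excess_gap = np.max(np.abs(excess - hw_cdf(x, alpha - k * t, k)))

# Property 3: with N ~ Poisson(lam), P(Z <= z) = sum_n P(N = n) F(z)^n,
# compared against the GEV form with delta = k and the assumed beta, gamma.
lam = 7.0
delta = k
beta = alpha * lam ** (-k)
gamma = alpha * (1.0 - lam ** (-k)) / k
n = np.arange(200)               # truncation of the Poisson sum
weights = stats.poisson.pmf(n, lam)
poisson_max_cdf = (weights[None, :] * hw_cdf(x, alpha, k)[:, None] ** n[None, :]).sum(axis=1)
gev_cdf = np.exp(-((1.0 - delta * (x - gamma) / beta) ** (1.0 / delta)))
gev_gap = np.max(np.abs(poisson_max_cdf - gev_cdf))

print(transform_gap, rate_gap, excess_gap, gev_gap)
```

All four gaps should be at the level of floating-point noise. The check of property 3 only covers \(z\geq 0\), where \(Z\) has an atom at zero of mass \(e^{-\lambda}\); that is presumably what "essentially" refers to in the quotation.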

3 Generalized Extreme Value distributions


4 Burr distribution


5 References

Balkema, and de Haan. 1974. “Residual Life Time at Great Age.” The Annals of Probability.
Beranger, Stephenson, and Sisson. 2021. “High-Dimensional Inference Using the Extremal Skew-t Process.” Extremes.
Bhatti, Hussain, Ahmad, et al. 2018. “Efficient Estimation of Pareto Model: Some Modified Percentile Estimators.” PLOS ONE.
Castillo, and Hadi. 1997. “Fitting the Generalized Pareto Distribution to Data.” Journal of the American Statistical Association.
Charpentier, and Flachaire. 2019. “Pareto Models for Risk Management.” arXiv:1912.11736 [Econ, Stat].
Dargahi-Noubary. 1989. “On Tail Estimation: An Improved Method.” Mathematical Geology.
Davison. 1984. “Modelling Excesses over High Thresholds, with an Application.” In Statistical Extremes and Applications. NATO ASI Series.
Embrechts, Klüppelberg, and Mikosch. 1997. Extremal Events in Finance and Insurance.
Embrechts, Klüppelberg, and Mikosch. 1997. “Risk Theory.” In Modelling Extremal Events. Applications of Mathematics 33.
Fisher, and Tippett. 1928. “Limiting Forms of the Frequency Distribution of the Largest or Smallest Member of a Sample.” Mathematical Proceedings of the Cambridge Philosophical Society.
Ghitany, Gómez-Déniz, and Nadarajah. 2018. “A New Generalization of the Pareto Distribution and Its Application to Insurance Data.” Journal of Risk and Financial Management.
Giesbrecht, and Kempthorne. 1976. “Maximum Likelihood Estimation in the Three-Parameter Lognormal Distribution.” Journal of the Royal Statistical Society: Series B (Methodological).
Grimshaw. 1993. “Computing Maximum Likelihood Estimates for the Generalized Pareto Distribution.” Technometrics.
Hosking, and Wallis. 1987. “Parameter and Quantile Estimation for the Generalized Pareto Distribution.” Technometrics.
Hüsler, Li, and Raschke. 2011. “Estimation for the Generalized Pareto Distribution Using Maximum Likelihood and Goodness of Fit.” Communications in Statistics - Theory and Methods.
Lee, and Kim. 2019. “Exponentiated Generalized Pareto Distribution: Properties and Applications Towards Extreme Value Theory.” Communications in Statistics - Theory and Methods.
Makarov. 2006. “Extreme Value Theory and High Quantile Convergence.” The Journal of Operational Risk.
Markovitch, and Krieger. 2002. “The Estimation of Heavy-Tailed Probability Density Functions, Their Mixtures and Quantiles.” Computer Networks.
McNeil, Alexander J. 1997. “Estimating the Tails of Loss Severity Distributions Using Extreme Value Theory.” ASTIN Bulletin: The Journal of the IAA.
McNeil, Alexander J, Frey, and Embrechts. 2005. Quantitative Risk Management: Concepts, Techniques and Tools.
Mueller. 2018. “Refining the Central Limit Theorem Approximation via Extreme Value Theory.” arXiv:1802.00762 [Math].
Naveau, Hannart, and Ribes. 2020. “Statistical Methods for Extreme Event Attribution in Climate Science.” Annual Review of Statistics and Its Application.
Nolde, and Zhou. 2021. “Extreme Value Analysis for Financial Risk Management.” Annual Review of Statistics and Its Application.
Pickands III. 1975. “Statistical Inference Using Extreme Order Statistics.” The Annals of Statistics.
Smith. 1985. “Maximum Likelihood Estimation in a Class of Nonregular Cases.” Biometrika.
Vajda. 1951. “Analytical Studies in Stop-Loss Reinsurance.” Scandinavian Actuarial Journal.
Wong, and Li. 2006. “A Note on the Estimation of Extreme Value Distributions Using Maximum Product of Spacings.” In Institute of Mathematical Statistics Lecture Notes - Monograph Series.
Zhao, Zhang, Cheng, et al. 2019. “A New Parameter Estimator for the Generalized Pareto Distribution Under the Peaks over Threshold Framework.” Mathematics.