Order statistics

2019-02-20 — 2020-03-17

Wherein the ranks of a sample are defined, and a representation for sums of the top-k of iid exponential variates via quantile transforms is exhibited, and connections to maxima and copulas are sketched.

ordinal

probability

regression

statistics

For a sample of independent observations \(X_{1}, X_{2}, \ldots, X_{n}\) with common distribution \(F\), the ordered sample values

\[X_{(1)} \leq X_{(2)} \leq \cdots \leq X_{(n)}\]

are called the order statistics. The rank of an observation \(X_{j}\) is the index \(i\) for which \(X_{j} = X_{(i)}\).
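To make the definition concrete, here is a minimal sketch in plain Python. The function names `order_statistics` and `ranks` are my own, not from any library, and I assume there are no ties (as holds almost surely for continuous \(F\)); ties, if present, are broken by position in the sample.

```python
def order_statistics(sample):
    """Return the sample sorted ascending: X_(1) <= X_(2) <= ... <= X_(n)."""
    return sorted(sample)


def ranks(sample):
    """Return 1-based ranks r such that sample[i] is the r[i]-th smallest value.

    Assumption: ties are broken by original position (sorted() is stable).
    """
    # Indices of the sample, ordered by the value they point at.
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    r = [0] * len(sample)
    for position, i in enumerate(order, start=1):
        r[i] = position
    return r


sample = [2.5, 0.1, 1.7, 3.0]
ordered = order_statistics(sample)  # [0.1, 1.7, 2.5, 3.0]
sample_ranks = ranks(sample)        # [3, 1, 2, 4]
```

By construction, `ordered[sample_ranks[i] - 1] == sample[i]` for every index `i`.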

Todo: connections to maximum processes, learning to rank, the simplex…

Hung Chen’s notes are good.

Gwern did some fun engineering of order statistics, which edges around some general properties of joint maximal statistics of elliptical copulas.

My one weird trick in this domain concerns sums of the top \(k\) of \(n\) i.i.d. exponential random variables, which turn out to have a simple representation in terms of \(k\) random exponentials (Nagaraja 2006). The magic is that quantile transforms turn this into a cheap and very general method for handling order statistics of i.i.d. variables.
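Here is a sketch of how I understand the trick to work in simulation. It uses the classical Rényi representation, which may differ in form from the statement in Nagaraja (2006): if \(Z_{1}, \ldots, Z_{k}\) are i.i.d. standard exponentials, then \(S_{i} = \sum_{j=1}^{i} Z_{j}/(n-j+1)\) is distributed as the \(i\)-th smallest of \(n\) i.i.d. standard exponentials. Since \(e^{-S}\) is decreasing, \(e^{-S_{i}}\) is distributed as the \(i\)-th largest of \(n\) i.i.d. uniforms, and pushing that through a quantile function \(F^{-1}\) gives the \(i\)-th largest of \(n\) i.i.d. draws from \(F\). The names `top_k_uniform`, `top_k` and `exp_quantile` are hypothetical helpers, and only the standard library is assumed.

```python
import math
import random


def top_k_uniform(n, k, rng=random):
    """Sample the k largest of n i.i.d. Uniform(0, 1) variates, largest first.

    Uses only k exponential draws, via the Renyi representation.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    tops = []
    s = 0.0
    for j in range(1, k + 1):
        # s is now distributed as the j-th smallest of n standard exponentials.
        s += rng.expovariate(1.0) / (n - j + 1)
        # exp(-s) is then distributed as the j-th largest of n uniforms.
        tops.append(math.exp(-s))
    return tops


def top_k(n, k, quantile, rng=random):
    """Sample the k largest of n i.i.d. draws from the law with this quantile function."""
    return [quantile(u) for u in top_k_uniform(n, k, rng)]


def exp_quantile(u):
    """Quantile function of the standard exponential distribution."""
    return -math.log1p(-u)


# Usage: the sum of the top 3 of 10 standard exponentials, from 3 draws.
rng = random.Random(42)
top_sum = sum(top_k(10, 3, exp_quantile, rng))
```

The cost is \(O(k)\) rather than \(O(n \log n)\) for sorting a full sample. One caveat of this sketch: for very large \(n\) the uniforms sit close to 1, so a quantile function parameterised by the upper tail probability would be numerically safer.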

1 References

Nagaraja. 2006. “Order Statistics from Independent Exponential Random Variables and the Sum of the Top Order Statistics.” In Advances in Distribution Theory, Order Statistics, and Inference.