Delta methods, influence functions, and so on. Convolution theorems, local asymptotic minimax theorems.

A convenient feature of M-estimation, and especially maximum likelihood estimation, is the simple behaviour of estimators in the asymptotic large-sample-size limit, which can give you, e.g., variance estimates, or motivate information criteria, robust statistics, optimisation, etc.

In the most celebrated and convenient cases, asymptotic bounds are about normally-distributed errors, and these are typically derived through *Local Asymptotic Normality* theorems.
A simple and general introduction is given in Andersen et al. (1997), page 594, which applies to both i.i.d. data and dependent data in the form of point processes.
For all that it is widely applied, local asymptotic normality is still a stringent condition.
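
For orientation, here is the usual textbook statement of local asymptotic normality in the i.i.d. case. The notation is my own assumption, not necessarily that of Andersen et al.: \(P^n_\theta\) is the law of \(n\) observations, \(h\) is a fixed local perturbation, \(I_\theta\) is the Fisher information, and \(\Delta_{n,\theta}\) is the normalised score.

\[
\log \frac{dP^n_{\theta + h/\sqrt{n}}}{dP^n_{\theta}}
= h^\top \Delta_{n,\theta} - \tfrac{1}{2} h^\top I_\theta h + o_{P^n_\theta}(1),
\qquad
\Delta_{n,\theta} \rightsquigarrow \mathcal{N}(0, I_\theta).
\]

That is, near \(\theta\) the log likelihood ratio behaves like that of a Gaussian shift experiment, which is where the normally-distributed errors come from.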

## Fisher Information

Used in ML theory and kinda-sorta in robust estimation. A matrix that tells you how much a new datum affects your parameter estimates. (It is related, I am told, to garden-variety Shannon information, and when that non-obvious fact is more clear to me I shall expand how precisely this is so.) 🏗
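
A minimal numerical sketch of the "information per datum" reading, under assumptions of my own choosing: an Exponential model with rate \(\lambda\), whose per-observation Fisher information is \(1/\lambda^2\), so the maximum likelihood estimator \(1/\bar{x}\) should have variance close to \(\lambda^2/n\) for large \(n\). The function names are hypothetical and the model is only an illustration.

```python
import numpy as np


def fisher_information_exponential(rate):
    """Per-observation Fisher information of an Exponential(rate) model."""
    return 1.0 / rate**2


def simulate_mle_variance(rate, n, n_reps, seed=0):
    """Empirical variance of the rate MLE (1 / sample mean) over replications."""
    rng = np.random.default_rng(seed)
    # Each row is one simulated data set of size n.
    samples = rng.exponential(scale=1.0 / rate, size=(n_reps, n))
    mle = 1.0 / samples.mean(axis=1)
    return mle.var(ddof=1)


rate, n = 2.0, 500
empirical = simulate_mle_variance(rate, n, n_reps=4000)
# Asymptotic variance of the MLE: inverse of the total information n * I(rate).
asymptotic = 1.0 / (n * fisher_information_exponential(rate))

print(f"empirical variance of MLE:  {empirical:.5f}")
print(f"1 / (n * I(rate)):          {asymptotic:.5f}")
```

The two numbers should agree up to simulation noise and a small finite-sample bias. More information per datum means a tighter estimator.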

## Convolution Theorem

The unhelpfully-named convolution theorem of Hájek (1970).

Suppose \(\hat{\theta}\) is an efficient estimator of \(\theta\) and \(\tilde{\theta}\) is another, not fully efficient, estimator. The convolution theorem says that, if you rule out stupid exceptions, asymptotically \(\tilde{\theta} = \hat{\theta} + \varepsilon\) where \(\varepsilon\) is pure noise, independent of \(\hat{\theta}.\)

The reason that’s almost obvious is that if it weren’t true, there would be some information about \(\theta\) in \(\tilde{\theta}-\hat{\theta}\), and you could use this information to get a better estimator than \(\hat{\theta}\), which (by assumption) can’t happen. The stupid exceptions are things like the Hodges superefficient estimator that do better at a few values of \(\theta\) but much worse at neighbouring values.
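
The decomposition can be checked by simulation. This is a sketch under assumptions I have chosen for convenience: standard normal data, the sample mean as the efficient \(\hat{\theta}\), and the sample median as the less efficient \(\tilde{\theta}\). The function name is hypothetical. If \(\varepsilon = \tilde{\theta} - \hat{\theta}\) really is noise independent of \(\hat{\theta}\), then the two should be uncorrelated and their variances should add up to the variance of \(\tilde{\theta}\).

```python
import numpy as np


def mean_and_median(n, n_reps, seed=0):
    """Sample mean and sample median for n_reps normal data sets of size n."""
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=0.0, scale=1.0, size=(n_reps, n))
    return x.mean(axis=1), np.median(x, axis=1)


n = 400
theta_hat, theta_tilde = mean_and_median(n, n_reps=5000)
eps = theta_tilde - theta_hat  # the "pure noise" term

# Correlation between the efficient estimator and the leftover noise.
corr = np.corrcoef(theta_hat, eps)[0, 1]
# If the two are uncorrelated, variances add.
var_sum = theta_hat.var(ddof=1) + eps.var(ddof=1)
var_tilde = theta_tilde.var(ddof=1)

print(f"corr(theta_hat, eps):      {corr:+.4f}")
print(f"var(theta_hat) + var(eps): {var_sum:.6f}")
print(f"var(theta_tilde):          {var_tilde:.6f}")
```

In this Gaussian case the correlation should be near zero and the last two numbers should nearly coincide, with the median visibly noisier than the mean. This checks only uncorrelatedness, not full independence, and it is a single well-behaved model rather than evidence for the theorem in general.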

## References

*Statistical models based on counting processes*. Corr. 2. print. Springer series in statistics. New York, NY: Springer.

*Sankhyā: The Indian Journal of Statistics, Series A (1961-2002)* 39 (2): 101–23.

*International Statistical Review / Revue Internationale de Statistique* 62 (1): 133–65.

*The Annals of Probability* 32 (1): 730–56.

*Bernoulli* 1 (1/2): 17–39.

*Asymptotic Theory of Statistics and Probability*. Springer Texts in Statistics. New York: Springer New York.

*Stochastic Processes and Their Applications* 125 (4): 1195–1217.

*Advances in Applied Probability* 8 (4): 712–36.

*arXiv:1706.07180 [Cs, Math, Stat]*, June.

*Zeitschrift Für Wahrscheinlichkeitstheorie Und Verwandte Gebiete* 14 (4): 323–30.

*Selected Works of C.C. Heyde*, edited by Ross Maller, Ishwar Basawa, Peter Hall, and Eugene Seneta, 214–35. Selected Works in Probability and Statistics. Springer New York.

*The Annals of Statistics* 38 (3): 1478–1545.

*Limit Theorems for Stochastic Processes*. Vol. 288. Grundlehren Der Mathematischen Wissenschaften. Berlin, Heidelberg: Springer Berlin Heidelberg.

*arXiv:1610.01353 [Math, Stat]*, October.

*Biometrika* 83 (4): 875–90.

*Journal of Statistical Planning and Inference*, C.R. Rao 80th Birthday Felicitation Volume, Part IV, 114 (1–2): 45–61.

*Biometrika* 101 (1): 141–54.

*The Annals of Mathematical Statistics* 41 (3): 802–28.

*Bernoulli* 20 (4): 2020–38.

*Annals of the Institute of Statistical Mathematics* 30 (1): 243–61.

*Proceedings of the National Academy of Sciences of the United States of America* 83 (3): 541–45.

*The Econometrics Journal* 3 (2): 123–47.

*An Introduction to Matrix Concentration Inequalities*.
