M-estimation

July 11, 2016 – February 17, 2022

Loosely, estimating a quantity by choosing it to be the extremum of a function or, if that function is well-behaved enough, a zero of its derivative.
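In symbols, one common way to write this down is sketched below. The notation is an assumption for illustration, not something fixed elsewhere on this page: observations $x_1, \dots, x_n$, parameter $\theta$, loss $\rho$ and its derivative $\psi$.

```latex
\[
\hat{\theta} = \operatorname*{arg\,min}_{\theta} \sum_{i=1}^{n} \rho(x_i, \theta),
\qquad \text{or, when } \rho \text{ is differentiable in } \theta, \qquad
\sum_{i=1}^{n} \psi(x_i, \hat{\theta}) = 0,
\quad \psi(x, \theta) = \frac{\partial \rho(x, \theta)}{\partial \theta}.
\]
```

Taking $\rho(x, \theta) = -\log p(x \mid \theta)$ recovers maximum likelihood, and $\rho(x, \theta) = (x - \theta)^2$ recovers the least-squares estimate of a location, which is the sample mean.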

Popular in machine learning, where loss-function-based methods are ubiquitous. In statistics we see this famously in maximum likelihood estimation, robust estimation and least-squares estimation, for which M-estimation provides a unifying formalism with a convenient large-sample asymptotic theory.

πŸ— Discuss influence function motivation.

1 Implied density functions

Common loss functions imply a density when the estimation problem is read as a maximum likelihood estimation problem.
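A sketch of the correspondence, assuming a loss $\rho$ whose exponential is integrable with a normalising constant $Z$ that does not depend on $\theta$ (the symbols here are illustrative assumptions):

```latex
\[
p(x \mid \theta) = \frac{1}{Z} \exp\{-\rho(x, \theta)\}
\quad \Longrightarrow \quad
\operatorname*{arg\,max}_{\theta} \sum_{i=1}^{n} \log p(x_i \mid \theta)
= \operatorname*{arg\,min}_{\theta} \sum_{i=1}^{n} \rho(x_i, \theta).
\]
```

For example, $\rho(x, \theta) = (x - \theta)^2 / 2$ corresponds to a unit-variance Gaussian density, and $\rho(x, \theta) = |x - \theta|$ to a Laplace density.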

I assume they did not invent this idea, but Davison and Ortiz (2019) point out that if you have a least-squares-compatible model, you can usually generalise it to any elliptical density, which includes Huber losses and many other robust losses as special cases.

2 Robust loss functions

πŸ—

2.1 Huber loss
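
For reference, the usual textbook form of the Huber loss on a residual $r$ with threshold $\delta > 0$ is sketched below, together with its derivative. The symbols $r$, $\delta$, $\rho_\delta$ and $\psi_\delta$ are assumed here for illustration.

```latex
\[
\rho_\delta(r) =
\begin{cases}
\tfrac{1}{2} r^2, & |r| \le \delta, \\
\delta |r| - \tfrac{1}{2} \delta^2, & |r| > \delta,
\end{cases}
\qquad
\psi_\delta(r) = \rho_\delta'(r) = \max\{-\delta, \min(\delta, r)\}.
\]
```

The loss is quadratic near zero and linear in the tails, so large residuals have bounded influence on the estimate.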

2.2 Hampel loss

3 Fitting

Discuss representation (and implementation) in terms of weight functions for least-squares loss.
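
One way to make the weight-function idea concrete is iteratively reweighted least squares (IRLS): each step solves a weighted least-squares problem with weights $w(r) = \psi(r)/r$ computed from the current residuals. The following is a minimal sketch under stated assumptions, not a reference implementation: it uses Huber weights, the conventional tuning constant 1.345, and a scale re-estimated from the normalised median absolute deviation. The function names `huber_weights` and `irls_huber` are hypothetical.

```python
import numpy as np


def huber_weights(r, delta=1.345):
    """IRLS weights psi(r)/r for the Huber loss: 1 inside [-delta, delta], delta/|r| outside."""
    a = np.abs(r)
    # np.maximum guards against division by zero in the unused branch of np.where.
    return np.where(a <= delta, 1.0, delta / np.maximum(a, 1e-12))


def irls_huber(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Fit a linear model y ~ X @ beta by IRLS with Huber weights (illustrative sketch)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale: normalised median absolute deviation (an assumed choice).
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        w = huber_weights(r / scale, delta)
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by the square-root weights.
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Synthetic data (made up for this example): a line with 10% gross outliers.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-3, 3, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(n)
y[:20] += 30.0

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_hub = irls_huber(X, y)
print("OLS:  ", beta_ols)
print("Huber:", beta_hub)
```

In this setup the ordinary least-squares intercept is pulled strongly towards the outliers, while the reweighted fit should stay close to the generating coefficients (1, 2). Swapping in a different weight function changes the loss without changing the solver.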

4 GM-estimators

Mallows, Schweppe etc.

πŸ—

5 References

Advani, and Ganguli. 2016. “An Equivalence Between High Dimensional Bayes Optimal Inference and M-Estimation.” In Advances In Neural Information Processing Systems.
Barndorff-Nielsen. 1983. “On a Formula for the Distribution of the Maximum Likelihood Estimator.” Biometrika.
Bühlmann. 2014. “Robust Statistics.” In Selected Works of Peter J. Bickel. Selected Works in Probability and Statistics 13.
DasGupta. 2008. Asymptotic Theory of Statistics and Probability. Springer Texts in Statistics.
Davison, and Ortiz. 2019. “FutureMapping 2: Gaussian Belief Propagation for Spatial AI.” arXiv:1910.14139 [Cs].
Donoho, and Montanari. 2013. “High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing.” arXiv:1310.7320 [Cs, Math, Stat].
Hampel. 1974. “The Influence Curve and Its Role in Robust Estimation.” Journal of the American Statistical Association.
Hampel, Ronchetti, Rousseeuw, et al. 2011. Robust Statistics: The Approach Based on Influence Functions.
Huber. 1964. “Robust Estimation of a Location Parameter.” The Annals of Mathematical Statistics.
Kandasamy, Krishnamurthy, Poczos, et al. 2014. “Influence Functions for Machine Learning: Nonparametric Estimators for Entropies, Divergences and Mutual Informations.” arXiv:1411.4342 [Stat].
Kümmel. 1982. “The Impact of Energy on Industrial Growth.” Energy.
Markatou, Marianthi, Karlis, and Ding. 2021. “Distance-Based Statistical Inference.” Annual Review of Statistics and Its Application.
Markatou, M., and Ronchetti. 1997. “3 Robust Inference: The Approach Based on Influence Functions.” In Handbook of Statistics. Robust Inference.
Maronna. 1976. “Robust M-Estimators of Multivariate Location and Scatter.” The Annals of Statistics.
Mondal, and Percival. 2010. “M-Estimation of Wavelet Variance.” Annals of the Institute of Statistical Mathematics.
Ortiz, Evans, and Davison. 2021. “A Visual Introduction to Gaussian Belief Propagation.” arXiv:2107.02308 [Cs].
Ronchetti, Elvezio. 1997. “Robust Inference by Influence Functions.” Journal of Statistical Planning and Inference, Robust Statistics and Data Analysis, Part I.
Ronchetti, E. 2000. “Robust Regression Methods and Model Selection.” In Data Segmentation and Model Selection for Computer Vision.
Tharmaratnam, and Claeskens. 2013. “A Comparison of Robust Versions of the AIC Based on M-, S- and MM-Estimators.” Statistics.
van de Geer. 2014. “Worst Possible Sub-Directions in High-Dimensional Models.” In arXiv:1403.7023 [Math, Stat].
Yang, Gallagher, and McMahan. 2019. “A Robust Regression Methodology via M-Estimation.” Communications in Statistics - Theory and Methods.