Loosely, estimating a quantity by choosing it to be an extremum of some function of the data or, if that function is well-behaved enough, a zero of its derivative.
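To make the two readings concrete, here is a minimal sketch for a one-dimensional location parameter under squared-error loss. The helper names (`golden_section_min`, `bisect_root`, `objective`, `score`) and the toy data are illustrative assumptions, not anything from a particular library. Both routes should land on the sample mean.

```python
def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2


def bisect_root(g, lo, hi, tol=1e-10):
    """Find a zero of a nondecreasing g on [lo, hi], assuming g(lo) <= 0 <= g(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


data = [1.0, 2.0, 3.0, 4.0, 10.0]  # toy data, assumed for illustration


def objective(theta):
    # Sum of squared-error losses: the function whose extremum we seek.
    return sum((x - theta) ** 2 for x in data)


def score(theta):
    # Derivative of the objective in theta: the function whose zero we seek.
    return sum(2 * (theta - x) for x in data)


theta_min = golden_section_min(objective, min(data), max(data))
theta_root = bisect_root(score, min(data), max(data))
sample_mean = sum(data) / len(data)
```

The root-finding form is usually the more convenient one for theory, since the asymptotics are stated in terms of the estimating equation; the minimisation form is what most optimisers consume.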

Popular in machine learning, where loss-function-based methods are ubiquitous. In statistics we see this famously in maximum likelihood estimation, robust estimation, and least-squares fitting, for which M-estimation provides a unifying formalism with a convenient large-sample asymptotic theory.

πŸ— Discuss influence function motivation.

Implied density functions

Common loss functions imply a density when the estimation problem is read as maximum likelihood estimation.
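As a sketch of that correspondence for a location model, with $\rho$ denoting the loss and $p$ the implied density (both symbols are notation assumed here, not taken from the text above): minimising a sum of losses is the same as maximising a likelihood whose log-density is $-\rho$ up to a constant, provided $e^{-\rho}$ is integrable.

```latex
\begin{align*}
\hat\theta
  &= \arg\min_\theta \sum_{i=1}^n \rho(x_i - \theta)
   = \arg\max_\theta \prod_{i=1}^n p(x_i - \theta),
  & p(r) &= \frac{e^{-\rho(r)}}{\int e^{-\rho(s)}\,\mathrm{d}s}, \\
\rho(r) &= \tfrac{1}{2} r^2
  & \Rightarrow\quad p(r) &= \tfrac{1}{\sqrt{2\pi}}\, e^{-r^2/2}
  \quad\text{(Gaussian)}, \\
\rho(r) &= |r|
  & \Rightarrow\quad p(r) &= \tfrac{1}{2}\, e^{-|r|}
  \quad\text{(Laplace)}.
\end{align*}
```

The integrability condition matters: a bounded loss, as redescending robust losses are, gives an $e^{-\rho}$ that does not integrate to a finite value over the real line, so it has no proper implied density in this sense.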

I assume they did not invent this idea, but Davison and Ortiz (2019) point out that if you have a least-squares-compatible model, you can usually generalise it to any elliptical density, which includes Huber losses and many other robust losses as special cases.

Robust loss functions


Huber loss
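A minimal sketch of the usual textbook form, assuming residuals have already been scaled: quadratic for small residuals, linear beyond a threshold. The function names and the `delta` parameter name are illustrative; the default 1.345 is the conventional tuning constant that gives roughly 95% efficiency at the Gaussian.

```python
def huber_rho(r, delta=1.345):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = abs(r)
    if a <= delta:
        return 0.5 * r * r
    return delta * (a - 0.5 * delta)


def huber_psi(r, delta=1.345):
    """Derivative of the Huber loss: the residual clipped to [-delta, delta]."""
    return max(-delta, min(delta, r))
```

Read as a density, this loss corresponds to a Gaussian centre with Laplace-like tails.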

Hampel loss
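A minimal sketch of the three-part redescending form, again assuming pre-scaled residuals. The function names are illustrative, and the breakpoints `a=2, b=4, c=8` are one commonly seen choice, used here as an assumption rather than a recommendation. The derivative rises linearly, plateaus, descends linearly to zero, and stays at zero beyond `c`, so gross outliers have no influence at all.

```python
def hampel_psi(r, a=2.0, b=4.0, c=8.0):
    """Three-part redescending psi: linear, flat, descending, then zero."""
    x = abs(r)
    sign = 1.0 if r >= 0 else -1.0
    if x <= a:
        return r
    if x <= b:
        return a * sign
    if x <= c:
        return a * sign * (c - x) / (c - b)
    return 0.0


def hampel_rho(r, a=2.0, b=4.0, c=8.0):
    """Loss obtained by integrating hampel_psi from 0; constant beyond c."""
    x = abs(r)
    if x <= a:
        return 0.5 * x * x
    if x <= b:
        return a * x - 0.5 * a * a
    if x <= c:
        return 0.5 * a * (b + c - a) - a * (c - x) ** 2 / (2.0 * (c - b))
    return 0.5 * a * (b + c - a)
```

Because the loss is bounded, the objective is no longer convex, so the starting point of any optimiser matters more than it does for the Huber loss.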


Discuss representation (and implementation) in terms of weight functions for least-squares loss.
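As a placeholder for that discussion, here is a minimal sketch of the standard idea: writing the weight as w(r) = psi(r)/r turns the estimating equation into a weighted least-squares problem, which can be solved by iteratively reweighted least squares (IRLS). The example is for a location parameter with Huber weights and assumes the scale is known and equal to 1; the names `huber_weight` and `irls_location` and the toy data are illustrative.

```python
def huber_weight(r, delta=1.345):
    """Weight w(r) = psi(r) / r for the Huber loss (w = 1 at r = 0)."""
    a = abs(r)
    return 1.0 if a <= delta else delta / a


def irls_location(xs, weight=huber_weight, n_iter=100, tol=1e-12):
    """Location M-estimate by IRLS, assuming unit scale."""
    theta = sorted(xs)[len(xs) // 2]  # start from a (upper) median
    for _ in range(n_iter):
        ws = [weight(x - theta) for x in xs]
        # Weighted least squares for a location parameter is a weighted mean.
        new_theta = sum(w * x for w, x in zip(ws, xs)) / sum(ws)
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    return theta


data = [1.0, 2.0, 3.0, 4.0, 100.0]  # toy data with one gross outlier
theta_hat = irls_location(data)
plain_mean = sum(data) / len(data)
```

On this toy data the plain mean is dragged to 22 by the outlier, while the Huber estimate stays near the bulk of the data. In practice the scale would be estimated too, for example from the median absolute deviation, which this sketch deliberately leaves out.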


Mallows, Schweppe etc.



Advani, Madhu, and Surya Ganguli. 2016. “An Equivalence Between High Dimensional Bayes Optimal Inference and M-Estimation.” In Advances In Neural Information Processing Systems.
Barndorff-Nielsen, O. 1983. “On a Formula for the Distribution of the Maximum Likelihood Estimator.” Biometrika 70 (2): 343–65.
Bühlmann, Peter. 2014. “Robust Statistics.” In Selected Works of Peter J. Bickel, edited by Jianqing Fan, Ya’acov Ritov, and C. F. Jeff Wu, 51–98. Selected Works in Probability and Statistics 13. Springer New York.
DasGupta, Anirban. 2008. Asymptotic Theory of Statistics and Probability. Springer Texts in Statistics. New York: Springer New York.
Davison, Andrew J., and Joseph Ortiz. 2019. “FutureMapping 2: Gaussian Belief Propagation for Spatial AI.” arXiv:1910.14139 [Cs], October.
Donoho, David L., and Andrea Montanari. 2013. “High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing.” arXiv:1310.7320 [Cs, Math, Stat], October.
Geer, Sara van de. 2014. “Worst Possible Sub-Directions in High-Dimensional Models.” In arXiv:1403.7023 [Math, Stat]. Vol. 131.
Hampel, Frank R. 1974. “The Influence Curve and Its Role in Robust Estimation.” Journal of the American Statistical Association 69 (346): 383–93.
Hampel, Frank R., Elvezio M. Ronchetti, Peter J. Rousseeuw, and Werner A. Stahel. 2011. Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons.
Huber, Peter J. 1964. “Robust Estimation of a Location Parameter.” The Annals of Mathematical Statistics 35 (1): 73–101.
Kandasamy, Kirthevasan, Akshay Krishnamurthy, Barnabas Poczos, Larry Wasserman, and James M. Robins. 2014. “Influence Functions for Machine Learning: Nonparametric Estimators for Entropies, Divergences and Mutual Informations.” arXiv:1411.4342 [Stat], November.
Kümmel, Reiner. 1982. “The Impact of Energy on Industrial Growth.” Energy 7 (2): 189–203.
Markatou, Marianthi, Dimitrios Karlis, and Yuxin Ding. 2021. “Distance-Based Statistical Inference.” Annual Review of Statistics and Its Application 8 (1): 301–27.
Markatou, M., and E. Ronchetti. 1997. “3 Robust Inference: The Approach Based on Influence Functions.” In Handbook of Statistics, 15:49–75. Robust Inference. Elsevier.
Maronna, Ricardo Antonio. 1976. “Robust M-Estimators of Multivariate Location and Scatter.” The Annals of Statistics 4 (1): 51–67.
Mondal, Debashis, and Donald B. Percival. 2010. “M-Estimation of Wavelet Variance.” Annals of the Institute of Statistical Mathematics 64 (1): 27–53.
Ortiz, Joseph, Talfan Evans, and Andrew J. Davison. 2021. “A Visual Introduction to Gaussian Belief Propagation.” arXiv:2107.02308 [Cs], July.
Ronchetti, E. 2000. “Robust Regression Methods and Model Selection.” In Data Segmentation and Model Selection for Computer Vision, edited by Alireza Bab-Hadiashar and David Suter, 31–40. Springer New York.
Ronchetti, Elvezio. 1997. “Robust Inference by Influence Functions.” Journal of Statistical Planning and Inference, Robust Statistics and Data Analysis, Part I, 57 (1): 59–72.
Tharmaratnam, Kukatharmini, and Gerda Claeskens. 2013. “A Comparison of Robust Versions of the AIC Based on M-, S- and MM-Estimators.” Statistics 47 (1): 216–35.
Yang, Tao, Colin M. Gallagher, and Christopher S. McMahan. 2019. “A Robust Regression Methodology via M-Estimation.” Communications in Statistics - Theory and Methods 48 (5): 1092–1107.
