(Outlier) robust statistics

Terminology note: I mean robust statistics in the sense of Huber, which is, informally, outlier robustness.

There are also robust estimators in econometrics; there, “robust” means something about good behaviour under heteroskedastic and/or correlated errors. Robust Bayes means something about inference that is robust to the choice of prior (which could overlap but is a rather different emphasis).

Outlier robustness is AFAICT more-or-less a frequentist project. Bayesian approaches seem to achieve robustness largely by choosing heavy-tailed priors or heavy-tailed noise distributions where they might have chosen light-tailed ones, e.g. Laplace distributions instead of Gaussian ones. Such heavy-tailed distributions may have arbitrary prior parameters, but no more arbitrary than usual in Bayesian statistics, and so they do not provoke the need to wash away the guilt that frequentists seem to feel.

One can of course use heavy-tailed noise distributions in frequentist inference as well, and that will buy a kind of robustness. That seems to be unpopular because it makes frequentist inference as difficult as Bayesian inference.
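A minimal sketch of the heavy-tailed-noise trick for a location parameter, using only the Python standard library. The maximum likelihood estimate of location under Gaussian noise is the sample mean, while under Laplace noise it is the sample median, so swapping the noise model changes how hard one wild observation can pull on the estimate. The data are made up for illustration.

```python
import statistics

# Hypothetical sample: nine well-behaved readings plus one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 100.0]

# Location MLE under Gaussian noise: the sample mean.
gaussian_mle = statistics.fmean(data)

# Location MLE under Laplace noise: the sample median.
laplace_mle = statistics.median(data)

print("Gaussian-noise estimate:", gaussian_mle)
print("Laplace-noise estimate: ", laplace_mle)
```

The single outlier drags the Gaussian-noise estimate far from the bulk of the data, while the Laplace-noise estimate stays with the bulk.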

Corruption models

  • Random (mixture) corruption
  • (Adversarial) total variation \(\epsilon\)-corruption.
  • Wasserstein corruption models (does one usually assume adversarial or random corruption here?), as seen in “distributionally robust” models.
  • other?
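To make the first two corruption models in the list above concrete, here is a rough simulation sketch. `mixture_corrupt` and `adversarial_corrupt` are hypothetical helper names. The adversary shown is one simple sample-level reading of ε-corruption (inspect the sample, then overwrite an ε-fraction of it), and this particular adversary is an illustrative choice, not a worst case.

```python
import random
import statistics

def mixture_corrupt(clean, eps, draw_outlier, rng):
    """Random (mixture) corruption: each point is independently replaced,
    with probability eps, by a draw from a contaminating distribution."""
    return [draw_outlier(rng) if rng.random() < eps else x for x in clean]

def adversarial_corrupt(clean, eps, value):
    """A simple adversary: look at the sample, then overwrite the
    eps-fraction of smallest points with a chosen value."""
    k = int(eps * len(clean))
    order = sorted(range(len(clean)), key=lambda i: clean[i])
    corrupted = list(clean)
    for i in order[:k]:
        corrupted[i] = value
    return corrupted

rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(1000)]
mixed = mixture_corrupt(clean, 0.1, lambda r: r.gauss(20.0, 1.0), rng)
attacked = adversarial_corrupt(clean, 0.1, 20.0)

for name, sample in [("clean", clean), ("mixture", mixed), ("adversarial", attacked)]:
    print(name, round(statistics.fmean(sample), 2), round(statistics.median(sample), 2))
```

In both corrupted samples the mean moves a long way from zero while the median moves only a little, which is the behaviour the rest of this page is concerned with.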

M-estimation with robust loss

The one that I, at least, would think of when considering robust estimation.

In M-estimation, instead of hunting a maximum of the likelihood function as you do in maximum likelihood, or a minimum of the sum of squared residuals as you do in least-squares estimation, you minimise a specifically chosen loss function of those residuals. You may select an objective function that is more robust to deviations between your model and reality. Credited to Huber (1964).
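In symbols, with notation that is assumed here rather than taken from the sources: write \(r_i(\theta)\) for the \(i\)-th residual under parameter \(\theta\), and \(\rho\) for the chosen loss. Least squares and maximum likelihood then appear as particular choices of \(\rho\).

```latex
\[
\hat{\theta} = \operatorname*{arg\,min}_{\theta} \sum_{i=1}^{n} \rho\bigl(r_i(\theta)\bigr),
\qquad
\rho(r) =
\begin{cases}
r^{2} & \text{(least squares)},\\
-\log f(r) & \text{(maximum likelihood with noise density } f\text{)}.
\end{cases}
\]
```

A robust choice of \(\rho\) is typically one that grows more slowly than \(r^{2}\) for large \(|r|\), so that large residuals contribute less to the objective.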

See M-estimation for some details.

AFAICT, the definition of M-estimation includes the possibility that you could in principle select a less robust loss function than least sum-of-squares, but I have not seen this in the literature. Generally, some robustified approach is presumed, which penalises outliers less severely than least-squares.

For M-estimation as robust estimation, various complications ensue, such as the difference between noise in your predictors and noise in your responses, whether the “true” model is included in your class, and which of these difficulties you have resolved or not.

Loosely speaking, no, you haven’t solved problems of noise in your predictors, only the problem of noise in your responses.

And the cost is that you now have a loss function with some extra arbitrary parameters which you have to justify, which is anathema to frequentists, who like to claim to be less arbitrary than Bayesians.
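The Huber loss is the standard example of such a parameter: it is quadratic for residuals smaller than a threshold δ and linear beyond it, and δ has to come from somewhere. Below is a minimal sketch for a location parameter, solved by iteratively reweighted averaging. The default δ = 1.345 is the conventional value giving roughly 95% efficiency under Gaussian noise. The scale fixed in advance at the normalised median absolute deviation, the function names and the data are all assumptions of this sketch.

```python
import statistics

def huber_loss(r, delta=1.345):
    """Huber loss: quadratic for |r| <= delta, linear beyond."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)

def huber_location(xs, delta=1.345, tol=1e-9, max_iter=100):
    """Huber M-estimate of location by iteratively reweighted averaging.

    The scale is held fixed at the normalised MAD (an assumption of this
    sketch; joint location/scale estimation is also possible).
    """
    med = statistics.median(xs)
    mad = statistics.median(abs(x - med) for x in xs)
    scale = 1.4826 * mad or 1.0  # fall back to 1.0 if the MAD is zero
    mu = med
    for _ in range(max_iter):
        weights = []
        for x in xs:
            r = abs(x - mu) / scale
            # Weight 1 in the quadratic zone, delta/|r| in the linear zone.
            weights.append(1.0 if r <= delta else delta / r)
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

# Hypothetical sample with one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 100.0]
estimate = huber_location(data)
print(estimate)
```

Changing `delta` interpolates between median-like behaviour (small δ) and mean-like behaviour (large δ), which is exactly the arbitrariness complained about above.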

Huber loss

🏗 Don’t know

Median-based estimators

Rousseeuw and Yohai’s idea (P. Rousseeuw and Yohai 1984)

Many permutations on the theme here, but it rapidly gets complex. The only ones of these families I have looked into are the near-trivial cases of the Least Median of Squares and Least Trimmed Squares estimators (P. J. Rousseeuw 1984). More broadly we should also consider S-estimators, which do something with… robust estimation of scale and using this to do robust estimation of location? 🏗
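As a hedged illustration, here are both estimators in the simplest possible setting: a single location parameter rather than a regression. In that special case the Least Median of Squares estimate is the midpoint of the shortest interval covering about half the sorted data, and the Least Trimmed Squares estimate is the mean of the contiguous block of that size with the smallest sum of squares. The coverage `h = n // 2 + 1` is one common convention, assumed here, and the function names and data are made up.

```python
def lms_location(xs):
    """Least Median of Squares for location: midpoint of the shortest
    interval containing h = n // 2 + 1 of the sorted points."""
    s = sorted(xs)
    n = len(s)
    h = n // 2 + 1
    start = min(range(n - h + 1), key=lambda i: s[i + h - 1] - s[i])
    return 0.5 * (s[start] + s[start + h - 1])

def lts_location(xs):
    """Least Trimmed Squares for location: mean of the contiguous block of
    h sorted points with the smallest sum of squared deviations."""
    s = sorted(xs)
    n = len(s)
    h = n // 2 + 1

    def block_sse(i):
        block = s[i:i + h]
        m = sum(block) / h
        return sum((x - m) ** 2 for x in block)

    start = min(range(n - h + 1), key=block_sse)
    return sum(s[start:start + h]) / h

# Hypothetical sample with one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 100.0]
lms = lms_location(data)
lts = lts_location(data)
print(lms, lts)
```

Both estimates ignore the outlier entirely, because it never makes it into the best half of the data. The regression versions replace "interval" with "subset of residuals" and are much harder to compute exactly.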

Theil-Sen-(Oja) estimators: Something about medians of inferred regression slopes. 🏗
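For the simple one-predictor case, a sketch of the Theil-Sen idea: take the slope between every pair of points, and use the median of those slopes. The intercept rule used here (median of `y - slope * x`) is one common choice and an assumption of this sketch. The data are invented.

```python
import statistics
from itertools import combinations

def theil_sen(xs, ys):
    """Theil-Sen line fit: median of all pairwise slopes, then the median
    of the implied intercepts."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1  # skip pairs with an undefined slope
    ]
    slope = statistics.median(slopes)
    intercept = statistics.median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept

# Hypothetical data on the line y = 2x + 1, with the last response corrupted.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[-1] = 100

slope, intercept = theil_sen(xs, ys)
print(slope, intercept)
```

The corrupted point only affects the pairs it takes part in, which are a minority of all pairs, so the median slope is unaffected. The cost is quadratic in the number of points when done naively like this.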

Tukey median, and why no-one uses it what with it being NP-Hard.


RANSAC: some kind of randomised outlier detection estimator. 🏗
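A rough sketch of the RANSAC idea for fitting a line: repeatedly fit a candidate model to a minimal random subset (two points), count how many points lie within a tolerance of it, keep the largest such consensus set, and finally refit by ordinary least squares on that set alone. The iteration count, threshold, seed, function name and data below are all arbitrary assumptions for illustration.

```python
import random

def ransac_line(points, n_iter=200, threshold=0.5, seed=0):
    """RANSAC line fit: returns (slope, intercept, consensus_set)."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate line, skip it
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("no usable consensus set found")
    # Refit by ordinary least squares on the consensus set only.
    n = len(best)
    mean_x = sum(x for x, _ in best) / n
    mean_y = sum(y for _, y in best) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in best)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in best)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, best

# Hypothetical data: 20 points on y = 2x + 1 plus 5 gross outliers.
points = [(x, 2 * x + 1) for x in range(20)]
points += [(3, 40), (7, -30), (12, 60), (15, -10), (18, 90)]

slope, intercept, inliers = ransac_line(points)
print(slope, intercept, len(inliers))
```

Unlike the loss-based approaches above, this makes a hard inlier/outlier decision, and the threshold that defines an inlier is yet another parameter to justify.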



References

Barndorff-Nielsen, O. 1983. “On a Formula for the Distribution of the Maximum Likelihood Estimator.” Biometrika 70 (2): 343–65.
Beran, Rudolf. 1981. “Efficient Robust Estimates in Parametric Models.” Zeitschrift Für Wahrscheinlichkeitstheorie Und Verwandte Gebiete 55 (1): 91–108.
Beran, Rudolf. 1982. “Robust Estimation in Models for Independent Non-Identically Distributed Data.” The Annals of Statistics 10 (2): 415–28.
Bickel, P. J. 1975. “One-Step Huber Estimates in the Linear Model.” Journal of the American Statistical Association 70 (350): 428–34.
Bondell, Howard D., Arun Krishna, and Sujit K. Ghosh. 2010. “Joint Variable Selection for Fixed and Random Effects in Linear Mixed-Effects Models.” Biometrics 66 (4): 1069–77.
Bühlmann, Peter. 2014. “Robust Statistics.” In Selected Works of Peter J. Bickel, edited by Jianqing Fan, Ya’acov Ritov, and C. F. Jeff Wu, 51–98. Selected Works in Probability and Statistics 13. Springer New York.
Burman, P., and D. Nolan. 1995. “A General Akaike-Type Criterion for Model Selection in Robust Regression.” Biometrika 82 (4): 877–86.
Cantoni, Eva, and Elvezio Ronchetti. 2001. “Robust Inference for Generalized Linear Models.” Journal of the American Statistical Association 96 (455): 1022–30.
Charikar, Moses, Jacob Steinhardt, and Gregory Valiant. 2016. “Learning from Untrusted Data.” arXiv:1611.02315 [Cs, Math, Stat], November.
Cox, D. R. 1983. “Some Remarks on Overdispersion.” Biometrika 70 (1): 269–74.
Czellar, Veronika, and Elvezio Ronchetti. 2010. “Accurate and Robust Tests for Indirect Inference.” Biometrika 97 (3): 621–30.
Davison, Andrew J., and Joseph Ortiz. 2019. “FutureMapping 2: Gaussian Belief Propagation for Spatial AI.” arXiv:1910.14139 [Cs], October.
Diakonikolas, Ilias, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2017. “Being Robust (in High Dimensions) Can Be Practical.” arXiv:1703.00893 [Cs, Math, Stat], March.
Diakonikolas, Ilias, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2016. “Robust Estimators in High Dimensions Without the Computational Intractability.” arXiv:1604.06443 [Cs, Math, Stat], April.
Donoho, David L., and Peter J. Huber. 1983. “The Notion of Breakdown Point.” A Festschrift for Erich L. Lehmann, 157–84.
Donoho, David L., and Richard C. Liu. 1988. “The ‘Automatic’ Robustness of Minimum Distance Functionals.” The Annals of Statistics 16 (2): 552–86.
Donoho, David L., and Andrea Montanari. 2013. “High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing.” arXiv:1310.7320 [Cs, Math, Stat], October.
Duchi, John, Peter Glynn, and Hongseok Namkoong. 2016. “Statistics of Robust Optimization: A Generalized Empirical Likelihood Approach.” arXiv:1610.03425 [Stat], October.
Genton, Marc G., and Elvezio Ronchetti. 2003. “Robust Indirect Inference.” Journal of the American Statistical Association 98 (461): 67–76.
Ghosh, Abhik, and Ayanendranath Basu. 2016. “General Model Adequacy Tests and Robust Statistical Inference Based on A New Family of Divergences.” arXiv:1611.05224 [Math, Stat], November.
Golubev, Grigori K., and Michael Nussbaum. 1990. “A Risk Bound in Sobolev Class Regression.” The Annals of Statistics 18 (2): 758–78.
Hampel, Frank R. 1974. “The Influence Curve and Its Role in Robust Estimation.” Journal of the American Statistical Association 69 (346): 383–93.
Hampel, Frank R., Elvezio M. Ronchetti, Peter J. Rousseeuw, and Werner A. Stahel. 2011. Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons.
Holland, Paul W., and Roy E. Welsch. 1977. “Robust Regression Using Iteratively Reweighted Least-Squares.” Communications in Statistics - Theory and Methods 6 (9): 813–27.
Huang, Shih-Ting, and Johannes Lederer. 2022. “DeepMoM: Robust Deep Learning With Median-of-Means.” Journal of Computational and Graphical Statistics 0 (0): 1–15.
Huber, Peter J. 1964. “Robust Estimation of a Location Parameter.” The Annals of Mathematical Statistics 35 (1): 73–101.
Huber, Peter J. 2009. Robust Statistics. 2nd ed. Wiley Series in Probability and Statistics. Hoboken, N.J: Wiley.
Janková, Jana, and Sara van de Geer. 2016. “Confidence Regions for High-Dimensional Generalized Linear Models Under Sparsity.” arXiv:1610.01353 [Math, Stat], October.
Konishi, Sadanori, and G. Kitagawa. 2008. Information Criteria and Statistical Modeling. Springer Series in Statistics. New York: Springer.
Konishi, Sadanori, and Genshiro Kitagawa. 1996. “Generalised Information Criteria in Model Selection.” Biometrika 83 (4): 875–90.
Konishi, Sadanori, and Genshiro Kitagawa. 2003. “Asymptotic Theory for Information Criteria in Model Selection – Functional Approach.” Journal of Statistical Planning and Inference, C.R. Rao 80th Birthday Felicitation Volume, Part IV, 114 (1–2): 45–61.
Krzakala, Florent, Cristopher Moore, Elchanan Mossel, Joe Neeman, Allan Sly, Lenka Zdeborová, and Pan Zhang. 2013. “Spectral Redemption in Clustering Sparse Networks.” Proceedings of the National Academy of Sciences 110 (52): 20935–40.
Li, Jerry. 2017. “Robust Sparse Estimation Tasks in High Dimensions.” arXiv:1702.05860 [Cs], February.
Lu, W., Y. Goldberg, and J. P. Fine. 2012. “On the Robustness of the Adaptive Lasso to Model Misspecification.” Biometrika 99 (3): 717–31.
Machado, José A. F. 1993. “Robust Model Selection and M-Estimation.” Econometric Theory 9 (3): 478–93.
Manton, J. H., V. Krishnamurthy, and H. V. Poor. 1998. “James-Stein State Filtering Algorithms.” IEEE Transactions on Signal Processing 46 (9): 2431–47.
Markatou, Marianthi, Dimitrios Karlis, and Yuxin Ding. 2021. “Distance-Based Statistical Inference.” Annual Review of Statistics and Its Application 8 (1): 301–27.
Markatou, M., and E. Ronchetti. 1997. “3 Robust Inference: The Approach Based on Influence Functions.” In Handbook of Statistics, 15:49–75. Robust Inference. Elsevier.
Maronna, Ricardo A., Douglas Martin, and Víctor J. Yohai. 2006. Robust Statistics: Theory and Methods. Reprinted with corr. Wiley Series in Probability and Statistics. Chichester: Wiley.
Maronna, Ricardo Antonio. 1976. “Robust M-Estimators of Multivariate Location and Scatter.” The Annals of Statistics 4 (1): 51–67.
Maronna, Ricardo A., and Víctor J. Yohai. 1995. “The Behavior of the Stahel-Donoho Robust Multivariate Estimator.” Journal of the American Statistical Association 90 (429): 330–41.
Maronna, Ricardo A., and Víctor J. Yohai. 2014. “Robust Estimation of Multivariate Location and Scatter.” In Wiley StatsRef: Statistics Reference Online. John Wiley & Sons, Ltd.
Maronna, Ricardo A., and Ruben H. Zamar. 2002. “Robust Estimates of Location and Dispersion for High-Dimensional Datasets.” Technometrics 44 (4): 307–17.
Massart, Desire L., Leonard Kaufman, Peter J. Rousseeuw, and Annick Leroy. 1986. “Least Median of Squares: A Robust Method for Outlier and Model Error Detection in Regression and Calibration.” Analytica Chimica Acta 187 (January): 171–79.
Mossel, Elchanan, Joe Neeman, and Allan Sly. 2013. “A Proof Of The Block Model Threshold Conjecture.” arXiv:1311.4115 [Cs, Math], November.
Mossel, Elchanan, Joe Neeman, and Allan Sly. 2016. “Belief Propagation, Robust Reconstruction and Optimal Recovery of Block Models.” The Annals of Applied Probability 26 (4): 2211–56.
Oja, Hannu. 1983. “Descriptive Statistics for Multivariate Distributions.” Statistics & Probability Letters 1 (6): 327–32.
Ortiz, Joseph, Talfan Evans, and Andrew J. Davison. 2021. “A Visual Introduction to Gaussian Belief Propagation.” arXiv:2107.02308 [Cs], July.
Qian, Guoqi. 1996. “On Model Selection in Robust Linear Regression.”
Qian, Guoqi, and R. K. Hans. 1996. “Some Notes on Rissanen’s Stochastic Complexity.”
Qian, Guoqi, and Hans R. Künsch. 1998. “On Model Selection via Stochastic Complexity in Robust Linear Regression.” Journal of Statistical Planning and Inference 75 (1): 91–116.
Ronchetti, E. 2000. “Robust Regression Methods and Model Selection.” In Data Segmentation and Model Selection for Computer Vision, edited by Alireza Bab-Hadiashar and David Suter, 31–40. Springer New York.
Ronchetti, Elvezio. 1985. “Robust Model Selection in Regression.” Statistics & Probability Letters 3 (1): 21–23.
Ronchetti, Elvezio. 1997. “Robust Inference by Influence Functions.” Journal of Statistical Planning and Inference, Robust Statistics and Data Analysis, Part I, 57 (1): 59–72.
Ronchetti, Elvezio, and Fabio Trojani. 2001. “Robust Inference with GMM Estimators.” Journal of Econometrics 101 (1): 37–69.
Rousseeuw, Peter J. 1984. “Least Median of Squares Regression.” Journal of the American Statistical Association 79 (388): 871–80.
Rousseeuw, Peter J., and Annick M. Leroy. 1987. Robust Regression and Outlier Detection. Wiley Series in Probability and Mathematical Statistics. New York: Wiley.
Rousseeuw, P., and V. Yohai. 1984. “Robust Regression by Means of S-Estimators.” In Robust and Nonlinear Time Series Analysis, edited by Jürgen Franke, Wolfgang Härdle, and Douglas Martin, 256–72. Lecture Notes in Statistics 26. Springer US.
Royall, Richard M. 1986. “Model Robust Confidence Intervals Using Maximum Likelihood Estimators.” International Statistical Review / Revue Internationale de Statistique 54 (2): 221–26.
Stigler, Stephen M. 2010. “The Changing History of Robustness.” The American Statistician 64 (4): 277–81.
Street, James O., Raymond J. Carroll, and David Ruppert. 1988. “A Note on Computing Robust Regression Estimates via Iteratively Reweighted Least Squares.” The American Statistician 42 (2): 152–54.
Tharmaratnam, Kukatharmini, and Gerda Claeskens. 2013. “A Comparison of Robust Versions of the AIC Based on M-, S- and MM-Estimators.” Statistics 47 (1): 216–35.
Theil, Henri. 1992. “A Rank-Invariant Method of Linear and Polynomial Regression Analysis.” In Henri Theil’s Contributions to Economics and Econometrics, edited by Baldev Raj and Johan Koerts, 345–81. Advanced Studies in Theoretical and Applied Econometrics 23. Springer Netherlands.
Tsou, Tsung-Shan. 2006. “Robust Poisson Regression.” Journal of Statistical Planning and Inference 136 (9): 3173–86.
Wedderburn, R. W. M. 1974. “Quasi-Likelihood Functions, Generalized Linear Models, and the Gauss-Newton Method.” Biometrika 61 (3): 439–47.
Xu, H., C. Caramanis, and S. Mannor. 2010. “Robust Regression and Lasso.” IEEE Transactions on Information Theory 56 (7): 3561–74.
Yang, Tao, Colin M. Gallagher, and Christopher S. McMahan. 2019. “A Robust Regression Methodology via M-Estimation.” Communications in Statistics - Theory and Methods 48 (5): 1092–1107.
Yang, Wenzhuo, and Huan Xu. 2013. “A Unified Robust Regression Model for Lasso-Like Algorithms.” In ICML (3), 585–93.
