# (Outlier) robust statistics

The term also appears in econometrics, where a robust estimator is one that behaves well under heteroskedastic and/or correlated errors. Robust Bayes means something about inference that is robust to the choice of prior (which can overlap with outlier robustness but has a rather different emphasis).

Outlier robustness is, AFAICT, more-or-less a frequentist project. Bayesian approaches seem to achieve robustness largely by choosing heavy-tailed priors or heavy-tailed noise distributions where they might otherwise have chosen light-tailed ones, e.g. Laplace distributions instead of Gaussian ones. Such heavy-tailed distributions have their own arbitrary parameters, but these are no more arbitrary than usual in Bayesian statistics, and so they do not attract the same need to wash away the guilt that frequentists seem to feel.

One can of course use heavy-tailed noise distributions in frequentist inference as well, and that buys a kind of robustness. It seems to be unpopular, perhaps because it makes frequentist inference as computationally awkward as Bayesian inference.
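
As a toy illustration (my construction, not drawn from any particular reference): maximum likelihood for a location parameter under a Student-t noise model, with degrees of freedom and scale simply fixed by fiat, shrugs off gross outliers that drag the Gaussian MLE (the sample mean) far off target.

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)
# 95 "clean" points plus 5 gross outliers.
x = np.concatenate([rng.normal(0.0, 1.0, 95), rng.normal(20.0, 1.0, 5)])

# Gaussian MLE for location is the sample mean, which the outliers drag upward.
mu_gauss = x.mean()

# Student-t negative log-likelihood in the location parameter
# (degrees of freedom and scale held fixed, arbitrarily, for illustration).
def t_nll(mu):
    return -stats.t.logpdf(x - mu, df=3.0).sum()

mu_t = optimize.minimize_scalar(t_nll, bounds=(-10.0, 10.0), method="bounded").x
```

The heavy-tailed likelihood has a redescending influence function in the residual, so the five outliers contribute almost nothing to the score equation.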

## Corruption models

- Random (mixture) corruption
- (Adversarial) total-variation $\epsilon$-corruption
- Wasserstein corruption models, as seen in “distributionally robust” models (does one usually assume adversarial or random corruption here?)
- Others?
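
A minimal sketch of the first of these, Huber's $\epsilon$-contamination (mixture) model, with the contaminating distribution chosen arbitrarily for illustration. It also shows why robust scale estimates matter: the sample standard deviation is wrecked by the contamination while the MAD-based scale barely notices.

```python
import numpy as np

rng = np.random.default_rng(42)
n, eps = 10_000, 0.05

# eps-contamination: with probability 1 - eps draw from the nominal N(0, 1);
# with probability eps draw from a gross-error distribution
# (here N(0, 100), standing in for "anything").
bad = rng.random(n) < eps
x = np.where(bad, rng.normal(0.0, 100.0, n), rng.normal(0.0, 1.0, n))

# Non-robust scale estimate: the sample standard deviation.
sd = x.std()
# Robust scale estimate: median absolute deviation, rescaled so that it is
# consistent for the standard deviation at the Gaussian.
mad_scale = np.median(np.abs(x - np.median(x))) / 0.6745
```

With 5% contamination the sample standard deviation lands around 22 rather than 1; the MAD-based scale stays near 1.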

## M-estimation with robust loss

The one that I, at least, would think of when considering robust estimation.

In M-estimation, instead of hunting for a maximum of the likelihood function, as you do in maximum likelihood, or a minimum of the sum of squared residuals, as you do in least-squares estimation, you minimise a specifically chosen loss function of those residuals. You may select an objective function that is more robust to deviations between your model and reality. Credited to Huber (1964).
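
A minimal sketch for the location case (the tuning constant 1.345 is the conventional choice for 95% Gaussian efficiency, but is still a choice): Huber's loss is quadratic for small residuals and linear for large ones, so gross outliers contribute a bounded gradient rather than a squared one, unlike the least-squares estimate, which here is just the sample mean.

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(1)
# 190 clean points and 10 gross outliers at 50.
x = np.concatenate([rng.normal(0.0, 1.0, 190), rng.normal(50.0, 1.0, 10)])

# Huber loss: 0.5 r^2 for |r| <= delta, linear growth beyond delta.
def huber_objective(mu, delta=1.345):
    r = np.abs(x - mu)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta)).sum()

mu_huber = optimize.minimize_scalar(
    huber_objective, bounds=(-10.0, 10.0), method="bounded").x

# The least-squares estimate of location is the sample mean.
mu_ls = x.mean()
```

Each outlier pulls the Huber estimate by at most $\delta$ per observation, while it pulls the mean by its full (huge) residual.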

See M-estimation for some details.

AFAICT, the definition of M-estimation admits the possibility that you could in principle select a loss function *less* robust than sum-of-squares or negative log-likelihood, but I have not seen this in the literature. Generally, some robustified loss is presumed.

For M-estimation as robust estimation, various complications ensue, such as the difference between noise in your predictors and noise in your responses, whether the “true” model is included in your model class, and which of these difficulties a given method resolves.

Loosely speaking, no: a robust loss addresses noise in your responses, not noise in your predictors.

And the cost is that you now have a loss function with extra arbitrary parameters which you have to justify, which is anathema to frequentists, who like to claim to be less arbitrary than Bayesians: you must now defend your choice of loss function and its particular parameterisation. There are, however, various procedures for choosing these parameters.

🏗 Don’t know

## Median-based estimators

Rousseeuw and Yohai’s idea.

Many permutations on the theme here, but it rapidly gets complex. The only families I have looked into are the near-trivial cases of least median of squares and least trimmed squares estimation. More broadly we should also consider S-estimators, which do something with… robust estimation of scale, then use that to do robust estimation of location? 🏗
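
A sketch of least trimmed squares in the near-trivial location case. It uses the fact (Rousseeuw and Leroy) that for a location parameter the optimal $h$-subset is contiguous in the ordered sample, so an exhaustive scan over windows of order statistics is exact; the choice $h = 55$ out of 100 is arbitrary here.

```python
import numpy as np

rng = np.random.default_rng(3)
# 90 clean points and 10 outliers at 15, sorted for the window scan.
x = np.sort(np.concatenate([rng.normal(0.0, 1.0, 90), rng.normal(15.0, 1.0, 10)]))
h = 55  # number of points kept: just over half, for high breakdown

# LTS minimises the sum of the h smallest squared residuals. For location,
# scan every contiguous window of h order statistics and keep the one with
# the smallest within-window sum of squares; its mean is the LTS estimate.
best_ss, mu_lts = np.inf, np.nan
for i in range(len(x) - h + 1):
    w = x[i:i + h]
    ss = ((w - w.mean()) ** 2).sum()
    if ss < best_ss:
        best_ss, mu_lts = ss, w.mean()
```

The 10% of points sitting at 15 never make it into the winning window, so the estimate stays near 0 while the sample mean sits near 1.5.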

Theil–Sen (and Oja) estimators: estimate the regression slope as a median of pairwise slopes. 🏗
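
Concretely, the Theil–Sen slope is the median of the slopes over all pairs of data points. scipy ships an implementation; the data here is synthetic, with an arbitrary 10% corruption rate:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 100)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, 100)
y[:10] += 30.0  # corrupt 10% of the responses

# Theil-Sen: slope = median over all pairwise slopes (y_j - y_i)/(x_j - x_i).
ts_slope, ts_intercept, _, _ = stats.theilslopes(y, x)

# Ordinary least squares, for comparison, tilts badly.
ols_slope, _ = np.polyfit(x, y, 1)
```

Since only a minority of pairs involve a corrupted point, the median pairwise slope stays near the true value of 2, while least squares is pulled far off.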

Tukey (halfspace) median, and why no-one uses it, what with exact computation being NP-hard.

## Others

RANSAC (random sample consensus): repeatedly fit the model to small random subsets of the data and keep the fit that the largest number of points agrees with. 🏗
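
A hand-rolled sketch of the idea for a line fit (the inlier threshold and iteration count here are arbitrary choices, not tuned recommendations):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0.0, 10.0, 100)
y = 3.0 * x - 2.0 + rng.normal(0.0, 0.2, 100)
y[:25] = rng.uniform(-20.0, 40.0, 25)  # 25% gross outliers

# RANSAC: fit the model to a minimal random sample (2 points for a line),
# count how many points agree with it, and keep the largest consensus set.
best_inliers = np.zeros(100, dtype=bool)
for _ in range(200):
    i, j = rng.choice(100, size=2, replace=False)
    if x[i] == x[j]:
        continue
    a = (y[j] - y[i]) / (x[j] - x[i])
    b = y[i] - a * x[i]
    inliers = np.abs(y - (a * x + b)) < 1.0
    if inliers.sum() > best_inliers.sum():
        best_inliers = inliers

# Final refit by ordinary least squares on the consensus set only.
slope, intercept = np.polyfit(x[best_inliers], y[best_inliers], 1)
```

Any minimal sample made entirely of inliers recovers roughly the true line and hence the full consensus set, so with enough iterations the outliers are simply excluded from the final fit.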

## References

Barndorff-Nielsen, O. 1983. Biometrika 70 (2): 343–65.
Beran, Rudolf. 1981. Zeitschrift Für Wahrscheinlichkeitstheorie Und Verwandte Gebiete 55 (1): 91–108.
———. 1982. The Annals of Statistics 10 (2): 415–28.
Bickel, P. J. 1975. Journal of the American Statistical Association 70 (350): 428–34.
Bondell, Howard D., Arun Krishna, and Sujit K. Ghosh. 2010. Biometrics 66 (4): 1069–77.
Bühlmann, Peter. 2014. In Selected Works of Peter J. Bickel, edited by Jianqing Fan, Ya’acov Ritov, and C. F. Jeff Wu, 51–98. Selected Works in Probability and Statistics 13. Springer New York.
Burman, P., and D. Nolan. 1995. Biometrika 82 (4): 877–86.
Cantoni, Eva, and Elvezio Ronchetti. 2001. Journal of the American Statistical Association 96 (455): 1022–30.
Charikar, Moses, Jacob Steinhardt, and Gregory Valiant. 2016. arXiv:1611.02315 [Cs, Math, Stat], November.
Cox, D. R. 1983. Biometrika 70 (1): 269–74.
Czellar, Veronika, and Elvezio Ronchetti. 2010. Biometrika 97 (3): 621–30.
Diakonikolas, Ilias, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2017. arXiv:1703.00893 [Cs, Math, Stat], March.
Diakonikolas, Ilias, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, and Alistair Stewart. 2016. arXiv:1604.06443 [Cs, Math, Stat], April.
Donoho, David L., and Peter J. Huber. 1983. In A Festschrift for Erich L. Lehmann, 157–84.
Donoho, David L., and Richard C. Liu. 1988. The Annals of Statistics 16 (2): 552–86.
Donoho, David L., and Andrea Montanari. 2013. arXiv:1310.7320 [Cs, Math, Stat], October.
Duchi, John, Peter Glynn, and Hongseok Namkoong. 2016. arXiv:1610.03425 [Stat], October.
Genton, Marc G, and Elvezio Ronchetti. 2003. Journal of the American Statistical Association 98 (461): 67–76.
Ghosh, Abhik, and Ayanendranath Basu. 2016. arXiv:1611.05224 [Math, Stat], November.
Golubev, Grigori K., and Michael Nussbaum. 1990. The Annals of Statistics 18 (2): 758–78.
Hampel, Frank R. 1974. Journal of the American Statistical Association 69 (346): 383–93.
Hampel, Frank R., Elvezio M. Ronchetti, Peter J. Rousseeuw, and Werner A. Stahel. 2011. Robust Statistics: The Approach Based on Influence Functions. John Wiley & Sons.
Holland, Paul W., and Roy E. Welsch. 1977. Communications in Statistics - Theory and Methods 6 (9): 813–27.
Huber, Peter J. 1964. The Annals of Mathematical Statistics 35 (1): 73–101.
———. 2009. Robust Statistics. 2nd ed. Wiley Series in Probability and Statistics. Hoboken, N.J: Wiley.
Janková, Jana, and Sara van de Geer. 2016. arXiv:1610.01353 [Math, Stat], October.
Konishi, Sadanori, and G. Kitagawa. 2008. Information Criteria and Statistical Modeling. Springer Series in Statistics. New York: Springer.
Konishi, Sadanori, and Genshiro Kitagawa. 1996. Biometrika 83 (4): 875–90.
———. 2003. Journal of Statistical Planning and Inference, C.R. Rao 80th Birthday Felicitation Volume, Part IV, 114 (1–2): 45–61.
Krzakala, Florent, Cristopher Moore, Elchanan Mossel, Joe Neeman, Allan Sly, Lenka Zdeborová, and Pan Zhang. 2013. Proceedings of the National Academy of Sciences 110 (52): 20935–40.
Li, Jerry. 2017. arXiv:1702.05860 [Cs], February.
Lu, W., Y. Goldberg, and J. P. Fine. 2012. Biometrika 99 (3): 717–31.
Machado, José A.F. 1993. Econometric Theory 9 (03): 478–93.
Manton, J. H., V. Krishnamurthy, and H. V. Poor. 1998. IEEE Transactions on Signal Processing 46 (9): 2431–47.
Markatou, Marianthi, Dimitrios Karlis, and Yuxin Ding. 2021. Annual Review of Statistics and Its Application 8 (1): 301–27.
Markatou, M., and E. Ronchetti. 1997. In Handbook of Statistics, 15:49–75. Robust Inference. Elsevier.
Maronna, Ricardo A., Douglas Martin, and Víctor J. Yohai. 2006. Robust Statistics: Theory and Methods. Reprinted with corr. Wiley Series in Probability and Statistics. Chichester: Wiley.
Maronna, Ricardo Antonio. 1976. The Annals of Statistics 4 (1): 51–67.
Maronna, Ricardo A., and Víctor J. Yohai. 1995. Journal of the American Statistical Association 90 (429): 330–41.
———. 2014. In Wiley StatsRef: Statistics Reference Online. John Wiley & Sons, Ltd.
Maronna, Ricardo A., and Ruben H. Zamar. 2002. Technometrics 44 (4): 307–17.
Massart, Desire L., Leonard Kaufman, Peter J. Rousseeuw, and Annick Leroy. 1986. Analytica Chimica Acta 187 (January): 171–79.
Mossel, Elchanan, Joe Neeman, and Allan Sly. 2013. arXiv:1311.4115 [Cs, Math], November.
———. 2016. The Annals of Applied Probability 26 (4): 2211–56.
Oja, Hannu. 1983. Statistics & Probability Letters 1 (6): 327–32.
Qian, Guoqi. 1996.
Qian, Guoqi, and R. K. Hans. 1996.
Qian, Guoqi, and Hans R. Künsch. 1998. Journal of Statistical Planning and Inference 75 (1): 91–116.
Ronchetti, E. 2000. In Data Segmentation and Model Selection for Computer Vision, edited by Alireza Bab-Hadiashar and David Suter, 31–40. Springer New York.
Ronchetti, Elvezio. 1985. Statistics & Probability Letters 3 (1): 21–23.
———. 1997. Journal of Statistical Planning and Inference, Robust Statistics and Data Analysis, Part I, 57 (1): 59–72.
Ronchetti, Elvezio, and Fabio Trojani. 2001. Journal of Econometrics 101 (1): 37–69.
Rousseeuw, Peter J. 1984. Journal of the American Statistical Association 79 (388): 871–80.
Rousseeuw, Peter J., and Annick M. Leroy. 1987. Robust Regression and Outlier Detection. Wiley Series in Probability and Mathematical Statistics. New York: Wiley.
Rousseeuw, P., and V. Yohai. 1984. In Robust and Nonlinear Time Series Analysis, edited by Jürgen Franke, Wolfgang Härdle, and Douglas Martin, 256–72. Lecture Notes in Statistics 26. Springer US.
Royall, Richard M. 1986. International Statistical Review / Revue Internationale de Statistique 54 (2): 221–26.
Stigler, Stephen M. 2010. The American Statistician 64 (4): 277–81.
Street, James O., Raymond J. Carroll, and David Ruppert. 1988. The American Statistician 42 (2): 152–54.
Tharmaratnam, Kukatharmini, and Gerda Claeskens. 2013. Statistics 47 (1): 216–35.
Theil, Henri. 1992. In Henri Theil’s Contributions to Economics and Econometrics, edited by Baldev Raj and Johan Koerts, 345–81. Advanced Studies in Theoretical and Applied Econometrics 23. Springer Netherlands.
Tsou, Tsung-Shan. 2006. Journal of Statistical Planning and Inference 136 (9): 3173–86.
Wedderburn, R. W. M. 1974. Biometrika 61 (3): 439–47.
Xu, H., C. Caramanis, and S. Mannor. 2010. IEEE Transactions on Information Theory 56 (7): 3561–74.
Yang, Tao, Colin M. Gallagher, and Christopher S. McMahan. 2019. Communications in Statistics - Theory and Methods 48 (5): 1092–1107.
Yang, Wenzhuo, and Huan Xu. 2013. In ICML (3), 585–93.
