Terminology note:
I mean robust statistics in the sense of Huber, which is, informally, *outlier* robustness.

There are also *robust* estimators in econometrics, where the term means something about good behaviour under heteroskedastic and/or correlated errors.
Robust *Bayes* means something about inference that is robust to the choice of prior (which could overlap, but has a rather different emphasis).

Outlier robustness is AFAICT more-or-less a frequentist project.
Bayesian approaches seem to achieve robustness
largely by choosing heavy-tailed priors or heavy-tailed noise distributions where they might have chosen light-tailed ones,
e.g. Laplace distributions instead of Gaussian ones.
Such heavy-tailed distributions may have arbitrary parameters,
but they are not *more arbitrary than usual*
in Bayesian statistics, and therefore do not attract
as much need to wash away the guilt as frequentists seem to feel.

One can of course use heavy-tailed noise distributions in frequentist inference as well, and that buys a kind of robustness. That seems to be unpopular, presumably because it makes frequentist inference as difficult as Bayesian inference.

## Corruption models

- Random (mixture) corruption
- (Adversarial) total variation \(\epsilon\)-corruption
- Wasserstein corruption models, as seen in "distributionally robust" models (does one usually assume adversarial or random corruption here?)
- others?
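As a concrete illustration of the first of these, here is a minimal numpy sketch of random mixture corruption (the contamination rate \(\epsilon = 0.1\) and the contaminating distribution are arbitrary choices of mine), showing the sample mean breaking while the median survives:

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 10_000, 0.1  # sample size and contamination fraction (arbitrary)

# Draw from the mixture (1 - eps) * N(0, 1) + eps * N(10, 1):
# each point is independently replaced by a draw from the
# contaminating distribution with probability eps.
corrupted = np.where(
    rng.random(n) < eps,
    rng.normal(10.0, 1.0, n),  # contaminating component G
    rng.normal(0.0, 1.0, n),   # nominal component F
)

# The mean is dragged towards the contamination by roughly eps * 10;
# the median barely moves.
print(np.mean(corrupted), np.median(corrupted))
```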

## M-estimation with robust loss

The one that I, at least, would think of when considering robust estimation.

In M-estimation, instead of hunting a maximum of the likelihood function, as you do in maximum likelihood, or a minimum of the sum of squared residuals, as you do in least-squares estimation, you minimise a specifically chosen loss function of the residuals. You may select an objective function that is more robust to deviations between your model and reality. Credited to Huber (1964).

See M-estimation for some details.

AFAICT, the definition of M-estimation includes the possibility that you
*could* in principle select a *less*-robust loss function than least squares,
but I have not seen this in the literature.
Generally, some robustified approach is presumed, which penalises outliers less severely than least-squares.
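A minimal sketch of the location case, under my reading of the setup: plug a loss \(\rho\) into \(\hat\theta = \arg\min_\theta \sum_i \rho(x_i - \theta)\), and compare the squared loss against a Huber-type loss (the threshold 1.345 is the conventional default; the data are synthetic):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def huber_rho(r, delta=1.345):
    """Huber's loss: quadratic near zero, linear in the tails."""
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def m_estimate(x, rho):
    """Location M-estimate: minimise sum_i rho(x_i - theta) over theta."""
    res = minimize_scalar(lambda theta: np.sum(rho(x - theta)))
    return res.x

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 95), rng.normal(50, 1, 5)])  # 5% outliers

print(m_estimate(x, lambda r: 0.5 * r**2))  # squared loss: recovers the mean
print(m_estimate(x, huber_rho))             # Huber loss: stays near 0
```

With the squared loss the five outliers drag the estimate towards 2.5; the linear tails of the Huber loss bound each observation's influence, so the robust estimate stays near the bulk of the data.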

For M-estimation as robust estimation, various complications ensue, such as the difference between noise in your predictors and noise in your responses, whether the "true" model is included in your class, and which of these difficulties you have or have not resolved.

Loosely speaking, no, you haven't solved problems of noise in your predictors, only the problem of noise in your responses.

And the cost is that you now have a loss function with some extra arbitrary parameters that you have to justify, which is anathema to frequentists, who like to claim to be less arbitrary than Bayesians.

### Huber loss
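The standard definition (Huber 1964), with a threshold \(\delta\) at which the loss switches from quadratic to linear:

\[
\rho_\delta(r) =
\begin{cases}
\frac{1}{2} r^2 & |r| \le \delta \\
\delta \left( |r| - \frac{1}{2}\delta \right) & |r| > \delta.
\end{cases}
\]

Because the loss grows only linearly in the tails, the influence of any single observation on the estimating equation is bounded by \(\delta\). The conventional default \(\delta = 1.345\) gives roughly 95% efficiency at the Gaussian.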

### Tukey loss
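Also called the biweight or bisquare loss. As far as I can tell the usual form, with tuning constant \(c\), is:

\[
\rho_c(r) =
\begin{cases}
\frac{c^2}{6} \left[ 1 - \left( 1 - (r/c)^2 \right)^3 \right] & |r| \le c \\
\frac{c^2}{6} & |r| > c.
\end{cases}
\]

Unlike the Huber loss this one is bounded and *redescending*: observations beyond \(c\) have exactly zero influence. The usual default is \(c = 4.685\) for roughly 95% Gaussian efficiency; the price is a non-convex objective with possibly multiple local minima.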

## MM-estimation

Don't know yet.

## Median-based estimators

Rousseeuw and Yohai's idea (P. Rousseeuw and Yohai 1984).

Many permutations on the theme here, but it rapidly gets complex. The only members of this family I have looked into are the near-trivial cases of Least Median of Squares and Least Trimmed Squares estimation (P. J. Rousseeuw 1984). More broadly, we should also consider S-estimators, which do something with… robust estimation of scale, and using this to do robust estimation of location?
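For the location case, Least Trimmed Squares is simple enough to sketch exactly: minimise the sum of the \(h\) smallest squared residuals. For a scalar location the optimum is attained at the mean of some contiguous window of \(h\) order statistics, so a naive exact search works (the choice \(h \approx n/2\) is mine, aiming at maximal breakdown):

```python
import numpy as np

def lts_location(x, h=None):
    """Least trimmed squares location estimate.

    Minimises the sum of the h smallest squared residuals.  For the
    location problem the optimum is the mean of some contiguous window
    of h order statistics, so an exact search over windows suffices.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    h = h if h is not None else (n // 2) + 1  # ~maximal-breakdown choice
    best_theta, best_ss = None, np.inf
    for i in range(n - h + 1):
        window = x[i:i + h]
        theta = window.mean()
        ss = np.sum((window - theta) ** 2)
        if ss < best_ss:
            best_theta, best_ss = theta, ss
    return best_theta

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 70), rng.normal(100, 1, 30)])  # 30% outliers
print(lts_location(x))  # stays near 0 despite 30% gross contamination
```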

Theil–Sen–(Oja) estimators: something about medians of inferred regression slopes.
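The basic (non-Oja) Theil–Sen estimator, at least, is easy to state: take the median of all pairwise slopes. A naive \(O(n^2)\) sketch (for real use, scipy.stats.theilslopes does this):

```python
import numpy as np

def theil_sen(x, y):
    """Theil-Sen line fit: slope is the median of all pairwise slopes,
    intercept the median of the implied residuals."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    i, j = np.triu_indices(len(x), k=1)   # all pairs with i < j
    keep = x[j] != x[i]                   # skip vertical pairs
    slopes = (y[j] - y[i])[keep] / (x[j] - x[i])[keep]
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)
    return slope, intercept

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, 50)
y[5] += 30.0   # a few gross outliers in the responses
y[20] -= 30.0
y[40] += 30.0
slope, intercept = theil_sen(x, y)
print(slope, intercept)  # close to the true (2, 1) despite the outliers
```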

The Tukey median, and why no one uses it, what with it being NP-hard to compute.

## Others

RANSAC: some kind of randomised outlier-detection estimator.
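My understanding of the basic loop, sketched for a line fit; the iteration count and the inlier threshold are arbitrary choices of mine:

```python
import numpy as np

def ransac_line(x, y, n_iter=200, threshold=0.5, rng=None):
    """Fit y = a*x + b by RANSAC: repeatedly fit a candidate model to a
    random minimal sample (2 points for a line), count points within
    `threshold` of it, then refit by least squares to the largest
    consensus set found."""
    rng = rng if rng is not None else np.random.default_rng()
    best_inliers = np.zeros(len(x), dtype=bool)
    for _ in range(n_iter):
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue
        a = (y[j] - y[i]) / (x[j] - x[i])  # candidate slope from the
        b = y[i] - a * x[i]                # minimal sample
        inliers = np.abs(y - (a * x + b)) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set with ordinary least squares.
    a, b = np.polyfit(x[best_inliers], y[best_inliers], deg=1)
    return a, b, best_inliers

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, 100)
y[::4] = rng.uniform(0, 40, 25)           # 25% gross outliers
a, b, inliers = ransac_line(x, y, rng=rng)
print(a, b)  # recovers roughly (2, 1)
```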

## Incoming

- relation to penalized regression.
- connection with Lasso.
- Beran's Hellinger-ball contamination model, which I also don't yet understand.
- Breakdown point explanation
- Yet Another Math Programming Consultant: Huber regression: different formulations

## References

*Biometrika* 70 (2): 343–65.

*Zeitschrift Für Wahrscheinlichkeitstheorie Und Verwandte Gebiete* 55 (1): 91–108.

*The Annals of Statistics* 10 (2): 415–28.

*Journal of the American Statistical Association* 70 (350): 428–34.

*Biometrics* 66 (4): 1069–77.

*Selected Works of Peter J. Bickel*, edited by Jianqing Fan, Ya'acov Ritov, and C. F. Jeff Wu, 51–98. Selected Works in Probability and Statistics 13. Springer New York.

*Biometrika* 82 (4): 877–86.

*Journal of the American Statistical Association* 96 (455): 1022–30.

*arXiv:1611.02315 [Cs, Math, Stat]*, November.

*Biometrika* 70 (1): 269–74.

*Biometrika* 97 (3): 621–30.

*arXiv:1910.14139 [Cs]*, October.

*arXiv:1703.00893 [Cs, Math, Stat]*, March.

*arXiv:1604.06443 [Cs, Math, Stat]*, April.

*A Festschrift for Erich L. Lehmann*, 157–84.

*The Annals of Statistics* 16 (2): 552–86.

*arXiv:1310.7320 [Cs, Math, Stat]*, October.

*arXiv:1610.03425 [Stat]*, October.

*Journal of the American Statistical Association* 98 (461): 67–76.

*arXiv:1611.05224 [Math, Stat]*, November.

*The Annals of Statistics* 18 (2): 758–78.

*Journal of the American Statistical Association* 69 (346): 383–93.

*Robust Statistics: The Approach Based on Influence Functions*. John Wiley & Sons.

*Communications in Statistics - Theory and Methods* 6 (9): 813–27.

*Journal of Computational and Graphical Statistics* 0 (0): 1–15.

*The Annals of Mathematical Statistics* 35 (1): 73–101.

*Robust Statistics*. 2nd ed. Wiley Series in Probability and Statistics. Hoboken, N.J: Wiley.

*arXiv:1610.01353 [Math, Stat]*, October.

*Information Criteria and Statistical Modeling*. Springer Series in Statistics. New York: Springer.

*Biometrika* 83 (4): 875–90.

*Journal of Statistical Planning and Inference*, C.R. Rao 80th Birthday Felicitation Volume, Part IV, 114 (1–2): 45–61.

*Proceedings of the National Academy of Sciences* 110 (52): 20935–40.

*arXiv:1702.05860 [Cs]*, February.

*Biometrika* 99 (3): 717–31.

*Econometric Theory* 9 (3): 478–93.

*IEEE Transactions on Signal Processing* 46 (9): 2431–47.

*Annual Review of Statistics and Its Application* 8 (1): 301–27.

*Handbook of Statistics*, 15:49–75. Robust Inference. Elsevier.

*Robust Statistics: Theory and Methods*. Reprinted with corrections. Wiley Series in Probability and Statistics. Chichester: Wiley.

*The Annals of Statistics* 4 (1): 51–67.

*Journal of the American Statistical Association* 90 (429): 330–41.

*Wiley StatsRef: Statistics Reference Online*. John Wiley & Sons, Ltd.

*Technometrics* 44 (4): 307–17.

*Analytica Chimica Acta* 187 (January): 171–79.

*arXiv:1311.4115 [Cs, Math]*, November.

*The Annals of Applied Probability* 26 (4): 2211–56.

*Statistics & Probability Letters* 1 (6): 327–32.

*arXiv:2107.02308 [Cs]*, July.

*Journal of Statistical Planning and Inference* 75 (1): 91–116.

*Data Segmentation and Model Selection for Computer Vision*, edited by Alireza Bab-Hadiashar and David Suter, 31–40. Springer New York.

*Statistics & Probability Letters* 3 (1): 21–23.

*Journal of Statistical Planning and Inference*, Robust Statistics and Data Analysis, Part I, 57 (1): 59–72.

*Journal of Econometrics* 101 (1): 37–69.

*Journal of the American Statistical Association* 79 (388): 871–80.

*Robust Regression and Outlier Detection*. Wiley Series in Probability and Mathematical Statistics. New York: Wiley.

*Robust and Nonlinear Time Series Analysis*, edited by Jürgen Franke, Wolfgang Härdle, and Douglas Martin, 256–72. Lecture Notes in Statistics 26. Springer US.

*International Statistical Review / Revue Internationale de Statistique* 54 (2): 221–26.

*The American Statistician* 64 (4): 277–81.

*The American Statistician* 42 (2): 152–54.

*Statistics* 47 (1): 216–35.

*Henri Theil's Contributions to Economics and Econometrics*, edited by Baldev Raj and Johan Koerts, 345–81. Advanced Studies in Theoretical and Applied Econometrics 23. Springer Netherlands.

*Journal of Statistical Planning and Inference* 136 (9): 3173–86.

*Biometrika* 61 (3): 439–47.

*IEEE Transactions on Information Theory* 56 (7): 3561–74.

*Communications in Statistics - Theory and Methods* 48 (5): 1092–1107.

*ICML (3)*, 585–93.
