# Multiple testing

April 22, 2015 — November 5, 2018

How to go data mining for models without “dredging” for them, accidentally or otherwise. If you keep testing models until you find some that fit (which you usually will), how do you know that the fit is in some sense *interesting*? How sharp will your conclusions be? How does this work when you are testing against a possibly uncountable continuum of hypotheses? (One perspective on sparsity penalties is precisely this, I am told.)
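
To make the “you usually will” concrete, here is a minimal sketch under a strong simplifying assumption: all the tests are independent and every null hypothesis is true, so each p-value is uniform on (0, 1). The function names and the choice of `alpha = 0.05` are illustrative, not taken from any particular library.

```python
import random


def prob_any_false_positive(m, alpha=0.05):
    """Chance that at least one of m independent true-null tests has p < alpha."""
    return 1.0 - (1.0 - alpha) ** m


def simulate_any_false_positive(m, alpha=0.05, n_trials=20000, seed=0):
    """Monte Carlo check: under a true null each p-value is Uniform(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # One "study": run m null tests and keep only the best-looking one.
        if min(rng.random() for _ in range(m)) < alpha:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        exact = prob_any_false_positive(m)
        approx = simulate_any_false_positive(m)
        print(f"m={m:4d}  exact={exact:.3f}  simulated={approx:.3f}")
```

With 20 such tests the chance of at least one spurious “discovery” is already about 64%, which is why an uncorrected best-of-many fit is not interesting on its own.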

Model selection is this writ small: you are testing how many variables to include in your model.

In modern high-dimensional models, where you have potentially many explanatory variables, handling the combinatorial explosion of possible variables to include can also be considered a multiple testing problem. We tend to regard this as a smoothing and model selection problem, though.

This all gets more complicated when many people are testing many hypotheses in many different experiments. Then you run into many more issues than just these, including publication bias and suchlike.

Suggestive connection:

Moritz Hardt, The machine learning leaderboard problem:

> In this post, I will describe a method to climb the public leaderboard without even looking at the data. The algorithm is so simple and natural that an unwitting analyst might just run it. We will see that in Kaggle’s famous Heritage Health Prize competition this might have propelled a participant from rank around 150 into the top 10 on the public leaderboard without making progress on the actual problem. […]
>
> I get super excited. I keep climbing the leaderboard! Who would’ve thought that this machine learning thing was so easy? So, I go write a blog post on Medium about Big Data and score a job at DeepCompeting.ly, the latest data science startup in the city. Life is pretty sweet. I pick up indoor rock climbing, sign up for wood working classes; I read Proust and books about espresso. Two months later the competition closes and Kaggle releases the final score. What an embarrassment! Wacky boosting did nothing whatsoever on the final test set. I get fired from DeepCompeting.ly days before the buyout. My spouse dumps me. The lease expires. I get evicted from my apartment in the Mission. Inevitably, I hike the Pacific Crest Trail and write a novel about it.

See (Blum and Hardt 2015; Dwork et al. 2015a) for more of that.
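
A toy reconstruction of that kind of leaderboard climbing might look like the following. It is not Hardt's code. It assumes binary labels, accuracy as the leaderboard score, and a holdout split into equal public and private halves, and every name in it is hypothetical. The attacker never sees the labels, only the public score of each submission.

```python
import random


def accuracy(pred, truth):
    """Fraction of positions where the prediction matches the truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def climb_leaderboard(public_score, n_total, n_submissions, rng):
    """Submit random guesses, keep the lucky ones, and majority-vote them."""
    kept = []
    for _ in range(n_submissions):
        guess = [rng.getrandbits(1) for _ in range(n_total)]
        # The only feedback used is the public leaderboard score.
        if public_score(guess) > 0.5:
            kept.append(guess)
    votes = [sum(column) for column in zip(*kept)]
    return [1 if 2 * v > len(kept) else 0 for v in votes]


rng = random.Random(0)
n_public, n_private = 2000, 2000
labels = [rng.getrandbits(1) for _ in range(n_public + n_private)]


def public_score(guess):
    """Leaderboard oracle: accuracy on the public half only."""
    return accuracy(guess[:n_public], labels[:n_public])


final = climb_leaderboard(public_score, n_public + n_private, 500, rng)
public_acc = accuracy(final[:n_public], labels[:n_public])
private_acc = accuracy(final[n_public:], labels[n_public:])
print(f"public accuracy:  {public_acc:.3f}")
print(f"private accuracy: {private_acc:.3f}")
```

The public accuracy creeps noticeably above chance because the selection step is itself a multiple testing procedure on the public labels, while the private accuracy should stay near 0.5.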

## 1 P-value hacking

I Fooled Millions Into Thinking Chocolate Helps Weight Loss. Here’s How. This is also the journalism problem, the journals problem, the vacuous-fluff-that-passes-for-public-discussion problem…

## 2 False discovery rate

FDR control…

- Testing Millions of Hypotheses is Larry Wasserman’s introduction to controlling the *false discovery rate*. See also Screening and the false discovery rate. The man can explain clearly.
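
For orientation, the best-known FDR-controlling method is the Benjamini–Hochberg step-up procedure: sort the p-values, find the largest rank k with p_(k) ≤ (k/m)·q, and reject the k smallest. Below is a minimal plain-Python sketch, with the target level `q` and the example p-values made up for illustration. The standard guarantee assumes independent (or positively dependent) tests.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: remember the largest rank that clears its threshold.
        if p_values[i] <= rank / m * q:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


if __name__ == "__main__":
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(pvals, q=0.05))
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected if a larger one further down the sorted list clears its threshold.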

## 3 Familywise error rate

Šidák correction, Bonferroni correction…
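
Both corrections shrink the per-test significance threshold so that the chance of any false rejection among m tests stays below α. Bonferroni uses α/m and holds under arbitrary dependence. Šidák uses 1 − (1 − α)^(1/m), which is exact for independent tests and slightly less conservative. A small sketch follows, with hypothetical helper names and made-up p-values:

```python
def bonferroni_threshold(alpha, m):
    """Per-test threshold alpha / m; valid under any dependence structure."""
    return alpha / m


def sidak_threshold(alpha, m):
    """Per-test threshold 1 - (1 - alpha)**(1/m); exact for independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def reject(p_values, threshold):
    """Flag each hypothesis whose p-value is at or below the threshold."""
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    alpha = 0.05
    pvals = [0.001, 0.0101, 0.02, 0.3, 0.8]
    m = len(pvals)
    print("Bonferroni:", bonferroni_threshold(alpha, m), reject(pvals, bonferroni_threshold(alpha, m)))
    print("Sidak:     ", sidak_threshold(alpha, m), reject(pvals, sidak_threshold(alpha, m)))
```

In this example the second p-value falls between the two thresholds, so Šidák rejects it and Bonferroni does not. In practice the two thresholds differ very little.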

## 4 Post selection inference

## 5 Incoming

David Kadavy’s classic, grumpy essay A/A Testing: How I increased conversions 300% by doing absolutely nothing.

http://businessofsoftware.org/2013/06/jason-cohen-ceo-wp-engine-why-data-can-make-you-do-the-wrong-thing/ http://www.evanmiller.org/the-low-base-rate-problem.html

Multiple testing in Python: multipy

## 6 References

*The Annals of Statistics*.

*American Journal of Public Health*.

*The Annals of Statistics*.

*The R Journal*.

*arXiv:0901.3202 [Cs, Stat]*.

*The Annals of Statistics*.

*Computational Statistics & Data Analysis*.

*arXiv:1511.02513 [Cs]*.

*Biometrical Journal*.

*The Annals of Applied Statistics*.

*Journal of the Royal Statistical Society: Series B (Methodological)*.

*Journal of the American Statistical Association*.

*The Annals of Statistics*.

*arXiv:1502.04585 [Cs]*.

*Biometrics*.

*arXiv:1503.06426 [Stat]*.

*The Annals of Statistics*.

*Sociological Methods & Research*.

*Annual Review of Economics*.

*arXiv Preprint arXiv:1610.02351*.

*IEEE Transactions on Information Theory*.

*Journal of Fourier Analysis and Applications*.

*Statistics & Probability Letters*.

*Journal of Statistical Planning and Inference*.

*Annual Review of Economics*.

*Biometrika*.

*Journal of the American Statistical Association*.

*Journal of the American Statistical Association*.

*Proceedings of the National Academy of Sciences*.

*Journal of Neuroscience*.

*BMC Bioinformatics*.

*arXiv Preprint arXiv:1602.03589*.

*Asymptotic Theory of Statistics and Probability*. Springer Texts in Statistics.

*The Annals of Statistics*.

*arXiv:1408.4026 [Stat]*.

*Journal of the American Statistical Association*.

*Journal of the Royal Statistical Society. Series B (Methodological)*.

*Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing - STOC ’15*.

*Science*.

*International Journal of Environmental Research and Public Health*.

*The Annals of Statistics*.

*Journal of the American Statistical Association*.

*Journal of the American Statistical Association*.

*Metron - International Journal of Statistics*.

*The Annals of Applied Statistics*.

*Journal of the American Statistical Association*.

*Statistical Science*.

*Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction*.

*arXiv:1507.05315 [Math, Stat]*.

*Journal of the American Statistical Association*.

*Statistica Sinica*.

*arXiv:1407.4240 [q-Bio, Stat]*.

*Journal of Statistical Software*.

*Advances in Neural Information Processing Systems 27*.

*American Scientist*.

*The Annals of Statistics*.

*Journal of Econometrics*.

*Proceedings of the 2014 IEEE 55th Annual Symposium on Foundations of Computer Science*. FOCS ’14.

*Statistics Surveys*.

*International Statistical Review / Revue Internationale de Statistique*.

*The Annals of Statistics*.

*Survival Analysis: State of the Art*. Nato Science 211.

*Biometrika*.

*Journal of Econometrics*.

*PLoS Medicine*.

*Statistical Science*.

*arXiv:1507.02061 [Math, Stat]*.

*Biometrika*.

*Biometrika*.

*Biometrika*.

*Neural Computation*.

*Genome Biology*.

*Journal of Applied Probability*.

*Physical Review X*.

*Electronic Journal of Statistics*.

*Molecular Psychiatry*.

*arXiv:1311.6238 [Math, Stat]*.

*The Annals of Statistics*.

*The Annals of Statistics*.

*Scandinavian Journal of Statistics*.

*Computational Statistics & Data Analysis*.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*.

*Biometrika*.

*The Annals of Statistics*.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*.

*Journal of the American Statistical Association*.

*The Annals of Statistics*.

*The Annals of Statistics*.

*Journal of Machine Learning Research*.

*The Annals of Statistics*.

*Nature Biotechnology*.

*International Journal of Data Science and Analytics*.

*The Annals of Statistics*.

*Epidemiology (Cambridge, Mass.)*.

*Proceedings of the National Academy of Sciences*.

*arXiv:1411.1437 [Math, Stat]*.

*Journal of the Royal Statistical Society. Series B (Methodological)*.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*.

*arXiv:1511.01957 [Cs, Math, Stat]*.

*arXiv:1308.5623 [Stat]*.

*arXiv:1411.6144 [Stat]*.

*Journal of Machine Learning Research*.

*arXiv:1401.3889 [Stat]*.

*arXiv:1408.5801 [Stat]*.

*arXiv:1506.06266 [Math, Stat]*.

*The Annals of Statistics*.

*arXiv:1107.0189 [Stat]*.

*Annals of Statistics*.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*.

*The Annals of Statistics*.