# Publication bias

## Replication crises, P-values, bitching about journals, other debilities of contemporary science at large

> We’re out here every day, doing the dirty work, finding noise and then polishing it into the hypotheses everyone loves. It’s not easy.
>
> —John Schmidt, *The noise miners*

Publication bias is, in effect, multiple testing across a whole scientific field, with a side helping of uneven data release.

On one hand, we hope that journals will help us find things that are relevant. On the other hand, we hope that the things they help us find are actually true. It’s not at all obvious how to solve this kind of classification problem economically.

This is a complicated issue with which I won’t engage deeply, but I do want to note some things for future reference. Perhaps you’d prefer to see an actual well-documented exemplar train-wreck in progress in social psychology?

Keywords: the “file-drawer process” and the “publication sieve”, which are the large-scale models of how this works in a scientific community, and “researcher degrees of freedom”, which is the model of how this works at the individual scale.
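The publication sieve is easy to quantify with toy numbers. A minimal sketch (the prior, power, and significance threshold below are illustrative assumptions, not estimates from any actual field):

```python
# Toy model of the publication sieve: a field tests many hypotheses,
# and journals publish only "significant" results (p < .05).
# All three numbers below are illustrative assumptions.
prior_true = 0.10  # fraction of tested hypotheses that are actually true
power = 0.80       # chance a true effect comes out significant
alpha = 0.05       # chance a null effect comes out significant anyway

p_true_and_sig = prior_true * power          # true positives in print
p_false_and_sig = (1 - prior_true) * alpha   # false positives in print
share_false = p_false_and_sig / (p_true_and_sig + p_false_and_sig)
print(f"share of the published literature that is false: {share_false:.0%}")
# -> 36%
```

With no fraud and no p-hacking at all, the sieve alone leaves more than a third of the published record false under these assumptions, because the 90% of null hypotheses each get a 5% chance to sneak through.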

This is particularly pertinent in social psychology, where it turns out that there is too much bullshit with $$P\leq 0.05$$.

See Sanjay Srivastava, *Everything is fucked, the syllabus*.

## Fixing P-hacking

Oliver Traldi reviews Stuart Ritchie’s book *Science Fictions* (Ritchie 2020), which uses the replication crisis in psychology as a lens through which to understand science’s flaws:

> Serious though this is, there is also something more specifically pernicious about the replication crisis in psychology. We saw that the bias in psychological research is in favour of publishing exciting results. An exciting result in psychology is one that tells us that something has a large effect on people’s behavior. And the things that the studies that have failed to replicate have found to have large effects on people’s behavior are not necessarily things that ought to affect people’s behaviour, were those people rational. Think of the studies I mentioned above: a mess makes people more prejudiced; a random assignment of roles makes people sadistic; a list of words makes people walk at a different speed; a strange pose makes people more confident. And so on.

Uri Simonsohn’s article on detecting the signature of p-hacking is interesting.
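The intuition can be illustrated with a toy simulation (this is not Simonsohn’s actual method, and every number in it — sample sizes, peeking schedule — is an arbitrary assumption). If a researcher peeks at null data and stops collecting as soon as $$p < 0.05$$, the significant p-values from pure noise pile up just below the threshold, which is the kind of signature one can look for:

```python
import math
import random

def p_value(z):
    # two-sided p-value for a standard-normal test statistic
    return math.erfc(abs(z) / math.sqrt(2))

def study_with_peeking(rng, n_start=20, n_max=50, step=10):
    # One simulated study of a null (zero) effect. The "researcher"
    # peeks every `step` subjects and stops as soon as p < .05 --
    # a classic researcher degree of freedom.
    data = [rng.gauss(0, 1) for _ in range(n_start)]
    while True:
        z = (sum(data) / len(data)) * math.sqrt(len(data))  # sd known = 1
        p = p_value(z)
        if p < 0.05 or len(data) >= n_max:
            return p
        data.extend(rng.gauss(0, 1) for _ in range(step))

rng = random.Random(1)
sig = [p for p in (study_with_peeking(rng) for _ in range(20000)) if p < 0.05]
near_threshold = sum(0.04 <= p < 0.05 for p in sig) / len(sig)
very_small = sum(p < 0.01 for p in sig) / len(sig)
print(f"{len(sig)} of 20000 null studies came out 'significant'")
print(f"share of those with .04 <= p < .05: {near_threshold:.2f}")
print(f"share of those with p < .01:        {very_small:.2f}")
```

Two things happen: far more than 5% of the null studies come out significant, and the significant p-values bunch up just under .05 instead of being spread evenly, whereas a real effect would pile them up near zero.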

Some might say: just fix the incentives. But apparently that is off the table, because it would require political energy. There is open notebook science; that could be a thing.

Failing the budget for that… pre-registration?

Tom Stafford’s two-minute guide to experiment pre-registration:

> Pre-registration is easy. There is no single, universally accepted way to do it.
>
> - You could write your data collection and analysis plan down and post it on your blog.
> - You can use the Open Science Framework to timestamp and archive a pre-registration, so you can prove you made a prediction ahead of time.
> - You can visit AsPredicted.org, which provides a form to complete, which will help you structure your pre-registration (making sure you include all relevant information).
> - “Registered Reports”: more and more journals are committing to publishing pre-registered studies. They review the method and analysis plan before data collection and agree to publish once the results are in (however they turn out).

But should you in fact pre-register?

Morally, perhaps. But not in the sense that it’s something you should do if you want to progress in your career; rather the opposite. As Ed Hagen argues, academic success is either a crapshoot or a scam:

> The problem, in a nutshell, is that empirical researchers have placed the fates of their careers in the hands of nature instead of themselves. […]
>
> Academic success for empirical researchers is largely determined by a count of one’s publications, and the prestige of the journals in which those publications appear […]
>
> the minimum acceptable number of pubs per year for a researcher with aspirations for tenure and promotion is about three. This means that, each year, I must discover three important new things about the world. […]
>
> Let’s say I choose to run 3 studies that each has a 50% chance of getting a sexy result. If I run 3 great studies, mother nature will reward me with 3 sexy results only 12.5% of the time. I would have to run 9 studies to have about a 90% chance that at least 3 would be sexy enough to publish in a prestigious journal.
>
> I do not have the time or money to run 9 new studies every year.
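Hagen’s arithmetic checks out; it is just the binomial distribution:

```python
from math import comb

def p_at_least(k, n, p):
    # P(at least k successes in n independent trials, each with probability p)
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# 3 sexy results from 3 studies at 50% each:
print(f"{p_at_least(3, 3, 0.5):.3f}")  # 0.125
# at least 3 sexy results from 9 studies:
print(f"{p_at_least(3, 9, 0.5):.3f}")  # 0.910
```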

## References

Gabry, Jonah, Daniel Simpson, Aki Vehtari, Michael Betancourt, and Andrew Gelman. 2019. “Visualization in Bayesian Workflow.” Journal of the Royal Statistical Society: Series A (Statistics in Society) 182 (2): 389–402. https://doi.org/10.1111/rssa.12378.

Gelman, Andrew, and Cosma Rohilla Shalizi. 2013. “Philosophy and the Practice of Bayesian Statistics.” British Journal of Mathematical and Statistical Psychology 66 (1): 8–38. https://doi.org/10.1111/j.2044-8317.2011.02037.x.

McShane, Blakeley B., David Gal, Andrew Gelman, Christian Robert, and Jennifer L. Tackett. 2019. “Abandon Statistical Significance.” The American Statistician 73 (sup1): 235–45. https://doi.org/10.1080/00031305.2018.1527253.

Nissen, Silas B., Tali Magidson, Kevin Gross, and Carl T. Bergstrom. 2016. “Publication Bias and the Canonization of False Facts,” September. http://arxiv.org/abs/1609.00494.

Ritchie, Stuart. 2020. Science Fictions: How Fraud, Bias, Negligence, and Hype Undermine the Search for Truth. First edition. New York: Metropolitan Books ; Henry Holt and Company.

Simmons, Joseph P., Leif D. Nelson, and Uri Simonsohn. 2011. “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” Psychological Science 22 (11): 1359–66. https://doi.org/10.1177/0956797611417632.