Survey modelling

Adjusting for the Lizardman constant



Placeholder page for information about surveys, their design, analysis and limitations.

When can I get any information out of surveys?

It is, in general, hard to get information out of surveys. Finding research questions that surveys can answer is a challenge in itself, and designing a survey that actually gets at the research question is a whole specialty field. Typically, the surveys that I have been asked to look at have not put sufficient effort into that, or have put that effort in too late.

Survey Chicken is a good essay about the difficulties here:

What I mean by “surveys” is standard written (or spoken) instruments, composed mostly of language, that are administered to subjects, who give responses, and whose responses are treated as quantitative information, which may then be subjected to statistical analysis. It is not the case that knowledge can never be obtained in this manner. But the idea that there exists some survey, and some survey conditions, that might plausibly produce the knowledge claimed, tends to lead to a mental process of filling in the blanks, of giving the benefit of the doubt to surveys in the ordinary case. But, I think, the ordinary survey, in its ordinary conditions, is of no evidentiary value for any important claim.

There are a lot of problems that arise here. A famous one is response bias:

Image: Sketchplanations. Of course, thanks to the lizardman constant we know that it is more plausible that 4% of people would have responded ‘no I never answer surveys’.

But there are so many!

Another one that I am fond of using, because it has a catchy name, is the Lizardman constant: survey responses contain an irreducible level of noisy nonsense. Specifically, as a rule of thumb, about 4% of people will claim on a survey that their head of state is an alien lizard monster.
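One way to make the "adjusting" concrete is a back-of-envelope bound. The sketch below is my own illustration, not an established method: it assumes a simple contamination model in which a fixed fraction of respondents (4% by default) answer nonsensically and everyone else answers sincerely. The function name `lizardman_bounds` and its parameters are hypothetical.

```python
def lizardman_bounds(p_obs, noise_rate=0.04):
    """Bound the sincere 'yes' share under a simple contamination model.

    Assumed model (an illustration, not from any cited source): a fraction
    `noise_rate` of respondents answer nonsensically, saying 'yes' with some
    unknown probability q in [0, 1], so that
        p_obs = (1 - noise_rate) * p_sincere + noise_rate * q.
    """
    if not 0.0 <= p_obs <= 1.0:
        raise ValueError("p_obs must be a proportion in [0, 1]")
    if not 0.0 <= noise_rate < 1.0:
        raise ValueError("noise_rate must be in [0, 1)")
    # q = 1 (all noise says 'yes') gives the smallest sincere share;
    # q = 0 (all noise says 'no') gives the largest.
    lower = (p_obs - noise_rate) / (1.0 - noise_rate)
    upper = p_obs / (1.0 - noise_rate)
    # Clip to the valid range for a proportion.
    return max(0.0, lower), min(1.0, upper)


for observed in (0.04, 0.50):
    low, high = lizardman_bounds(observed)
    print(f"observed {observed:.0%} 'yes': sincere share in [{low:.3f}, {high:.3f}]")
```

Under these assumptions, an observed 4% "yes" is compatible with nobody sincerely holding the view, which is the practical upshot of the rule of thumb: small percentages on their own tell us very little.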

I am particularly exercised by the problem that I refer to as the Dunning Kruger Theory of Mind, which is that, even with the best intentions in the world and an unbounded survey budget, we are not good at knowing our own minds, and even worse at knowing the minds of others. With all the focus and intent in the world, my survey responses are far more reflective of my self-image than they are of any facts about the world.

OK, many caveats, warnings and qualifications here. Does that mean that surveys are useless? No, it does not. It just means that surveys are difficult and limited. But sometimes there is no other clear way to study the phenomenon of interest, so we have to do what we can. What follows are some tricks to do this.

Survey design

TBD. To pick a paper that I have been looking at recently, Gelman and Margalit (2021) is an example of a paper that does ingenious survey design to answer non-trivial questions.

Post stratification

These are tricks of particular use in modelling survey data when you need to adjust for bias in who actually answers the survey: we reweight the data to correct for various types of remediable sampling bias.
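A minimal sketch of the basic idea, assuming we know the population share of each stratum (say, from a census) and that responses within a stratum are representative of that stratum. The data, the strata names and the function `poststratify` are all made up for illustration.

```python
from collections import defaultdict


def poststratify(responses, population_shares):
    """Weight per-stratum sample means by known population shares.

    responses: iterable of (stratum, value) pairs from the survey.
    population_shares: dict mapping stratum -> share of the population
        (assumed known and summing to 1).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in responses:
        sums[stratum] += value
        counts[stratum] += 1
    missing = set(population_shares) - set(counts)
    if missing:
        # Plain poststratification cannot fill empty cells; model-based
        # variants such as MRP exist partly to handle this.
        raise ValueError(f"no respondents in strata: {sorted(missing)}")
    return sum(
        share * sums[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


# Hypothetical survey: young people are over-represented among respondents.
responses = [("young", 1)] * 6 + [("young", 0)] * 2 + [("old", 1)] + [("old", 0)]
shares = {"young": 0.3, "old": 0.7}

raw_mean = sum(value for _, value in responses) / len(responses)
adjusted = poststratify(responses, shares)
print(f"raw mean: {raw_mean:.3f}, poststratified: {adjusted:.3f}")
```

Here the raw mean is 0.7, but reweighting to the assumed population shares pulls the estimate down to about 0.575, because the over-represented group says "yes" more often.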

There is some interesting crossover with clinical trial theory, in that there are surprising things that you CAN learn from a biased sample in many circumstances:

It is a commonly held belief that clinical trials, to provide treatment effects that are generalizable to a population, must use a sample that reflects that population’s characteristics. The confusion stems from the fact that if one were interested in estimating an average outcome for patients given treatment A, one would need a random sample from the target population. But clinical trials are not designed to estimate absolutes; they are designed to estimate differences as discussed further here. These differences, when measured on a scale for which treatment differences are allowed mathematically to be constant (e.g., difference in means, odds ratios, hazard ratios), show remarkable constancy as judged by a large number of published forest plots. What would make a treatment estimate (relative efficacy) not be transportable to another population? A requirement for non-generalizability is the existence of interactions with treatment such that the interacting factors have a distribution in the sample that is much different from the distribution in the population.

A related problem is the issue of overlap in observational studies. Researchers are taught that non-overlap makes observational treatment comparisons impossible. This is only true when the characteristic whose distributions don’t overlap between treatment groups interacts with treatment. The purpose of this article is to explore interactions in these contexts.

As a side note, if there is an interaction between treatment and a covariate, standard propensity score analysis will completely miss it.
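The argument quoted above can be checked with toy arithmetic. The sketch below assumes a made-up linear outcome model with one covariate `x`; all the coefficients and the two samples are hypothetical. Without a treatment-by-covariate interaction, a badly skewed trial sample gets the average outcome wrong but the treatment difference right. With an interaction, the difference stops transporting too.

```python
from statistics import mean


def expected_outcome(x, treated, interaction=0.0):
    """Hypothetical outcome model: baseline + covariate effect + treatment effect.

    With interaction == 0 the treatment effect is a constant 0.5 for everyone;
    otherwise it varies with the covariate x.
    """
    baseline, slope, effect = 1.0, 2.0, 0.5
    return baseline + slope * x + (effect + interaction * x) * treated


def summarize(xs, interaction=0.0):
    """Return (mean outcome under treatment, mean treated-minus-control difference)."""
    treated = mean(expected_outcome(x, 1, interaction) for x in xs)
    control = mean(expected_outcome(x, 0, interaction) for x in xs)
    return treated, treated - control


population_x = [0.0, 1.0, 2.0, 3.0, 4.0]  # target population, mean x = 2
trial_x = [3.0, 3.0, 4.0, 4.0]            # skewed trial sample, mean x = 3.5

for label, gamma in (("no interaction", 0.0), ("with interaction", 0.25)):
    pop_mean, pop_diff = summarize(population_x, gamma)
    trial_mean, trial_diff = summarize(trial_x, gamma)
    print(f"{label}: treated mean {pop_mean} vs {trial_mean}, "
          f"difference {pop_diff} vs {trial_diff}")
```

In the no-interaction case the treated means disagree (5.5 in the population against 8.5 in the trial) while the difference is 0.5 in both. Once the effect depends on `x`, the trial's difference no longer matches the population's.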

This is a whole interesting thing in its own right; see post stratification for details.

Ordinal data

A particularly common data type to analyze in surveys is ordinal data, which is how we usually get data from people. Think star ratings or Likert scales. Ordinal models are how we usually analyze it.
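To give a flavour of what an ordinal model looks like, here is a minimal sketch of the cumulative-logit (proportional odds) construction, which is one common choice and not necessarily the one you should use. A respondent has a latent score, and ordered cutpoints slice that score into rating categories. The function name and the cutpoint values are made up for illustration.

```python
import math


def ordinal_probs(latent_score, cutpoints):
    """Category probabilities under a cumulative-logit model.

    P(rating <= k) = logistic(cutpoints[k] - latent_score), so the probability
    of each category is the difference of successive cumulative probabilities.
    There is one more category than there are cutpoints.
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    cumulative = [1.0 / (1.0 + math.exp(-(c - latent_score))) for c in cutpoints]
    upper = cumulative + [1.0]
    lower = [0.0] + cumulative
    return [u - l for u, l in zip(upper, lower)]


# Hypothetical 5-star rating scale: four cutpoints, five categories.
cutpoints = [-1.0, 0.0, 1.0, 2.0]
for score in (0.0, 2.0):
    probs = ordinal_probs(score, cutpoints)
    print(score, [round(p, 3) for p in probs])
```

Raising the latent score shifts probability mass towards the higher categories without assuming that the gap between 1 and 2 stars means the same thing as the gap between 4 and 5, which is the main reason not to treat such ratings as plain numbers.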

sjPlot, by Daniel Lüdecke, is a handy R package for exploratory plotting of Likert-type responses in social survey data.

Confounding and observational studies

Often survey data is further complicated by being about a natural experiment where we must deal with non-controlled trials. See Causal graphical models.

Graph sampling

Cannot pull people from the population at random? How about asking people you know and getting them to ask people they know? What can we learn from this approach? See inference on social graphs.
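One thing to watch for is that referrals do not reach people uniformly: well-connected people get named more often. The toy calculation below, on a made-up five-person acquaintance graph, compares the average number of acquaintances of a uniformly chosen person with that of a person reached by one referral step. It is only a sketch of the bias, not of any particular correction method.

```python
from statistics import mean

# Hypothetical acquaintance graph (symmetric; everyone knows at least one person).
acquaintances = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a"],
    "d": ["a", "e"],
    "e": ["d"],
}


def mean_degree_uniform(graph):
    """Average number of acquaintances of a uniformly sampled person."""
    return mean(len(neighbours) for neighbours in graph.values())


def mean_degree_referred(graph):
    """Expected number of acquaintances of the person we reach when a
    uniformly sampled person names one of their acquaintances at random."""
    return mean(
        mean(len(graph[n]) for n in neighbours)
        for neighbours in graph.values()
    )


uniform = mean_degree_uniform(acquaintances)
referred = mean_degree_referred(acquaintances)
print(f"uniform sample: {uniform:.2f}, after one referral: {referred:.2f}")
```

In this graph the referred person has noticeably more acquaintances on average than a uniformly sampled one, so any quantity correlated with connectedness will be skewed unless the analysis accounts for how the sample was recruited.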

Data sets

Parsing SDA Pages

SDA is a suite of software developed at Berkeley for the web-based analysis of survey data. The Berkeley SDA archive lets you run various kinds of analyses on a number of public datasets, such as the General Social Survey. It also provides consistently-formatted HTML versions of the codebooks for the surveys it hosts. This is very convenient! For the gssr package, I wanted to include material from the codebooks as tibbles or data frames that would be accessible inside an R session. Processing the official codebook from its native PDF state into a data frame is, though technically possible, a rather off-putting prospect. But SDA has done most of the work already by making the pages available in HTML. I scraped the codebook pages from them instead. This post contains the code I used to do that.

References

Achlioptas, Dimitris, Aaron Clauset, David Kempe, and Cristopher Moore. 2005. “On the Bias of Traceroute Sampling: Or, Power-Law Degree Distributions in Regular Graphs.” In Proceedings of the Thirty-Seventh Annual ACM Symposium on Theory of Computing, 694–703. STOC ’05. New York, NY, USA: ACM. https://doi.org/10.1145/1060590.1060693.
Bareinboim, Elias, and Judea Pearl. 2016. “Causal Inference and the Data-Fusion Problem.” Proceedings of the National Academy of Sciences 113 (27): 7345–52. https://doi.org/10.1073/pnas.1510507113.
Bareinboim, Elias, Jin Tian, and Judea Pearl. 2014. “Recovering from Selection Bias in Causal and Statistical Inference.” In AAAI, 2410–16. http://ftp.cs.ucla.edu/pub/stat_ser/r425.pdf.
Bond, Robert M., Christopher J. Fariss, Jason J. Jones, Adam D. I. Kramer, Cameron Marlow, Jaime E. Settle, and James H. Fowler. 2012. “A 61-Million-Person Experiment in Social Influence and Political Mobilization.” Nature 489 (7415): 295–98. https://doi.org/10.1038/nature11421.
Broockman, David E., Joshua Kalla, and Jasjeet S. Sekhon. 2016. “The Design of Field Experiments With Survey Outcomes: A Framework for Selecting More Efficient, Robust, and Ethical Designs.” SSRN Scholarly Paper ID 2742869. Rochester, NY: Social Science Research Network. https://papers.ssrn.com/abstract=2742869.
Gao, Yuxiang, Lauren Kennedy, Daniel Simpson, and Andrew Gelman. 2019. “Improving Multilevel Regression and Poststratification with Structured Priors.” arXiv:1908.06716 [stat], August. http://arxiv.org/abs/1908.06716.
Gelman, Andrew. 2007. “Struggles with Survey Weighting and Regression Modeling.” Statistical Science 22 (2): 153–64. https://doi.org/10.1214/088342306000000691.
Gelman, Andrew, and John B. Carlin. 2000. “Poststratification and Weighting Adjustments.” In Survey Nonresponse. Wiley. http://www.stat.columbia.edu/~gelman/research/published/handbook5.pdf.
Gelman, Andrew, and Yotam Margalit. 2021. “Social Penumbras Predict Political Attitudes.” Proceedings of the National Academy of Sciences 118 (6). https://doi.org/10.1073/pnas.2019375118.
Ghitza, Yair, and Andrew Gelman. 2013. “Deep Interactions with MRP: Election Turnout and Voting Patterns Among Small Electoral Subgroups.” American Journal of Political Science 57 (3): 762–76. https://doi.org/10.1111/ajps.12004.
Hart, Einav, Eric VanEpps, and Maurice E. Schweitzer. 2019. “I Didn’t Want to Offend You: The Cost of Avoiding Sensitive Questions.” SSRN Scholarly Paper ID 3437468. Rochester, NY: Social Science Research Network. https://papers.ssrn.com/abstract=3437468.
Kennedy, Edward H., Jacqueline A. Mauro, Michael J. Daniels, Natalie Burns, and Dylan S. Small. 2019. “Handling Missing Data in Instrumental Variable Methods for Causal Inference.” Annual Review of Statistics and Its Application 6 (1): 125–48. https://doi.org/10.1146/annurev-statistics-031017-100353.
Kohler, Ulrich, Frauke Kreuter, and Elizabeth A. Stuart. 2019. “Nonprobability Sampling and Causal Analysis.” Annual Review of Statistics and Its Application 6 (1): 149–72. https://doi.org/10.1146/annurev-statistics-030718-104951.
Kong, Yuqing. 2019. “Dominantly Truthful Multi-Task Peer Prediction with a Constant Number of Tasks.” arXiv:1911.00272 [cs, Econ], November. http://arxiv.org/abs/1911.00272.
Krivitsky, Pavel N., and Martina Morris. 2017. “Inference for Social Network Models from Egocentrically Sampled Data, with Application to Understanding Persistent Racial Disparities in HIV Prevalence in the US.” The Annals of Applied Statistics 11 (1): 427–55. https://doi.org/10.1214/16-AOAS1010.
Lerman, Kristina. 2017. “Computational Social Scientist Beware: Simpson’s Paradox in Behavioral Data.” arXiv:1710.08615 [physics], October. http://arxiv.org/abs/1710.08615.
Little, R. J. A. 1993. “Post-Stratification: A Modeler’s Perspective.” Journal of the American Statistical Association 88 (423): 1001–12. https://doi.org/10.1080/01621459.1993.10476368.
Little, Roderick JA. 1991. “Inference with Survey Weights.” Journal of Official Statistics 7 (4): 405.
Prelec, Dražen, H. Sebastian Seung, and John McCoy. 2017. “A Solution to the Single-Question Crowd Wisdom Problem.” Nature 541 (7638): 532–35. https://doi.org/10.1038/nature21054.
Rubin, Donald B, and Richard P Waterman. 2006. “Estimating the Causal Effects of Marketing Interventions Using Propensity Score Methodology.” Statistical Science 21 (2): 206–22. https://doi.org/10.1214/088342306000000259.
Sanguiao Sande, Luis, and Li-Chun Zhang. 2020. “Design-Unbiased Statistical Learning in Survey Sampling.” Sankhya: The Indian Journal of Statistics, October. https://doi.org/10.1007/s13171-020-00224-1.
Shalizi, Cosma Rohilla, and Edward McFowland III. 2016. “Controlling for Latent Homophily in Social Networks Through Inferring Latent Locations.” arXiv:1607.06565 [physics, Stat], July. http://arxiv.org/abs/1607.06565.
Shalizi, Cosma Rohilla, and Andrew C. Thomas. 2011. “Homophily and Contagion Are Generically Confounded in Observational Social Network Studies.” Sociological Methods & Research 40 (2): 211–39. https://doi.org/10.1177/0049124111404820.
Yadav, Pranjul, Lisiane Prunelli, Alexander Hoff, Michael Steinbach, Bonnie Westra, Vipin Kumar, and Gyorgy Simon. 2016. “Causal Inference in Observational Data.” arXiv:1611.04660 [cs, Stat], November. http://arxiv.org/abs/1611.04660.
Zhang, Li-Chun, and Nancy Nguyen. 2020. “An Appraisal of Common Reweighting Methods for Nonresponse in Household Surveys Based on Norwegian Labour Force Survey and Statistics on Income and Living Conditions Survey.” Journal of Official Statistics 36 (1): 151–72. https://doi.org/10.2478/JOS-2020-0008.
