Problems with the p-value -- References

On December 11th, Prof. Regina Nuzzo from Gallaudet University spoke at Data Science DC about Problems with the p-value. The event was well received. If you missed it, the slides and audio are available. Here we provide Dr. Nuzzo's references and links from the talk, which are on their own a great resource for anyone thinking about how to communicate statistical reliability. (Note that the five topics she covered used examples from highly publicized studies of sexual behavior.)

First, Dr. Nuzzo's ASA-award-winning essay for Nature:

Nuzzo, R. "Statistical errors: P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume." Nature 506 (2014): 150-2.

Then the 5 categories of problems, with references:

P-Dazzling & the Female Orgasm Nasal Booster

Glorifying the Noise & Red-Hot Dates

  • Elliot, A. J., & Niesta, D. (2008). Romantic red: red enhances men's attraction to women. Journal of personality and social psychology, 95(5), 1150.
  • Gelman, A., & Carlin, J. (2014). Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 9(6), 641-651.
  • Ioannidis, John PA. "Why most discovered true associations are inflated." Epidemiology 19.5 (2008): 640-648.
  • Button, K. S., et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365-376.

Stacking the Deck & Online Sweethearts

  • Cacioppo, J. T, et al. (2013). Marital satisfaction and break-ups differ across on-line and off-line meeting venues. Proceedings of the National Academy of Sciences, 110(25), 10135-10140.
  • Sellke, Thomas, M. J. Bayarri, and James O. Berger. "Calibration of p values for testing precise null hypotheses." The American Statistician 55.1 (2001): 62-71.

Chasing Unicorns & Pornography ESP

  • Bem, D. J. (2011). Feeling the future: experimental evidence for anomalous retroactive influences on cognition and affect. Journal of personality and social psychology, 100(3), 407.

  • Wagenmakers, E. J., et al. (2011). Why psychologists must change the way they analyze their data: The case of psi. Journal of Personality and Social Psychology, 100(3), 426-432.
  • Rouder, J. N., & Morey, R. D. (2011). A Bayes factor meta-analysis of Bem’s ESP claim. Psychonomic Bulletin & Review, 18(4), 682-689.
  • Sellke, Thomas, M. J. Bayarri, and James O. Berger. "Calibration of p values for testing precise null hypotheses." The American Statistician 55.1 (2001): 62-71.
  • Held, L. (2010). A nomogram for P values. BMC medical research methodology, 10(1), 21.

The Curse of the Multiverse & Matters of Size

  • Costa, R. M., Miller, G. F., & Brody, S. (2012). Women who prefer longer penises are more likely to have vaginal orgasms (but not clitoral orgasms): Implications for an evolutionary theory of vaginal orgasm. The journal of sexual medicine, 9(12), 3079-3088.
  • Simmons, Joseph P., Leif D. Nelson, and Uri Simonsohn. "False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant." Psychological science 22.11 (2011): 1359-1366.
  • Gelman, A., & Loken, E. (2014). The statistical crisis in science. American Scientist, 1-5.

Related Reading

  • Berger, James O., and Donald A. Berry. "Statistical analysis and the illusion of objectivity." American Scientist (1988): 159-165.
  • Cumming, Geoff. Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. Routledge, 2012.
  • Goodman, S. N. (1999). Toward evidence-based medical statistics. 1: The P value fallacy. Annals of internal medicine, 130(12), 995-1004.
  • Goodman, S. N. (1999). Toward evidence-based medical statistics. 2: The Bayes factor. Annals of internal medicine, 130(12), 1005-1013.
  • Goodman, S. (2008, July). A Dirty Dozen: Twelve P-Value Misconceptions. In Seminars in hematology (Vol. 45, No. 3, pp. 135-140). WB Saunders.
  • Ioannidis, J. P. (2005). Why most published research findings are false. PLoS medicine, 2(8), e124.
  • Ioannidis, J. P. A. (2014). How to make more published research true. PLoS medicine, 11(10), e1001747.
  • Mayo, D. (2004). "An Error-Statistical Philosophy of Evidence," in M. Taper and S. Lele (eds.) The Nature of Scientific Evidence: Statistical, Philosophical and Empirical Considerations. Chicago: University of Chicago Press: 79-118.
  • Royall, R. M. (1986). The effect of sample size on the meaning of significance tests. The American Statistician, 40(4), 313-315.

Some relevant web sites: