On December 11th, Prof. Regina Nuzzo from Gallaudet University spoke at Data Science DC about Problems with the p-value. The event was well received. If you missed it, the slides and audio are available. Here we provide Dr. Nuzzo's references and links from the talk, which are on their own a great resource for anyone thinking about how to communicate statistical reliability. (Note that the five topics she covered used examples from highly publicized studies of sexual behavior.)
First, Dr. Nuzzo's ASA-award-winning essay for Nature:
Then the 5 categories of problems, with references:
P-Dazzling & the Female Orgasm Nasal Booster
Glorifying the Noise & Red-Hot Dates
- Elliot, A. J., & Niesta, D. (2008). Romantic red: red enhances men's attraction to women. Journal of personality and social psychology, 95(5), 1150.
- Gelman, A., & Carlin, J. (2014). Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors. Perspectives on Psychological Science, 9(6), 641-651.
- Ioannidis, John PA. "Why most discovered true associations are inflated." Epidemiology 19.5 (2008): 640-648.
- Button, K. S., et al. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14(5), 365-376.
Stacking the Deck & Online Sweethearts
- Cacioppo, J. T, et al. (2013). Marital satisfaction and break-ups differ across on-line and off-line meeting venues. Proceedings of the National Academy of Sciences, 110(25), 10135-10140.
- Sellke, Thomas, M. J. Bayarri, and James O. Berger. "Calibration of p values for testing precise null hypotheses." The American Statistician 55.1 (2001): 62-71.
Chasing Unicorns & Pornography ESP
- Bem, D. J. (2011). Feeling the future: experimental evidence for anomalous retroactive influences on cognition and affect. Journal of personality and social psychology, 100(3), 407.
- Wagenmakers, E. J., et al. (2011). Why psychologists must change the way they analyze their data: The case of psi. Journal of Personality and Social Psychology, 100(3), 426-432.
- Rouder, J. N., & Morey, R. D. (2011). A Bayes factor meta-analysis of Bem’s ESP claim. Psychonomic Bulletin & Review, 18(4), 682-689.
- Sellke, Thomas, M. J. Bayarri, and James O. Berger. "Calibration of p values for testing precise null hypotheses." The American Statistician 55.1 (2001): 62-71.
- Held, L. (2010). A nomogram for P values. BMC medical research methodology, 10(1), 21.
The Curse of the Multiverse & Matters of Size
- Costa, R. M., Miller, G. F., & Brody, S. (2012). Women who prefer longer penises are more likely to have vaginal orgasms (but not clitoral orgasms): Implications for an evolutionary theory of vaginal orgasm. The journal of sexual medicine, 9(12), 3079-3088.
- Simmons, Joseph P., Leif D. Nelson, and Uri Simonsohn. "False-positive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant." Psychological science 22.11 (2011): 1359-1366.
- Gelman, A., & Loken, E. (2014). The statistical crisis in science. American Scientist, 1-5.
- Berger, James O., and Donald A. Berry. "Statistical analysis and the illusion of objectivity." American Scientist (1988): 159-165.
- Cumming, Geoff. Understanding the new statistics: Effect sizes, confidence intervals, and meta-analysis. Routledge, 2012.
- Goodman, S. N. (1999). Toward evidence-based medical statistics. 1: The P value fallacy. Annals of internal medicine, 130(12), 995-1004.
- Goodman, S. N. (1999). Toward evidence-based medical statistics. 2: The Bayes factor. Annals of internal medicine, 130(12), 1005-1013.
- Goodman, S. (2008, July). A Dirty Dozen: Twelve P-Value Misconceptions. In Seminars in hematology (Vol. 45, No. 3, pp. 135-140). WB Saunders.
- Ioannidis, J. P. (2005). Why most published research findings are false. PLoS medicine, 2(8), e124.
- Ioannidis JPA (2014) How to Make More Published Research True. PLoS Med 11(10): e1001747.
- Mayo, D. (2004). "An Error-Statistical Philosophy of Evidence," in M. Taper and S. Lele (eds.) The Nature of Scientific Evidence: Statistical, Philosophical and Empirical Considerations. Chicago: University of Chicago Press: 79-118.
- Royall, R. M. (1986). The effect of sample size on the meaning of significance tests. The American Statistician, 40(4), 313-315.