Effect size, statistical power, Bayes factor, and meta-analysis in the context of the reproducibility crisis in science. The case of the difference of means with independent samples (part one)

Luis D’Angelo

doi:10.56503/CIMBAGE/Vol.1/Nro.23(2021)p.47-82

Luis D’Angelo, Facultad de Farmacia y Bioquímica, Universidad de Buenos Aires


Keywords: EFFECT SIZE, STATISTICAL POWER, BAYES FACTOR, META-ANALYSIS

Abstract

This paper challenges what, since the last century, the scientific world has treated as beyond question: the hypothesis test.

The aim is to set out a series of problems with hypothesis testing, and solutions to them, specifically for the difference of means in independent samples. To that end we focus on four central topics that offer practical alternatives which we believe will be useful to researchers who need to account for the soundness of their findings. These are: effect size (in particular); the power of the test; the Bayes factor as a measure of belief in the null and alternative hypotheses; and meta-analysis.

Covering all of these topics in a single article may well seem like too much. But that is precisely the challenge of this paper, because each of these techniques was developed in response to one specific problem, whereas researchers run into all of these problems at once and must respond with an arsenal of techniques they often do not have. We therefore hope that this paper helps researchers find the tools needed to solve those problems in light of the credibility crisis now prevailing in the sciences.
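To make the first of the four topics concrete, the sketch below computes one common effect-size measure for a difference of means with independent samples: Cohen's d with a pooled standard deviation. This is an illustration only. The abstract does not say which estimator the paper uses, and the function name `cohens_d` and the two data lists are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference of two independent samples (pooled SD)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical data, chosen so the arithmetic is easy to follow by hand:
# means 4 and 3, both sample variances 4, so pooled SD = 2 and d = 0.5.
treatment = [2, 4, 6]
control = [1, 3, 5]
print(cohens_d(treatment, control))
```

Unlike a p-value, d expresses the difference between the groups in standard-deviation units, so it does not grow simply because the samples are larger.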

