Abstract
In a world awash with data and computers, it is tempting to automate the process of scientific discovery by performing comparisons between many pairs of variables in the hope of finding correlations. When frequentist hypothesis tests are performed at a fixed confidence level, increasing the number of tests increases the probability of observing a "statistically significant" result, even when the null hypothesis is actually true. Carefully-designed tests, such as Tukey's honestly significant difference (HSD) test (Tukey, 1949), protect against this practice of "p-hacking" by producing p-values and confidence intervals that account for the number of comparisons performed. Several such tests rely on the studentized range distribution (Lund & Lund, 1983), which models the range (i.e., the difference between maximum and minimum values) of the means of samples from a normally distributed population. Although there are already implementations of these tests available in the scientific Python ecosystem, all of them rely on approximations of the studentized range distribution, which are not accurate outside the range of inputs for which they are designed. Here we present a very accurate and sufficiently fast implementation of the studentized range distribution and a function for performing Tukey's HSD test. Both of these features are available in SciPy 1.8.0.
Performance
The most computationally-challenging part of implementing Tukey's HSD test is the evaluation of the cumulative distribution function of the studentized range distribution, which is given by
$$F(q; k, \nu) = \frac{k\,\nu^{\nu/2}}{\Gamma(\nu/2)\,2^{\nu/2-1}} \int_{0}^{\infty} \int_{-\infty}^{\infty} s^{\nu-1} e^{-\nu s^{2}/2}\, \phi(z)\, \left[\Phi(sq + z) - \Phi(z)\right]^{k-1} \, dz \, ds$$
where q is the studentized range, k is the number of groups, ν is the number of degrees of freedom used to determine the pooled sample variance, and φ(z) and Φ(z) represent the normal probability density function and normal cumulative distribution function, respectively.
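To make the formula concrete, the following is a minimal sketch that evaluates the double integral directly with general-purpose quadrature and compares the result with scipy.stats.studentized_range.cdf. The helper name studentized_range_cdf_quad, the test point (q = 3.5, k = 3, ν = 10), and the choice to combine the constant with the s-dependent factor in log space are assumptions of this illustration; it is not the algorithm used inside SciPy.

```python
import math

from scipy import integrate, special, stats


def studentized_range_cdf_quad(q, k, nu):
    """Evaluate F(q; k, nu) by brute-force double quadrature (illustrative only)."""
    # log of the constant k * nu^(nu/2) / (Gamma(nu/2) * 2^(nu/2 - 1))
    log_const = (math.log(k) + 0.5 * nu * math.log(nu)
                 - special.gammaln(0.5 * nu) - (0.5 * nu - 1) * math.log(2))

    def integrand(z, s):
        # dblquad passes the inner variable (z) first and the outer one (s) second.
        if s <= 0:
            return 0.0  # the bracket below vanishes at s = 0
        # constant * s^(nu-1) * exp(-nu s^2 / 2), combined in log space
        s_term = math.exp(log_const + (nu - 1) * math.log(s) - 0.5 * nu * s * s)
        # standard normal density phi(z)
        phi = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        # Phi(sq + z) - Phi(z), using the standard normal CDF
        bracket = special.ndtr(s * q + z) - special.ndtr(z)
        return s_term * phi * bracket ** (k - 1)

    # Outer integral over s in (0, inf), inner integral over z in (-inf, inf)
    value, _ = integrate.dblquad(integrand, 0, math.inf, -math.inf, math.inf)
    return value


q, k, nu = 3.5, 3, 10  # hypothetical inputs chosen for illustration
print(studentized_range_cdf_quad(q, k, nu))
print(stats.studentized_range.cdf(q, k, nu))
```

For well-behaved inputs like these, the two printed values should agree closely. The brute-force version is slow and is meant only to show how each term of the formula enters the computation.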
There is no closed-form expression for this integral, and numerical integration requires care, as naive evaluation of the integrand results in overflow even for modest values of the parameters. Consequently, other packages in the open-source scientific Python ecosystem, such as statsmodels (Seabold & Perktold, 2010) and Pingouin (Vallat, 2018), have relied on interpolation between tabulated values. To satisfy the need for a more accurate implementation of this integral, we contributed scipy.stats.studentized_range (Chmiel et al.,
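One way to see the overflow problem is to look at the constant in front of the integral, k ν^(ν/2) / (Γ(ν/2) 2^(ν/2−1)). The sketch below compares a literal evaluation with a log-space evaluation; the function names and the value ν = 400 are assumptions chosen for illustration, and the log-space rewrite is a common remedy rather than a description of SciPy's internals.

```python
import math

from scipy import special


def prefactor_naive(k, nu):
    # Literal transcription: nu**(nu/2) and gamma(nu/2) can each exceed
    # the largest representable double even when their ratio does not.
    return k * nu ** (nu / 2) / (math.gamma(nu / 2) * 2 ** (nu / 2 - 1))


def log_prefactor(k, nu):
    # Same quantity on the log scale; gammaln returns log(Gamma(x)) directly.
    return (math.log(k) + 0.5 * nu * math.log(nu)
            - special.gammaln(0.5 * nu) - (0.5 * nu - 1) * math.log(2))


k, nu = 3, 400  # hypothetical parameter values for illustration

try:
    print(prefactor_naive(k, nu))
except OverflowError as exc:
    print("naive evaluation failed:", exc)

# The constant itself is finite; only the intermediate terms overflowed.
print(math.exp(log_prefactor(k, nu)))
```

For small ν the two routes agree, so the log-space form changes only how the intermediate quantities are represented, not the value being computed.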
Cite
Chmiel, D., Wallan, S., & Haberland, M. (2022). tukey_hsd: An Accurate Implementation of the Tukey Honestly Significant Difference Test in Python. Journal of Open Source Software, 7(75), 4383. https://doi.org/10.21105/joss.04383