Distributions associated with simultaneous multiple hypothesis testing
Journal of Statistical Distributions and Applications volume 7, Article number: 9 (2020)
Abstract
We develop the distribution for the number of hypotheses found to be statistically significant using the rule from Simes (Biometrika 73: 751–754, 1986) for controlling the familywise error rate (FWER). We find the distribution of the number of statistically significant p-values under the null hypothesis and show this follows a normal distribution under the alternative. We propose a parametric distribution Ψ_{I}(·) to model the marginal distribution of p-values sampled from a mixture of null uniform and nonuniform distributions under different alternative hypotheses. The Ψ_{I} distribution is useful when there are many different alternative hypotheses and these are not individually well understood. We fit Ψ_{I} to data from three cancer studies and use it to illustrate the distribution of the number of notable hypotheses observed in these examples. We model dependence in sampled p-values using a latent variable. These methods can be combined to illustrate a power analysis in planning a larger study on the basis of a smaller pilot experiment.
Introduction
Much work in informatics is concerned with identifying and classifying statistically significant biological markers. In this work we develop methods for describing the distribution of the numbers of such events. Informatics methods often summarize experiments resulting in a large number of p-values, usually through multiple comparisons of gene expression data. Typically, the number of tests, m, is much greater than the number of subjects, N. There are several important rules for identifying statistically significant p-values while maintaining the significance level below a prespecified level α (0<α<1). Benjamini (2010) provides a review of recent advances.
A commonly cited rule to control the FWER is the Bonferroni correction. Given a sample of ordered p-values p_{(1)}≤p_{(2)}≤⋯≤p_{(m)}, the Bonferroni rule finds the smallest value of B=0,1,…,m−1 for which
\[ p_{(B+1)} > \alpha/m . \tag{1} \]
The Simes (1986) rule chooses the smallest value S=0,1,…,m such that
\[ p_{(S+1)} > (S+1)\,\alpha/m \tag{2} \]
to control the FWER ≤α.
A similar rule developed by Benjamini and Hochberg (1995) to maintain the false discovery rate (FDR) ≤α finds the largest value of BH such that
\[ p_{(BH)} \le BH\,\alpha/m . \]
This reference shows procedures controlling the FWER also control the FDR, but procedures controlling FDR only control FWER in a weaker sense.
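The three counting rules can be sketched in a few lines of Python. This is a minimal illustration only, assuming the thresholds p_{(B+1)}>α/m for Bonferroni, p_{(S+1)}>(S+1)α/m for Simes, and p_{(k)}≤kα/m for Benjamini and Hochberg. The function names and the example p-values are hypothetical and are not part of the original analysis.

```python
def bonferroni_count(pvalues, alpha=0.05):
    """Smallest B with p_(B+1) > alpha/m: the number of p-values <= alpha/m."""
    m = len(pvalues)
    ordered = sorted(pvalues)
    b = 0
    while b < m and ordered[b] <= alpha / m:
        b += 1
    return b


def simes_count(pvalues, alpha=0.05):
    """Smallest S with p_(S+1) > (S+1)*alpha/m, scanning from the smallest p-value."""
    m = len(pvalues)
    ordered = sorted(pvalues)
    s = 0
    while s < m and ordered[s] <= (s + 1) * alpha / m:
        s += 1
    return s


def bh_count(pvalues, alpha=0.05):
    """Largest k with p_(k) <= k*alpha/m, or 0 when no such k exists."""
    m = len(pvalues)
    ordered = sorted(pvalues)
    largest = 0
    for k in range(1, m + 1):
        if ordered[k - 1] <= k * alpha / m:
            largest = k
    return largest
```

For the five hypothetical p-values 0.001, 0.025, 0.028, 0.5, 0.9 with α=.05, the scan for S stops at the second ordered p-value (0.025>0.02), so B=1 and S=1, whereas the Benjamini and Hochberg rule looks for the largest qualifying index and returns 3.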
We will concentrate on the distribution of B and S in this report. We describe the probability distribution of B and S under null hypotheses where each p-value has an independent marginal uniform distribution, as well as an approximating distribution under the alternative hypothesis with density function ψ_{I}(p) expressible as a polynomial in log p of order I.
There has been limited research on parametric distributions for the pvalues generated from data under a mixture of the null and different distributions under multiple alternative hypotheses. The mixed pvalues are mainly modeled using nonparametric methods (Genovese and Wasserman 2004; Broberg 2005; Langaas et al. 2005; Tang et al. 2007) or alternatively, the pvalues are converted into normal quantiles and modeled thereafter (Efron et al. 2001; Efron 2004; Jin and Cai 2007). Another common approach is to approximate the distribution of sampled pvalues using a mixture of beta distributions (Pounds and Morris 2003; Broberg 2005; Tang et al. 2007). Other parametric models have been described by Kozoil and Tuckwell (1999); Genovese and Wasserman (2004); Yu and Zelterman (2017, 2019).
Of interest is the fraction π_{0} of pvalues sampled from the uniform distribution under the null hypothesis. Langaas et al. (2005) and Tang et al. (2007) suggest the estimated density of pvalues at p=1 be used to estimate the fraction π_{0}. Estimating π_{0} is of practical importance: The BH statistic controls the FDR no more than απ_{0}. Consequently, Benjamini and Hochberg (2000) recommend we perform tests with significance level α/π_{0} and still maintain the FDR below α. We found \(\,\psi _{I}(1\mid \hat {\boldsymbol {\theta }})\,\) to be a useful estimator of π_{0} in the examples of Section 5, where \(\,\hat {\boldsymbol {\theta }}\,\) denotes the maximum likelihood estimate.
The pvalues are usually not independent. In microarray studies, for example, a small number of clusters of pvalues in the same biological pathway may have high mutual correlations. Methods for modeling such dependencies are developed by Sun and Cai (2009), Friguet et al. (2009), and Wu (2008) for examples.
In Section 2, we describe the probability distribution of S in (2) when the p_{i} are independently sampled from an unspecified distribution Ψ. In Section 3 we examine pvalues sampled from a uniform distribution under the null hypothesis. Section 4 provides elementary properties of the proposed distribution Ψ_{I}. The parameters θ of Ψ_{I} depend on the specific application and are estimated for two examples in Section 5. In Section 6, we model the distribution of dependent pvalues using a latent variable. We combine these methods in Section 7 to illustrate approximate power in planning a proposed study. We provide mathematical details of Sections 2 and 3 in Appendix A. Appendix B examines the behavior of B and S under a close sequence of alternative hypotheses. Appendix C examines the parameter space for the Ψ_{I} distribution.
Simultaneous multiple testing
Let p_{1},p_{2},…,p_{m} denote m randomly sampled p-values with ordered values p_{(1)}≤p_{(2)}≤⋯≤p_{(m)}. We will initially assume all p-values are independent and have the same distribution function denoted by Ψ(·) with corresponding density function ψ(·). In Section 6, we revisit the assumption of independence. We propose a nonuniform approximation for Ψ in Section 4.
If we follow the Bonferroni rule (1) then the number B of statistically significant p-values at FWER ≤α follows a binomial distribution with index m and probability parameter equal to Ψ(α/m).
The distribution of S can be obtained by writing
where U_{0}=1 and
for k=1,2,…,m.
In Appendix A we prove
The pvalues are typically sampled from a mixture of a uniform distribution under the null hypothesis and several distributions under different alternative hypotheses. Similarly, the distribution of S will be a mixture of a mass near zero and a normal distribution, described next. This mixture distribution is illustrated in Figs. 2 and 4 for two examples of Section 5.
Specifically, for values of S near zero we have
To describe the behavior of S away from zero, begin with the quantile function Ψ^{−1}(i/(m+1)) giving the approximate expected value of the order statistic p_{(i)}. If m is large and i/m is not too close to either zero or one, then p_{(i)} will be approximately normally distributed. In (2), S is the smallest value of k for which p_{(k+1)}>(k+1)α/m. This should occur for values of S with mean μ solving
or equivalently,
If we write S=mp_{(μ)}/α for integer μ and use the large sample approximation to an order statistic, then the approximate variance of S is
If the null (uniform) and alternative hypotheses are not very different from each other, then the solution to μ in (6) will be close to zero and Appendix B describes a different approximation to the behavior of B and S.
Behavior under the null hypothesis
Let us next examine the special case where all pvalues are independently sampled under the null hypothesis. When the distribution of the p_{i} are independent and marginally uniformly distributed then (3) and (5) are expressible as
and in general,
Details of the derivation of (7) appear in Appendix A.
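Useful results can be obtained if we also assume the number of hypotheses m is large. The limiting distribution of S in (7) is
\[ \lim_{m\to\infty} \Pr[\,S=k\,] = \frac{(k+1)^{k-1}\,\alpha^{k}\, e^{-(k+1)\alpha}}{k!} \tag{8} \]
for k=0,1,….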
The probabilities in (8) sum to unity using equation (130) in Jolley (1961, p. 24). The mean of this distribution is α/(1−α) and the variance is α/(1−α)^{3}. The distribution of S+1 in (8) is known as the Borel distribution, with applications in queueing theory (Tanner, 1961). Similarly, for large values of m, the number of identified p-values at FWER ≤α for the Bonferroni criterion (1) will follow a Poisson distribution with mean α when sampling p-values under the null hypothesis.
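The limiting null distribution can be checked numerically. The sketch below assumes the Borel form implied by the text, Pr[S=k]=(k+1)^{k−1}α^{k}e^{−(k+1)α}/k!, and also includes a finite-m expression, C(m,k)(k+1)^{k−1}(α/m)^{k}{1−(k+1)α/m}^{m−k}, which is our assumed reading of (7): it is consistent with the stated limit but is not quoted from the original display. Function names are hypothetical.

```python
import math


def simes_null_limit_pmf(k, alpha):
    """Limiting Pr[S = k] under the null: S + 1 is Borel with parameter alpha."""
    n = k + 1
    # Borel pmf at n is exp(-alpha*n) * (alpha*n)**(n-1) / n!, computed on the log scale
    log_p = -alpha * n + (n - 1) * math.log(alpha * n) - math.lgamma(n + 1)
    return math.exp(log_p)


def simes_null_finite_pmf(k, m, alpha):
    """Assumed finite-m form of Pr[S = k] for independent uniform p-values."""
    c = alpha / m
    return (math.comb(m, k) * (k + 1) ** (k - 1) * c ** k
            * (1.0 - (k + 1) * c) ** (m - k))


def limit_summary(alpha, kmax=400):
    """Total mass, mean and variance of the limiting distribution, truncated at kmax."""
    probs = [simes_null_limit_pmf(k, alpha) for k in range(kmax + 1)]
    total = sum(probs)
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum(k * k * p for k, p in enumerate(probs)) - mean ** 2
    return total, mean, var
```

Calling `limit_summary(0.05)` should reproduce, up to truncation error, a total mass of one, the mean α/(1−α) and the variance α/(1−α)^{3} quoted above.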
Distributions for p-values
We next propose a marginal distribution Ψ for p-values, independent of the choice of test statistic. We continue to assume the p-values are mutually independent and have the same marginal distributions. We must have Ψ concave (Genovese and Wasserman 2004; Sun and Cai 2009), otherwise the underlying test will have power smaller than its significance level for some α. Similarly, the corresponding density function ψ must be monotone decreasing. We next propose a flexible distribution for modeling the distribution of p-values under alternative hypotheses.
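Consider a distribution with a density function expressible as a polynomial in log p up to degree I=0,1,2,…. The uniform (0–1) distribution is obtained for I=0. The marginal density function we propose for p-values is
\[ \psi_{I}(p\mid\boldsymbol{\theta}) = \sum_{i=0}^{I} \theta_{i}\,(-\log p)^{i} \tag{9} \]
for real-valued parameters θ={θ_{1},…,θ_{I}} with I≥1 where
\[ \theta_{0} = 1 - \sum_{i=1}^{I} i!\,\theta_{i} \tag{10} \]
so the densities ψ_{I}(p) integrate to one over 0<p≤1. Consequently, θ_{0} is not an independent parameter.
The corresponding cumulative distribution function is
\[ \Psi_{I}(p\mid\boldsymbol{\beta}) = p \sum_{j=0}^{I} \beta_{j}\,(-\log p)^{j} \tag{11} \]
where β_{0}=1.
The relationship between these parameters is linear:
\[ \beta_{j} = \sum_{i=j}^{I} \frac{i!}{j!}\,\theta_{i} \]
for j=1,2,…,I and θ_{i}=β_{i}−(i+1)β_{i+1} for i=1,2,…,I−1. Throughout, we will interchangeably refer to either the θ or β parameterizations for simplicity.
The moments of distribution ψ_{I}(p∣θ) are
\[ \mathrm{E}\,(p^{j}\mid\boldsymbol{\theta}) = \sum_{i=0}^{I} \frac{i!\;\theta_{i}}{(j+1)^{i+1}} \tag{12} \]
for j=1,2,….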
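The density, distribution function, parameter conversions and moments can be sketched together. The code assumes the density is Σθ_{i}(−log p)^{i} with θ_{0}=1−Σ i! θ_{i}, the distribution function is pΣβ_{j}(−log p)^{j} with β_{0}=1, and E(p^{j})=Σ i! θ_{i}/(j+1)^{i+1}. These forms are consistent with the relations stated in this section and with g(x) in Appendix C, but the function names and the parameter values used for checking are hypothetical.

```python
import math


def theta0(theta):
    """theta_0 implied by the integrate-to-one constraint; theta = [theta_1..theta_I]."""
    return 1.0 - sum(math.factorial(i) * t for i, t in enumerate(theta, start=1))


def density(p, theta):
    """psi_I(p | theta) as a polynomial in x = -log p."""
    x = -math.log(p)
    coeffs = [theta0(theta)] + list(theta)
    return sum(c * x ** i for i, c in enumerate(coeffs))


def theta_to_beta(theta):
    """Return [beta_0, ..., beta_I] with beta_0 = 1 and beta_j = sum_{i>=j} (i!/j!) theta_i."""
    big_i = len(theta)
    beta = [1.0]
    for j in range(1, big_i + 1):
        beta.append(sum(math.factorial(i) / math.factorial(j) * theta[i - 1]
                        for i in range(j, big_i + 1)))
    return beta


def beta_to_theta(beta):
    """Invert theta_to_beta: theta_i = beta_i - (i+1) beta_{i+1}, and theta_I = beta_I."""
    big_i = len(beta) - 1
    return [beta[i] - (i + 1) * beta[i + 1] if i < big_i else beta[big_i]
            for i in range(1, big_i + 1)]


def cdf(p, theta):
    """Psi_I(p | theta), evaluated through the beta parameterization."""
    x = -math.log(p)
    return p * sum(b * x ** j for j, b in enumerate(theta_to_beta(theta)))


def moment(j, theta):
    """E(p^j) using the integral of p^j (-log p)^i over (0, 1], which is i!/(j+1)^(i+1)."""
    coeffs = [theta0(theta)] + list(theta)
    return sum(c * math.factorial(i) / (j + 1) ** (i + 1)
               for i, c in enumerate(coeffs))
```

A quick consistency check is that a central difference of `cdf` agrees with `density`, and that converting θ to β and back returns the original parameters.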
We must have θ_{I}>0 in order to have ψ_{I}(p)>0 for values of p close to zero. Values of θ_{0} are restricted in (10) in order for ψ_{I}(p) to integrate to unity. Since ψ_{I}(1∣θ)=θ_{0} we must also require θ_{0}≥0. Requiring ψ_{I}(p) to be decreasing at p=1 gives θ_{1}≥0.
These restrictions alone on θ_{0}, θ_{1}, and θ_{I} are not sufficient to guarantee ψ_{I}(p∣θ) is monotone decreasing or positive valued for all values of 0≤p≤1. The necessary conditions for achieving these properties are difficult to describe in general, but sufficient conditions are all θ_{i}≥0. Specific cases are examined in Appendix C for values of I up to I=4. Models for larger values of I could be fitted by maximizing the penalized likelihood, such that ψ_{I}(p∣θ) is positive valued and monotone decreasing at the observed, sorted pvalues.
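The constraint used in a penalized fit can be illustrated by checking positivity and monotonicity only at the observed p-values. The sketch below assumes the density is g(x)=Σθ_{i}x^{i} with x=−log p, so that the density is positive when g(x)>0 and decreasing in p when g′(x)≥0. The function names and parameter values are hypothetical.

```python
import math


def g_and_slope(x, theta):
    """Return g(x) and g'(x) for theta = [theta_1, ..., theta_I], theta_0 implied."""
    theta_0 = 1.0 - sum(math.factorial(i) * t for i, t in enumerate(theta, start=1))
    g = theta_0 + sum(t * x ** i for i, t in enumerate(theta, start=1))
    slope = sum(i * t * x ** (i - 1) for i, t in enumerate(theta, start=1))
    return g, slope


def is_valid_at(pvalues, theta):
    """True when the density is positive and non-increasing at every given p-value."""
    for p in pvalues:
        g, slope = g_and_slope(-math.log(p), theta)
        if g <= 0.0 or slope < 0.0:
            return False
    return True


def all_nonnegative(theta):
    """The sufficient condition from the text: every theta_i >= 0, including theta_0."""
    theta_0 = 1.0 - sum(math.factorial(i) * t for i, t in enumerate(theta, start=1))
    return theta_0 >= 0.0 and all(t >= 0.0 for t in theta)
```

With the hypothetical value θ=(0.5, −0.2, 0.02), the check passes at p=0.9 and p=0.5 but fails at p=0.05, which shows why a check restricted to the observed p-values is weaker than validity on the whole unit interval.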
In practice, the choice of I is found by fitting a sequence of models. Successive values of I represent nested models so twice the differences of the respective loglikelihoods will behave as χ^{2} (1 df) when the underlying additional parameter value is zero. In practice, we found I=3 or 4 were adequate for the three examples in this work.
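The sequential choice of I can be sketched as a series of likelihood ratio tests. This illustration assumes the χ^{2} (1 df) reference distribution described above, whose upper tail at x equals erfc(√(x/2)). The stopping rule (stop at the first non-significant step), the function names, and the log-likelihood values in the usage note are hypothetical and are not taken from Tables 1, 2 or 4.

```python
import math


def lrt_pvalue(loglik_small, loglik_large):
    """Upper-tail chi-squared (1 df) probability of twice the log-likelihood gain."""
    stat = max(2.0 * (loglik_large - loglik_small), 0.0)
    # For one degree of freedom, Pr[chi2 > x] = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))


def choose_order(logliks, level=0.05):
    """logliks[i] is the maximized log-likelihood for I = i + 1.

    Increase I while each added parameter gives a significant improvement.
    """
    chosen = 1
    for i in range(1, len(logliks)):
        if lrt_pvalue(logliks[i - 1], logliks[i]) < level:
            chosen = i + 1
        else:
            break
    return chosen
```

For example, with made-up maximized log-likelihoods 100.0, 140.0, 143.0 and 143.2 for I=1,…,4, the first two steps are significant at the .05 level and the third is not, so `choose_order` returns 3.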
The ψ_{I} density function is especially suited for modeling the marginal distribution of a mixture of a uniform and a variety of nonuniform distributions for p-values. If each p_{i} (i=1,…,m) is sampled from a different distribution with density function ψ_{I}(p∣θ_{i}), then the marginal density of all p_{i} satisfies
\[ \frac{1}{m}\sum_{i=1}^{m} \psi_{I}(p\mid\boldsymbol{\theta}_{i}) = \psi_{I}(p\mid\overline{\boldsymbol{\theta}}) \tag{13} \]
where \(\,\overline {\boldsymbol {\theta }}\,\) is the arithmetic average of all θ_{i}. A similar result holds if the values of I vary across distributions of p_{i}.
This mixing of distributions includes the uniform as a special case. Specifically, suppose 100π_{0} percent of the p-values are sampled from a uniform (0, 1) distribution (0≤π_{0}≤1) and the remaining 100(1−π_{0}) percent are sampled from ψ_{I}(p∣θ). Then the marginal distribution has density function
\[ \pi_{0} + (1-\pi_{0})\,\psi_{I}(p\mid\boldsymbol{\theta}) = \psi_{I}(p\mid(1-\pi_{0})\boldsymbol{\theta}) \tag{14} \]
demonstrating π_{0} is not identifiable in this model.
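The closure of the family under mixing is easy to verify numerically, because the density is linear in θ. The sketch below assumes the density Σθ_{i}(−log p)^{i} with θ_{0}=1−Σ i! θ_{i}; the function names and parameter values are hypothetical.

```python
import math


def density(p, theta):
    """psi_I(p | theta) with theta = [theta_1, ..., theta_I]; theta_0 is implied."""
    theta_0 = 1.0 - sum(math.factorial(i) * t for i, t in enumerate(theta, start=1))
    x = -math.log(p)
    return theta_0 + sum(t * x ** i for i, t in enumerate(theta, start=1))


def average_theta(thetas):
    """Componentwise arithmetic average of several parameter vectors of equal length."""
    m = len(thetas)
    return [sum(column) / m for column in zip(*thetas)]


def mix_with_uniform(theta, pi0):
    """Parameters of the mixture pi0 * uniform + (1 - pi0) * psi_I(. | theta)."""
    return [(1.0 - pi0) * t for t in theta]
```

Because the mixture with a uniform component is again a member of the same family, a fitted θ alone cannot separate π_{0} from the alternative component. Only the density at p=1, which equals π_{0}+(1−π_{0})θ_{0}, is determined.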
Equations (13) and (14) illustrate the utility of ψ_{I} in modeling pvalues sampled from a mixture of the null hypothesis and different distributions under alternative hypotheses, yet retaining the same parametric distribution form. Donoho and Jin (2004) also describe the value of such a mixture of heterogeneous alternative hypotheses in multiple testing settings. Following Langaas et al. (2005); Tang et al. (2007) we use \(\,\psi _{I}(p=1\mid \hat {\boldsymbol {\theta }}) = \hat \theta _{0},\,\) the estimated density at p=1, to estimate π_{0}, the proportion of pvalues sampled from the null hypothesis.
Two examples
For each of the examples in this work, we fitted the density function ψ_{I} described in Section 4 and then used this model to examine the distribution of S given in (3). The fitted parameter values \(\,\hat {\boldsymbol {\theta }}\,\) for these examples are given for successive values of I. We maximized the likelihoods using the standard optimization routine nlm in R. This routine also provides estimates of the Hessian used to estimate standard errors of parameter estimates.
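The fitting step can be illustrated for the simplest case I=1. The sketch below is not the nlm fit used for the tables: it is a Python stand-in that assumes the density ψ_{1}(p)=(1−θ_{1})+θ_{1}(−log p), maximizes the log-likelihood by golden-section search on 0≤θ_{1}≤1 (the log-likelihood is concave in θ_{1}), and obtains a standard error from the observed information. Data are simulated using the fact that the product of two independent uniforms has density −log p. All function names, the seed, and the simulated sample are hypothetical.

```python
import math
import random


def sample_psi1(n, theta1, rng):
    """Draw n p-values from (1 - theta1) * uniform + theta1 * (-log p)."""
    out = []
    for _ in range(n):
        u = 1.0 - rng.random()               # uniform on (0, 1], avoids log(0)
        if rng.random() < theta1:
            u *= 1.0 - rng.random()          # product of two uniforms
        out.append(u)
    return out


def golden_max(f, lo, hi, tol=1e-8):
    """Maximize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2.0


def loglik_psi1(theta1, xs):
    """Log-likelihood in terms of xs = -log p."""
    return sum(math.log(1.0 - theta1 + theta1 * x) for x in xs)


def fit_psi1(pvalues):
    """Return the maximum likelihood estimate of theta_1 and its standard error."""
    xs = [-math.log(p) for p in pvalues]
    est = golden_max(lambda t: loglik_psi1(t, xs), 0.0, 1.0)
    # Observed information: minus the second derivative of the log-likelihood
    info = sum(((x - 1.0) / (1.0 - est + est * x)) ** 2 for x in xs)
    return est, 1.0 / math.sqrt(info)
```

For I>1 a multivariate optimizer would be needed, together with the parameter constraints discussed in Section 4; this one-parameter version is only meant to show the structure of the likelihood.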
The evaluation of U_{k} in (5) involves adding and subtracting many nearly equal values resulting in numerical instability. We computed U_{k} using multiple precision arithmetic with the Rmpfr package in R (Maechler 2019). A third example will be introduced in Section 7, to illustrate estimation of power for multiple hypothesis testing problems.
5.1 Breast cancer
This microarray dataset was originally described by Hedenfalk et al. (2001) and also analyzed by Storey and Tibshirani (2003). These data summarize marker expressions of m=3226 genes in seven women with the BRCA1 mutation and in eight women with the BRCA2 mutation. The objective was to determine differentially expressed genes between these two groups. Earlier analyses used a two-sample t-test to compare the two groups for each gene, giving rise to m p-values. Efron (2004) and Jin and Cai (2007) model the z-scores corresponding to the p-values.
Fitted parameters are given in Table 1. The fitted model for I=2 represents a big improvement over the model with I=1 parameter. The model with I=3 parameters has a modest improvement over the model with I=2 and I=4 demonstrates negligible change in the likelihood over I=3. Fitted densities ψ_{I} for I=2 and 3 are plotted in Fig. 1 along with the observed data. There is only a small difference between the fitted models in this figure, and both exhibit a good fit to the data. Our estimate of π_{0} given by \(\,\hat \theta _{0}\,\) is .65 for I=2 and .62 for I=3. An estimate of .67 for π_{0} is described in Storey and Tibshirani (2003).
There are S=29 statistically significant markers at FWER =.05 using the adjustment for multiplicity given in (2). The fitted distribution of S is displayed in Fig. 2 using \(\,\psi _{3}(\cdot \mid \hat {\boldsymbol {\theta }})\,\). The mean of this fitted distribution is 22.75. The distribution in Fig. 2 appears as a mixture of a distribution concentrated near k=0 and a left-truncated normal distribution with a local mode at 24. The observed value S=29 is indicated in this figure.
The point mass at S=0 is about 0.1 and values of S≤3 account for about 20% of the distribution with I=3 and fitted \(\,\hat {\boldsymbol {\theta }}\). This distribution is approximately a mixture of the distribution near zero and 80% of a normal with mean 26.1 and standard deviation 17.9 using (6).
5.2 The cancer genome atlas: lung cancer
This dataset contains the summary of an extensive database collected on tumors from N=178 patients with squamous cell lung carcinoma. A full description of these data and the analyses performed are summarized in the Cancer Genome Atlas (2012). The data values were downloaded from the website https://tcgadata.nci.nih.gov/. We choose to examine pvalues representing summaries of statistical comparisons of smokers and nonsmokers across the genetic markers. We identified m=20,068 observed pvalues after omitting about 2% missing values.
Using the Simes procedure, S=173 p-values are identified with FWER =.05. The fitted parameter values \(\,\hat {\boldsymbol {\theta }}\,\) are given in Table 2. Distributions up to I=4 showed statistically significant improvement in the log-likelihood but larger values of I failed to change it. The fitted density function \(\,\psi _{4}(\cdot \mid \hat {\boldsymbol {\theta }})\,\) given in Fig. 3 demonstrates good agreement with the observed data. The estimate \(\,\hat \theta _{0}\,\) of π_{0} is about .70 for I=4.
The fitted distribution of S given in (3) is plotted in Fig. 4. There is close agreement between the observed value (173), the mean (176.35) of the fitted distribution, and the local mode (177). As with Fig. 2, the fitted distribution of S appears as a mixture of a distribution concentrated near zero and a normal distribution. The local mode at zero gives a fitted Pr[ S≤2 ] of .012. The density mass away from zero is approximately that of a normal distribution with mean 178.8 and standard deviation 39.1 using (6).
Sampling dependent p-values
In this section we describe a method for sampling dependent p-values by conditioning on an unobservable, latent variable. Greater dependence among the p-values results in greater means and variances for the distribution of S. This behavior is also described by Owen (2005). Greater dependence also contributes to a larger point mass at zero. We will use the fitted breast cancer example of Section 5.1 to illustrate these methods.
Let θ and ε denote I−tuples such that both θ+ε and θ−ε are valid parameters for the density ψ_{I} described in Section 4. Let Y denote a Bernoulli random variable with parameter equal to 1/2. Conditional on the (unobservable) value of Y, assume all pvalues are sampled from either ψ_{I}(·∣θ+ε) or ψ_{I}(·∣θ−ε). The marginal distribution of these exchangeable pvalues is then ψ_{I}(·∣θ) using (13).
To demonstrate the correlation among the pvalues induced by this latent model, let Q_{1}, Q_{2} denote a random sample from ψ_{I}, both with parameters either θ+ε or θ−ε, conditional on Y. The Q_{i} are conditionally independent given Y and have marginal covariance
where E(p∣θ) is the expected value of ψ_{I}(p∣θ) calculated using (12). This covariance is never negative.
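The induced covariance can be computed directly from the first two moments. Because Q_{1} and Q_{2} are conditionally independent given Y, and Y takes each value with probability 1/2, the marginal covariance reduces to ¼{E(p∣θ+ε)−E(p∣θ−ε)}^{2}, which is the form assumed in this sketch. The moment formula E(p^{j})=Σ i! θ_{i}/(j+1)^{i+1} is also assumed; function names and parameter values are hypothetical.

```python
import math


def moment(j, theta):
    """E(p^j | theta) for theta = [theta_1, ..., theta_I], with theta_0 implied."""
    theta_0 = 1.0 - sum(math.factorial(i) * t for i, t in enumerate(theta, start=1))
    coeffs = [theta_0] + list(theta)
    return sum(c * math.factorial(i) / (j + 1) ** (i + 1)
               for i, c in enumerate(coeffs))


def latent_covariance(theta, eps):
    """Marginal covariance of two p-values sharing the latent Bernoulli(1/2) variable."""
    plus = [t + e for t, e in zip(theta, eps)]
    minus = [t - e for t, e in zip(theta, eps)]
    return 0.25 * (moment(1, plus) - moment(1, minus)) ** 2


def latent_correlation(theta, eps):
    """Correlation, using the marginal variance of psi_I(. | theta)."""
    variance = moment(2, theta) - moment(1, theta) ** 2
    return latent_covariance(theta, eps) / variance
```

The covariance is a squared quantity, so it cannot be negative, and it vanishes when ε=0, which recovers independent sampling.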
Continuing to sample in this fashion, we then have the marginal distribution
As an illustration, we used \(\boldsymbol {\theta } = \hat {\boldsymbol {\theta }}\,\) and \(\,{\boldsymbol {\epsilon }} = z \hat {{\boldsymbol {\sigma }}}\,\) where \(\,\hat {\boldsymbol {\theta }}\,\) and \(\,\hat {{\boldsymbol {\sigma }}}\,\) are the fitted parameters and their estimated standard errors respectively given in Table 1 for the breast cancer example with I=3. The distributions given by (15) for z=0, .25, .5, and .75 are plotted in Fig. 5. Summaries of these four distributions and the mutual correlations of the p-values are given in Table 3. As we see in Fig. 2, all distributions in Fig. 5 appear as mixtures of distributions concentrated near zero and a truncated normal distribution, away from zero. Greater dependence results in a larger point mass at zero, as well as larger means and variances of S.
Power for planning studies
In this final section we describe how to plan for a larger project using data from a smaller pilot study. Huang et al. (2015) report on a study of N=78 patients with lung cancer and examined m=48,803 markers to determine if any of these are related to patient survival. None of these markers were identified as statistically significant at α=.05 using the Bonferroni method. A link to their data appears in our References.
We examined their data and the parameter estimates for our fitted models ψ_{I} appear in Table 4. We found the model with I=3 provided the best fit and worked with that maximum likelihood estimate \(\,\hat {\boldsymbol {\theta }}\,\) to model power. We estimate more than 90% of the pvalues were sampled from the null hypothesis in these data.
In order to describe power we will assume the magnitude of the effect, as measured by θ, is proportional to the square root of the subject sample size, as is often the case with parameters whose estimates are normally distributed. This assumption will also require values of θ to lie near the center of the valid parameter space and wouldn’t be valid for extrapolating to extremely large sample sizes. That is, we computed power estimates in Table 5 setting
where N is the proposed patient sample size and used ε=zθ in (15) to vary the dependence among p-values for values of z=0, .4, and .8.
A variety of sample sizes and correlations are summarized in Table 5. This table summarizes the power as the probability of identifying at least one marker with α=.05. The expected number of identified findings using S is also given in this table.
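A rough version of the independent-sampling power calculation can be sketched as follows. It assumes the scaling θ_{N}=θ√(N/78) relative to the pilot of 78 patients, the distribution function pΣβ_{j}(−log p)^{j}, and power defined as Pr[S≥1]=1−{1−Ψ_{I}(α/m)}^{m}, which holds under independence because S=0 exactly when the smallest p-value exceeds α/m. The pilot parameter values in the usage note are made up for illustration and are not the estimates in Table 4; function names are hypothetical.

```python
import math


def cdf(p, theta):
    """Psi_I(p | theta) = p * sum_j beta_j (-log p)^j with beta_0 = 1."""
    big_i = len(theta)
    x = -math.log(p)
    total = 1.0
    for j in range(1, big_i + 1):
        beta_j = sum(math.factorial(i) / math.factorial(j) * theta[i - 1]
                     for i in range(j, big_i + 1))
        total += beta_j * x ** j
    return p * total


def power_at_least_one(theta_pilot, n_pilot, n_new, m, alpha=0.05):
    """Pr[at least one p-value <= alpha/m] for m independent p-values."""
    scale = math.sqrt(n_new / n_pilot)       # assumed square-root scaling of theta
    theta_new = [scale * t for t in theta_pilot]
    q = cdf(alpha / m, theta_new)
    # 1 - (1 - q)^m, computed stably for very small q
    return -math.expm1(m * math.log1p(-q))
```

For instance, `power_at_least_one([0.05, 0.01, 0.002], 78, 450, 48803)` evaluates the power at N=450 for the hypothetical pilot parameters, with m=48,803 as in the study described above. Dependence (z>0) is not covered by this sketch.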
We estimate the published study by Huang et al. (2015) had about a 50% chance of detecting at least one marker with α=.05. Table 5 suggests increasing sample sizes from 78 to N≥450 patients to achieve power greater than 80% under a model of independent sampling. Even small mutual correlations result in greater point masses at zero, reducing the power of detecting at least one statistically significant p-value. Another factor is the estimated high proportion of p-values sampled from the null hypothesis (\(\,\hat \pi _{0} = .908\)). Subsequent studies should restrict sampling to those markers showing promise in the pilot, as was the case in Haynes et al. (2012).
Appendix A: Details of Sections 2 and 3
We define U_{0}=1 in Eq. (4) and
for k=1,2,…,m.
To demonstrate (5), we integrate one term at a time to show
and continue in this manner to demonstrate the recursive relation
given by (5).
To demonstrate (7) for the specific case of Ψ(p)=p we need to show
We will prove (17) by induction on k.
In Section 3 we demonstrate (17) is true for k=0,1,2. Next, we demonstrate if (17) is valid for any k=0,1,…,m−1 then it is also true for k+1.
Begin by using the recursive relation (16) with Ψ(p)=p and (17) for k giving
It remains to show
or equivalently
Continue by writing \(\,{{k+1}\choose {i}} = {{k}\choose {i}} + {{k}\choose {i-1}}\,\) and set j=i−1 giving
The proof of (17) is completed by two applications of the Ruiz Identity (Ruiz, 1996). Specifically,
for all integers k≥0 and all real numbers x.
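The identity used here can be checked exactly with rational arithmetic. The sketch assumes the Ruiz identity in the form Σ_{i=0}^{k}(−1)^{i}C(k,i)(x−i)^{k}=k!, which is how we read the missing display; the function name is hypothetical.

```python
import math
from fractions import Fraction


def ruiz_sum(k, x):
    """Alternating sum over i = 0..k of (-1)^i * C(k, i) * (x - i)^k."""
    return sum((-1) ** i * math.comb(k, i) * (x - i) ** k for i in range(k + 1))


if __name__ == "__main__":
    k = 30
    exact = ruiz_sum(k, Fraction(201, 2))    # exact rational arithmetic
    approx = ruiz_sum(k, 100.5)              # ordinary double precision
    print("k! =", math.factorial(k))
    print("exact sum =", exact)
    print("floating-point sum =", approx)
```

The alternating sum adds and subtracts terms that are many orders of magnitude larger than k!, so the double-precision value for large k can be dominated by rounding error while the exact value is unaffected. This is the same kind of cancellation that motivates the multiple precision arithmetic used for U_{k} in Section 5.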
Appendix B: A close alternative hypothesis
Here we demonstrate the distribution of B and S when a large number of pvalues are independently sampled from Ψ_{I}(p∣β) for I≥1 for values of β close to zero. That is, the null and alternative hypotheses are not very different. Specifically, consider a sequence of parameter values β_{m}=β/(logm)^{I} shrinking to zero. Following (11), we always have β_{0}=1.
Begin by writing
for any fixed γ>0.
When sampling from Ψ_{I}(·∣β_{m}) using the Bonferroni rule (1), set γ=α in (18) to demonstrate the number of statistically significant pvalues B will have an approximate Poisson distribution with mean α(β_{I}+1).
In order to describe the distribution of S we can also use (18) to show
demonstrating
and
More generally, if m pvalues are independently sampled from Ψ_{I}(·∣β/(logm)^{I}) then
for moderate values of k=0,1,… which is the Borel distribution (8) with parameter α(β_{I}+1). The proof of (19) closely follows the proof by induction of (17) in Appendix A.
Appendix C: Parameter space for ψ_{I}(p)
In this Appendix we describe the limits of parameter values for the density function ψ_{I}(p∣θ) defined in (9) for small values of I. Specifically, we must have ψ_{I}(p) nonnegative and monotone decreasing for all 0<p<1.
For all values of I we must have θ_{I}>0 in order for ψ_{I}(p)>0 for values of p close to zero. We must have ψ_{I}(1)=θ_{0} nonnegative so θ_{0}≥0.
Since ψ_{I}′(1)=−θ_{1}, in order for ψ_{I} to be monotone decreasing, we must have θ_{1}≥0 for all values of I. The condition of all θ_{i}≥0 is sufficient (but may not be necessary) for ψ_{I} to be monotone decreasing because the Descartes Rule of Signs shows the derivative ψ_{I}′(p) of ψ_{I}(p) will have no positive roots in p.
I=1 : If 0≤θ_{1}≤1 then ψ_{1}(p∣θ_{1}) is a valid density and monotone decreasing.
I=2 : We must have (θ_{0}, θ_{1}, θ_{2}) all nonnegative so
For larger values of I, define x=− logp and set \(\,g(x) = \sum \theta _{i} x^{i}\). It is sufficient for g(x)≥0 and g^{′}(x)≥0 for all x≥0 to show ψ is positive and monotone decreasing. For θ_{1}≥0 we have g^{′}(0)≥0 and g^{′}(x)≥0 for all x sufficiently large because θ_{I}>0. To demonstrate g^{′}>0 we need to show g^{″}(x) has no real, positive roots.
I=3 : We must have θ_{3}>0 and θ_{1}≥0. The slope of g(x) does not change sign provided its second derivative g^{″}=6θ_{3}x+2θ_{2} is never negative for all x≥0. This shows θ_{2}>0. The restriction 0≤θ_{0}≤1 gives
I=4 : We have θ_{1}≥0 and θ_{4}>0. If the larger, real root of g^{′′}=12θ_{4}x^{2}+6θ_{3}x+2θ_{2} is negative then
showing θ_{3}>0. Squaring both sides of this inequality shows θ_{2}>0.
If g^{′′} has imaginary roots then \(\, 36\theta _{3}^{2} - 96\theta _{2}\theta _{4} <0\,\) so θ_{2}>0 and g^{′′} is never negative. With imaginary roots, if the minimum of g^{′′}(x) occurs at x>0 then ψ_{4}(p) will be decreasing but not concave. The minimum of g^{′′}(x) occurs at x=−θ_{3}/(4θ_{4}) which is negative leading to θ_{3}>0.
In either real or imaginary roots, for I=4 we have
Availability of data and materials
The data from Section 5.1 is available from the authors with permission from J. Jin and T. Cai. The data for Section 5.2 is available at https://tcgadata.nci.nih.gov/. The data from Section 7 is available at www.biomedcentral.com/content/supplementary/s128590150463xs1.xls.
Abbreviations
BRCA: Breast cancer gene
FDR: False discovery rate
FWER: Familywise error rate
TCGA: The National Institutes of Health Cancer Genome Atlas Program
References
Benjamini, Y.: Discovering the false discovery rate. J. R. Stat. Soc. B. 72, 405–16 (2010). https://doi.org/10.1111/j.14679868.2010.00746.x.
Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. B. 57, 289–300 (1995). http://www.jstor.org/stable/2346101.
Benjamini, Y., Hochberg, Y.: On the adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educ. Behav. Stat. 25.1, 60–83 (2000). https://doi.org/10.3102/10769986025001060.
Broberg, P.: A comparative review of estimates of the proportion unchanged genes and the false discovery rate. BMC Bioinformatics. 6, 199–218 (2005). https://doi.org/10.1186/147121056199.
Cancer Genome Atlas Research Network: Comprehensive genomic characterization of squamous cell lung cancers. Nature. 489, 519–25 (2012). https://doi.org/10.1038/nature11404.
Donoho, D., Jin, J.: Higher criticism for detecting sparse heterogeneous mixtures. Ann. Stat. 32, 962–94 (2004). https://doi.org/10.1214/009053604000000265.
Efron, B., Tibshirani, R., Storey, J. D., Tusher, V.: Empirical Bayes analysis of a microarray experiment. J. Am. Stat. Assoc. 96, 1151–60 (2001). https://doi.org/10.1198/016214501753382129.
Efron, B.: Largescale simultaneous hypothesis testing: The choice of a null hypothesis. J. Am. Stat. Assoc. 99, 96–104 (2004). https://doi.org/10.1198/016214504000000089.
Friguet, C., Kloareg, M., Causeur, D.: A factor model approach to multiple testing under dependence. J. Am. Stat. Assoc. 104, 1406–15 (2009). https://doi.org/10.1198/jasa.2009.tm08332.
Genovese, C., Wasserman, L.: A stochastic process approach to false discovery control. Ann. Stat. 32, 1035–61 (2004). https://doi.org/10.1214/009053604000000283.
Haynes, B. F., Gilbert, P. B., McElrath, M. J., ZollaPazner, S., Tomaras, G. D., Alam, S. M., et al.: Immunecorrelates analysis of an HIV1 vaccine efficacy trial. N. Engl. J. Med. 366, 1275–1286 (2012). https://doi.org/10.1056/NEJMoa1113425.
Hedenfalk, I., Duggan, D., Chen, Y., et al.: Geneexpression profiles in hereditary breast cancer. N. Engl. J. Med. 344, 539–48 (2001). https://doi.org/10.1056/NEJM200102223440801.
Huang, H. L., Wu, Y. C., Su, L. J., et al: Discovery of prognostic biomarkers for predicting lung cancer metastasis using microarray and survival data. BMC Bioinformatics. 16, 54 (2015). https://doi.org/10.1186/s128590150463x. Their data is available at www.biomedcentral.com/content/supplementary/s128590150463xs1.xls.
Jin, J., Cai, T. T.: Estimating the null and the proportion of nonnull effects in largescale multiple comparisons. J. Am. Stat. Assoc. 102, 495–506 (2007). https://doi.org/10.1198/016214507000000167.
Jolley, L. B. W.: Summation of Series. Second edition. Dover, New York (1961). ASIN: B01K3IQJ08.
Kozoil, J. A., Tuckwell, H. C.: A Bayesian method for combining statistical tests. J. Stat. Plan. Infer. 78, 317–23 (1999). https://doi.org/10.1016/S03783758(98)002225.
Langaas, M., Lindqvist, B. H., Ferkingstad, E.: Estimating the proportion of true null hypotheses, with application to DNA microarray data. J. R. Stat. Soc. B. 67, 555–72 (2005). https://doi.org/10.1111/j.14679868.2005.00515.x.
Maechler, M.: Rmpfr: R MPFR  Multiple Precision FloatingPoint Reliable (2019). R package version 0.72. https://CRAN.Rproject.org/package=Rmpfr.
Owen, A. B.: Variance of the number of false discoveries. J. R. Stat. Soc. Ser. B. 67, 411–26 (2005). https://doi.org/10.1111/j.14679868.2005.00509.x.
Pounds, S., Morris, S. W.: Estimating the occurrence of false positives and false negatives in microarray studies by approximating and partitioning the empirical distribution of pvalues. Bioinformatics. 19, 1236–42 (2003). https://doi.org/10.1093/bioinformatics/btg148.
Ruiz, S. M.: An algebraic identity leading to Wilson’s Theorem. Math. Gaz.80.489, 579–82 (1996). https://doi.org/10.2307/3618534.
Simes, R. J.: An improved Bonferroni procedure for multiple tests of significance. Biometrika. 73(3), 751–754 (1986). https://doi.org/10.1093/biomet/73.3.751.
Storey, J. D., Tibshirani, R.: Statistical significance for genomewide studies. Proc Natl Acad Sci USA. 100, 9440–5 (2003). https://doi.org/10.1073/pnas.1530509100.
Sun, W., Cai, T. T.: Largescale multiple testing under dependence. J. R. Stat. Soc. Ser. B. 71, 393–424 (2009). https://doi.org/10.1111/j.14679868.2008.00694.x.
Tang, Y., Ghosal, S., Roy, A.: Nonparametric Bayesian estimation of positive false discovery rates. Biometrics. 63, 1126–34 (2007). https://doi.org/10.1111/j.15410420.2007.00819.x.
Tanner, J. C.: A derivation of the Borel distribution. Biometrika. 48, 222–4 (1961). https://doi.org/10.1093/biomet/48.12.222.
Wu, W.: On false discovery control under dependence. Ann. Stat. 36, 364–80 (2008). https://doi.org/10.1214/009053607000000730.
Yu, C., Zelterman, D.: A parametric model to estimate the proportion from true null using a distribution for pvalues. Comput Stat Data Anal. 114, 105–18 (2017). https://doi.org/10.1016/j.csda.2017.04.008.
Yu, C., Zelterman, D.: A parametric metaanalysis. Stat. Med. 38, 4013–25 (2019). https://doi.org/10.1002/sim.8278.
Acknowledgements
The authors thank J. Jin and T. Cai for providing the data analyzed in Section 5.1 and Beth Nichols for a careful reading of the manuscript.
Funding
This work was supported in part by Vanderbilt CTSA grant 1ULTR002243 from NIH/NCATS, R01 CA149633 from NIH/NCI, R01 FD004778 from FDA, R21 HL129020, P01 HL108800 from NIH/NHLBI (CY) and grants P50CA196530, P50CA121974, P30CA16359, R01CA177719, R01ES005775, R01CA223481, R41A120546, U48DP005023, U01CA235747, R35CA197574, and R01CA168733 awarded by the NIH (DZ).
Author information
Contributions
The authors shared equally in all aspects of the creation of the manuscript. The author(s) read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare they have no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Yu, C., Zelterman, D. Distributions associated with simultaneous multiple hypothesis testing. J Stat Distrib App 7, 9 (2020). https://doi.org/10.1186/s40488020001096
Keywords
 Bonferroni correction
 Simes criteria
 False discovery rate
 p-values