Generalized fiducial inference on the mean of zero-inflated Poisson and Poisson hurdle models

Zero-inflated and hurdle models are widely applied to count data possessing excess zeros, where they can simultaneously model the process from how the zeros were generated and potentially help mitigate the effects of overdispersion relative to the assumed count distribution. Which model to use depends on how the zeros are generated: zero-inflated models add an additional probability mass on zero, while hurdle models are two-part models comprised of a degenerate distribution for the zeros and a zero-truncated distribution. Developing confidence intervals for such models is challenging since no closed-form function is available to calculate the mean. In this study, generalized fiducial inference is used to construct confidence intervals for the means of zero-inflated Poisson and Poisson hurdle models. The proposed methods are assessed by an intensive simulation study. An illustrative example demonstrates the inference methods.


Introduction
The Poisson distribution is arguably one of the most commonly used models for count data. As such, a large number of inferential tools are available for Poisson-based models, such as for the ratio of two Poisson rates (Gu et al. 2008), Poisson regression models (Cameron and Trivedi 1990), and Poisson point processes (Itô 2015). Assuming the Poisson as an underlying distribution for parametric modeling can be a fairly strong assumption since one must be willing to posit that their data are equi-dispersed. In practice, count data almost ubiquitously demonstrate over-dispersion, which can be attributed to, for example, (spatio-)temporal dependency, unexplained heterogeneity, and/or excess zeros (Cameron and Trivedi 2013).
One of the earliest papers to address the problem of excess zeros was (Mullahy 1986), who proposed a two-part model that permits a more flexible data-generating process: zeros are from a binomial distribution while positive values are from a truncated distribution. Such a model can accommodate under-and over-dispersion. The model using a Zou et al. Journal of Statistical Distributions and Applications (2021) 8:5 Page 2 of 15 zero-truncated Poisson is often called the Poisson hurdle (PH) model. Later, the seminal paper of (Lambert 1992) extended this phenomenon of excess zeros to the count regression setting, but also framed the problem differently with respect to how the zeros were generated. Specifically, a certain number of zeros are expected to be generated according to the assumed count distribution (random zeros) while the excess zeros are assumed to be generated from a separate, degenerate process (structural zeros). This framework results in a zero-inflated model, which is a two-component mixture model with one component for the assumed count distribution and the second component a degenerate distribution at zero. In the work of (Lambert 1992), the development was in the context of zero-inflated Poisson (ZIP) regression models. Regardless, both PH and ZIP models accommodate the notion of excess zeros in a Poisson setting, but how the zeros are generated is treated differently under the two models. Moreover, both models tend to have comparable performance regarding goodness-of-fit measures, which underscores how the application should provide the guidance in determining the way the zeros are generated. The more complex data setting posed by zero-inflation opens the door to additional inference considerations, many coupled with their own challenges. For example, there is a bevy of score tests developed for testing the presence of zero-inflation in various count data settings; cf. van den Broek (1995); Janaskul and Hinde (2002); Janaskul and Hinde (2008); Cao et al. (2014); Todem et al. (2018). Bhattacharya et al. (2008) used a general Bayesian setup for detecting if zero-inflation is present in the data, however, it is challenging to justify the selection of the prior distribution. Score-based tests are also available for testing the presence of overdispersion, which can be caused by zeroinflation; cf. (Ridout et al. 2001;Hall and Berenhaut 2002;Deng and Paul 2005). With the exception of large-sample-based approaches for constructing confidence intervals on regression parameters in zero-inflated regression and hurdle regression models, there is no panacea for constructing reliable, accurate confidence intervals for other parameters in their non-regression counterparts, such as the population mean of univariate ZIP and PH distributions.
Deriving confidence intervals for a more complex data setting, like the presence of excess zeros, is challenging in the frequentist setting. Typically, one resorts to normalbased theory, but finite sample properties can be highly unreliable. In particular, Waguespack et al. (2020) assessed Wald-based confidence intervals for the ZIP mean. Their simulation results showed more liberal results for smaller n. As an alternative, they proposed constructing a bootstrap-based confidence interval for the ZIP mean, which had coverage probabilities much closer to the nominal level. They also conducted a signed likelihood ratio test (SLRT) for testing the ZIP mean, which controlled the type I error rate satisfactorily. Bayesian approaches suffer from the challenge to justify the selection of the prior distribution, just like we noted with the work of Bhattacharya et al. (2008) earlier. Alternatively, one can consider fiducial inference as proposed by (Fisher 1935). Fiducial inference struggled to gain popularity among statisticians because of perceived deficiencies in the general approach. However, later works have developed more sophisticated procedures coupled with rigorous theory to mitigate such criticisms, all while reflecting the core tenets of the fiducial paradigm. For example, (Weerahandi 1995) introduced generalized confidence intervals (GCIs) constructed by generalized pivotal quantities (GPQs), and (Hannig et al. 2006) further established the connection between GCI and (2021) 8:5 Page 3 of 15 the fiducial argument of Fisher. For the purposes of our study, we turn to generalized fiducial distributions as they often lead to attractive solution with asymptotically correct frequentist coverage levels. Moreover, many simulation studies have shown that generalized fiducial solutions have very good small sample properties; see, for example, (Hannig 2009) and(E et al. 2008). There is also some work on fiducial approaches for discrete distributions. For example, (Mathew and Young 2013) developed fiducial tolerance intervals for functions of discrete random variables, while (Hannig et al. 2016) presented an extensive summary about computing the generalized fiducial distribution for parameters of some common discrete distributions. In this paper, we shall consider using the fiducial inference for the mean of ZIP and PH distributions. This paper is organized as follows. In Section 2, we give a brief sketch of generalized fiducial inference, with emphasis on the discrete data setting. In Section 3, we derive the respective fiducial distributions of the ZIP mean and PH mean. In Section 4, we present a numerical study to illustrate the good coverage probabilities of GCIs for the ZIP mean and PH mean constructed using fiducial inference, and demonstrate that the fiducial test has comparable performance to the SLRT when conducting the ZIP mean test. An analysis of urinary tract infection data is presented in Section 5. In Section 6, we make some concluding remarks.

Generalized fiducial inference
The aim of generalized fiducial inference is to define a distribution for parameters of interest that contains all of the information from data. Therefore, inference on the parameters can be made through this distribution. The tenet of generalized fiducial inference is to switch the role of the parameters and the data. We now briefly explain the philosophy of generalized fiducial inference.
Suppose that data Y are generated through the structural equation Y = G(ξ , U), where ξ is a vector of parameters and U is some random variable with a known distribution independent of the parameter ξ . The structural equation can be regarded as a data generation process where the noise process U and the signal ξ will produce observed data Y. Hence, the distribution of Y can be determined via the structural equation given a fixed parameter ξ and the distribution U. After the data Y are observed, we can switch the position of the data and parameters by solving the structural equation conditioned on that the solution to that equation exists. Thus, we can get ξ = Q(Y , U). For more details regarding this setup, we refer to (Hannig 2009).

Generalized fiducial inference on discrete data
Let Y now be a discrete random variable with the distribution function F(·|θ). We know that if U ∼ U (0, 1), data following the distribution F(·|θ) can be generated through Y = F − (U|θ), where F − (a|θ) = inf{y : a ≤ F(y|θ)} is the inverse function. According to the philosophy of generalized fiducial inference, we need to solve the data generating equation to get the parameter as a function of the data and a known random distribution. Assume for each fixed y, the distribution is a nonincreasing function of θ. It follows that Q + y (u) = sup{θ : F(y|θ) = u} and Q − y (u) = inf{θ : F(y − |θ) = u} exist and satisfy F(y|Q + y (u)) = F(y − |Q − y (u)) = u. Moreover, the closure of the inverse image is fiducial distribution for the parameter, so that the fiducial sample of the parameter has 50% chance from either the upper or lower bound. For the fiducial distributions of multiple parameters, a two-stage method can be applied based on the minimal sufficient statistics. Let the parameters of interest be ξ = (ξ 1 , ξ 2 ). Assume that the following two conditions hold: 1 Ifξ 2 is known, there is a statistic S 1 = S 1 (ξ 2 ) that has an invertible pivotal relationship with ξ 1 . 2 A statistic S 2 exists that S 2 and ξ 2 have an invertible pivotal relationship.
Then we can obtain the fiducial distribution of ξ 2 followed by the fiducial distribution of ξ 1 given that ξ 2 is known.

Fiducial distribution of ZIP mean
The ZIP distribution has probability mass function where I {A} (z) is the indicator function that z belongs to the set A. The following proposition establishes the minimal sufficient statistic for a ZIP distribution:

a random sample from a ZIP distribution. Denote the sum of the random sample as S and the number of zeros of the random sample as K, where S
According to the factorization theorem, (S, K) is sufficient. Now we want to show (S, K) is minimal sufficient. Assume that we have another sample Y = (Y 1 , . . . , Y n ) T with corresponding realizations y = (y 1 , . . . , y n ) T . The ratio of the two density functions is where f is a function that does not depend on the parameters. The ratio is free of (π, λ) if and only if We also need the following proposition regarding sums of zero-truncated Poisson distributions: Proof The proof follows by mathematical induction, and can be found in Springael and Van Nieuwenhuyse (2006).
Denote the distribution function of the sum of m zero-truncated Poisson(λ) by F 1 (k|m, λ), the distribution of Poisson with mean parameter λ by F P (k|λ), and the distribution function of Binomial(n, p) random variables by F B (k|n, p). It follows that We will use the inverse distribution functions as a data generating equation: where U 1 , U 2 are independent U (0, 1). When K = n, the value of S is set as 0.
After observing k and s, and inverting the data generating equation, we see that Thus, the sample from the fiducial distribution is obtained by sampling (U * 1 , U * 2 ), and using the above inequalities to solve for π and λ. Consequently, when the parameter of interest is μ = (1 − π)λ, the mean of the ZIP distribution, we have if k < n. When k = n then 0 ≤ μ ≤ ∞. Finally, we need to select a representative region for the fiducial sample. Following (Hannig et al. 2016), we choose a 50-50 mixture of the upper and lower bound. In the case of k = n, this results in a 50-50 mixture of 0 and ∞.
The algorithm for constructing a fiducial confidence interval for a ZIP mean is implemented as follows: . , x n ) T be a sample of size n from a ZIP distribution with Poisson parameter λ and binomial parameter π. Calculate the number of zeros k = n i=1 I {0} (x i ) and the sum of the sample s = n i=1 x i . 2 Generate a realization U * 1 and U * 2 independently from U (0, 1). Then, calculate the realizations is the distribution function of the sum of m zero-truncated Poisson(λ). 3 Repeat step 2 B times, yielding 2B fiducial samples of the ZIP mean μ, denoted by M * L1 , · · · , M * Ln , M * U1 , · · · , M * Un . The 1 − α two-sided GCI of the ZIP mean will be the lower and upper α/2 quantiles of the fiducial samples.

Fiducial distribution of PH mean
The derivation of the fiducial distribution for the PH model follows that for the ZIP model mutatis mutandis. The PH distribution has probability mass function: Note that after reparameterization, a ZIP distribution characterized by the binomial parameter μ 1 and the Poisson parameter λ 1 can be expressed as a PH distribution that is characterized by the binomial parameter μ 2 = μ 1 + (1 − e −λ 1 ) and the truncated Poisson parameter λ 2 = λ 1 . Hence, the likelihoods of the two models are equivalent. The selection of the model should be based on how zeros are generated. Note that this equivalency does not hold in the ZIP regression and PH regression settings. In those settings, the likelihoods are based on a conditional distribution (i.e., Y given some covariates) for determining the estimates of the regression parameters. While the final likelihoods will typically be similar, they will not be equal.
The following proposition establishes the minimal sufficient statistic for a PH distribution: Proposition 3.3 Let X = (X 1 , X 2 , . . . , X n ) T be a random sample from a PH distribution. Denote the sum of the random sample as S and the number of zeros of the random sample as K, where S = n i=1 X i and K = n i=1 I {0} (X i ). Consequently the minimal sufficient statistic is (S, K).
Proof The proof is identical to the proof of Proposition 3.1, and is therefore omitted.
Immediately, we see that K ∼ Binomial(n, π), and (S | K = k) has the same distribution as n−k i=1 Y i , where Y i are independent Poisson(λ) random variables conditioned on the event {Y i ≥ 1}. We will then use the inverse distribution functions as a data generating equation where U 1 , U 2 are independent U (0, 1). When K = n the value of S is again set as 0.
After observing k and s, and inverting the data generating equation, we see that where B a,b (u) and H m,s (u) are as defined for the ZIP setting. Thus, the sample from the fiducial distribution is obtained by sampling (U * 1 , U * 2 ) and using the above inequalities to solve for π and λ. Consequently, when the parameter of interest is μ = (1−π)λ 1−e −λ , the mean of the PH distribution, we have , if k < n. When k = n then we again have 0 ≤ μ ≤ ∞. Thus, it turns out that the fiducial distribution of the mean of the ZIP and the mean of the PH are the same.
Finally, just as in the ZIP setting, the selection of a representative region for the fiducial sample is to choose a 50-50 mixture of the upper and lower bound. In the case of k = n, this again results in a 50-50 mixture of 0 and ∞. The algorithm for constructing a GCI for a PH mean is the same as the ZIP setting, so it is omitted here.

Simulation study
We next assess the performance of the GCI just presented through an extensive simulation study. We also compare our results to the bootstrap confidence intervals constructed using the approach in Waguespack et al. (2020). Note that we do not include a comparison with the likelihood-based (i.e., Wald-based) confidence intervals since Waguespack et al. (2020) already demonstrated the relative superior performance of the bootstrap confidence intervals. The sample sizes used to assess the finite sample performance of the GCI include n ∈ {15, 30, 100}. For the parameters, the mixture proportion π is selected from {0.2, 0.5, 0.8} and the mean λ of the Poisson distribution is selected from {1, 5}. The simulation settings for the PH distribution are the same as for the ZIP distribution: sample sizes n ∈ {15, 30, 100}, mixture proportions π ∈ {0.2, 0.5, 0.8}, and mean of the Poisson distribution λ ∈ {1, 5}. Note that when π = 0 in the ZIP setting or π = e −λ in the PH setting, the data are actually simulated from the Poisson distribution Poisson(λ). Moreover, we demonstrate the performance of our approach when there is no under-/over-dispersion in the data. Specifically, the same values of n and λ are considered, but no (2021) 8:5 Page 8 of 15 mixing proportion is present (or equivalently π = 0). The number of Monte Carlo samples for our simulations is set to 10,000 and the number of fiducial samples used is 1000. We also drew 10,000 bootstrap samples to construct the bootstrap confidence intervals. For each simulation scenario, we estimated the probability Q(X) = P(R M (X) < M|X), where M is the mean of the distribution. If the generalized fiducial inference were exact, then Q(X) should follow a standard uniform distribution, which could be examined through Q − Q plots. In the results that follow, coverage probabilities and the median widths of the GCIs are reported. We report the median widths due to using a 50-50 mixture of 0 and ∞ as the fiducial distribution of λ, which has an expected value of ∞. Furthermore, we compare type I error rates and the power for both the SLRT proposed by Waguespack et al. (2020) and our fiducial test for testing the ZIP mean. The simulation is set up similarly as in Waguespack et al. (2020) to enable a side-by-side comparison: sample sizes n ∈ {30, 40, 50}, mixture proportions π ∈ {0.1, 0.3, 0.5}, and mean of the Poisson distribution λ ∈ {1, 1.3, 1.6, 2}. The first set of simulation results is for the ZIP distributions. The results are given in Table 1. As we can see, for the different simulation scenarios, the coverage probabilities for the bootstrap confidence intervals are typically liberal, especially for the sample sizes less than 100. Meanwhile, the coverage probabilities for the GCIs are all noticeably closer to the nominal level except for the sample size n = 15 and λ = 1. Under such a setting, the simulated samples are almost all zeros, thus compromising the inference. Even though the GCIs are conservative in this setting, they are closer to the nominal level compared to the more liberal bootstrap confidence intervals. The median widths are, of course, narrower for the bootstrap confidence intervals, but that is a result of the procedure being noticeably liberal relative to the nominal level. The Q − Q plots for the different sample sizes are given in Fig. 1. As the sample size n increases, the agreement between the actual p-value and the nominal p-value improves.
The second set of simulation results is for the PH distributions. The results are given in Table 2. We again obtain similar results as in the ZIP setting for different simulation scenarios. The coverage probabilities are close to nominal for the GCIs, whereas the bootstrap confidence intervals are again noticeably liberal. In fact, the setting with sample size n = 15 and λ = 1 appears to be doing slightly better here in the PH setting compared to the analogous results in the ZIP setting. The Q − Q plots for different sample sizes are given in Fig. 2. The same asymptotic behavior identified in the ZIP setting is also observed from the present simulation results; specifically, as the sample size n increases, the agreement between the actual p-value and the nominal p-value improves.
The third set of simulation results we consider is for the Poisson distribution. The results are given in Table 3. As noted earlier, the Poisson is just a special case of the ZIP and PH distributions such that there are no excessive zeros. Again, for different simulation scenarios, the coverage probabilities are close to nominal. For n = 15 and n = 30, there is a clear improvement using the GCIs compared to the bootstrap confidence intervals, but for large n, the procedures are comparable. This illustrates that regardless if zero-inflation is present in the data, our method is still appropriate for constructing a confidence interval of the mean. The Q − Q plots for different sample sizes are given in Fig. 3. Only moderate discrepancies are noticeable when the sample size is small (n = 15) or moderate (n = 30); however, the tail behavior appears to be very good. Since we want to construct a 95% confidence interval, it is not a concern as long as the tails are accurate,     which is confirmed by our results in Table 3. The asymptotic behavior is also observed from the simulation results: as the sample size n increases, the agreement between the actual p-value and the nominal p-value improves.
The last set of simulation results is to estimate the type I error rates and power of the SLRT and fiducial test for testing the following for the ZIP mean under α = 0.05: H 0 : μ ≤ μ 0 versus H a : μ > μ 0 . The μ 0 is assumed to be 1 − π and the true ZIP mean is μ = (1 − π)λ. Therefore, the type I error rate could be estimated when λ = 1. The type I error rates and power of the SLRT and fiducial test are reported in Table 4. Generally, the performance of the two tests are almost identical. The type I error rates are all close to the nominal level for both tests while the power is increasing as λ or the sample size gets larger.

Application: urinary tract infection data
We construct GCIs for a ZIP distribution fit to data on urinary tract infections (UTIs). Note that our fiducial approach will generate the same GCI no matter if the data follows a ZIP or PH distribution. Therefore, the UTI data could also serve as an example for constructing a GCI for a PH distribution. These data came from n = 98 HIV-infected men who were treated by the Department of Internal Medicine at the Utrecht University Hospital in the Netherlands. The frequency of times those patients had a UTI was recorded as X. The frequency table is given in Table 5. The data were analyzed in (van den Broek 1995), who used a score test to detect if zero-inflation exists. Later, (Bhattacharya et al. 2008) and (Bayarri et al. 2008) applied Bayesian testing to test for zero-inflation. All  F r e q u e n c y 8 1 9 7 1 9 8 of these analyses favor a ZIP distribution. Moreover, the use of a zero-inflated distribution is appropriate because the zeros are likely arising from two subgroups of men: one group that are otherwise healthy aside from having HIV (structural zeros), and one group that has a history of other issues with their urinary system (e.g., kidney stones) and, thus, could be at higher risk of eventually developing a UTI (random zeros). The fiducial sample is set to be 10,000. The 95% GCI for the average number of UTIs that the patients have is (0.157, 0.434). The 95% SLRT confidence interval is (0.157, 0.426), which is close to the GCI, while the bootstrap confidence interval with 10,000 bootstrap samples is (0.143, 0.398). Even though one can easily calculate the sample mean from these data (x = 0.266) and infer that, for example, the average number is less than 1, the approach we have presented now affords us with the additional insights that accompany confidence interval interpretations, such as the reliability of our estimate of the mean and how far the spread of that interval falls away from a particular value of interest. We also know from the coverage study in the previous section that the median width of the 95% GCI will be noticeably wider than the respective 95% bootstrap confidence interval. Practically speaking, this could have implications on the hospital's treatment plans for these UTIs. If treatment plans are benchmarked against the 95% bootstrap confidence intervals, then smaller and larger values beyond the respective limits of that interval will be omitted from such plans, whereas those values will be reflected via the 95% GCI.

Conclusion
In this article, generalized fiducial inference on ZIP and PH distributions was studied for the first time and applied to a healthcare dataset. The practical contribution of this method is that one can now easily calculate and report a confidence interval along with an estimate of the mean if using either a ZIP or PH model. The theoretical advantage of this method is that it achieves good small sample properties except for when the zero proportion π is large and the Poisson parameter λ is small. Also, it does not depend on the selection of priors like Bayesian inference, but it only relies on the data generation equation. A simulation study demonstrated that, for the confidence interval of the mean of ZIP and PH distributions, the generalized fiducial inference works very well for various scenarios. Since the Poisson distribution can be considered as a special case of ZIP or PH distribution, the simulation also shows our method for ZIP and PH distributions can accommodate constructing the confidence interval of the mean of a Poisson distribution. Thus, if the goal is only to construct a confidence interval for the mean of the count data, our approach can be applied directly since it will not be necessary to detect for zero-inflation or decide if the data are under-/over-dispersed. Furthermore, our fiducial approach performed equally well as the SLRT when testing the ZIP mean.
We note that there is some computational limitation of the proposed method since it involves finding the root of a sum of factorials. When the sample size is large or the Poisson mean parameter is large, the computational effort could become prohibitive. Uniformly valid approximation exists for Stirling numbers of the second kind (Temme 1993), which can alleviate some of the computational burden, but this can translate to worse results for coverage probabilities. Such simulation results are not shown here. We also note that generalized fiducial inference can be very similar to Bayesian inference when a fiducial distribution is obtained. We highlighted earlier that (Bhattacharya et al. 2008) used Bayesian inference to test for the presence of zero-inflation. Future research will be focused on extending the use generalized fiducial inference for selecting the model between Poisson distribution and ZIP/PH models. Moreover, there are broader inference considerations when fitting ZIP/PH regression models, such as joint confidence intervals on the regression parameters and simultaneous confidence intervals over the values of the covariate space. The utility of such inference is underscored by the recent emphasis placed on marginalized ZI regression models, which focuses on modeling the mean response across the two states (Long et al. 2014;Todem et al. 2016;Martin and Hall 2017). These are further extensions worth considering in the generalized fiducial framework.