Extended Conway-Maxwell-Poisson distribution and its properties and applications
Journal of Statistical Distributions and Applications volume 3, Article number: 5 (2016)
Abstract
A new four-parameter extended Conway-Maxwell-Poisson (ECOMP) distribution is proposed which unifies the recently proposed COM-Poisson type negative binomial (COMNB) distribution [Chakraborty, S. and Ong, S. H. (2014): A COM-type Generalization of the Negative Binomial Distribution, Accepted in Communications in Statistics-Theory and Methods] and the generalized COM-Poisson (GCOMP) distribution [Imoto, T. (2014): A generalized Conway-Maxwell-Poisson distribution which includes the negative binomial distribution, Applied Mathematics and Computation, 247, 824–834]. The additional parameter allows this distribution to have a longer (shorter) tail compared to COMNB and GCOMP. The proposed distribution can be formulated as an exponential combination of the negative binomial and COM-Poisson distributions, also arises from a queuing system with state-dependent arrival and service rates, and belongs to the exponential family when one of the parameters is treated as a nuisance parameter. Important distributional, reliability and stochastic ordering properties, along with asymptotic approximations for the normalizing constant and the mean of this distribution, are investigated. A method of parameter estimation and three comparative data fitting applications are also discussed.
1 Introduction
Recently, two new generalizations of the well-known COM-Poisson distribution (Conway and Maxwell 1962) were proposed: one by Chakraborty and Ong (2014), known as the COM-Negative binomial distribution, and the other by Imoto (2014), referred to as the generalized COM-Poisson distribution. In this section we briefly introduce these two distributions along with a hypergeometric type series which is used in the sequel.
COM-Poisson type negative binomial distribution: Chakraborty and Ong (2014) proposed a new COM-Poisson type generalization of the negative binomial distribution that includes, among others, the COM-Poisson and negative binomial (pages 208–250, Chapter 5, Johnson et al. 2005) distributions as particular cases and the Bernoulli (page 108, Chapter 3, Johnson et al. 2005) and COM-Poisson distributions as limiting cases. This distribution is log-concave and flexible enough to model under-, equi- and over-dispersed count data.
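A random variable (rv) X is said to follow the COM-Poisson type negative binomial distribution with parameters (ν, p, α) [COMNB(ν, p, α)] if its pmf is given by
$$ P\left(X=k\right)=\frac{{\left(\nu \right)}_k\;{p}^k}{{\left(k!\right)}^{\alpha }\;{}_1H_{\alpha -1}\left(\nu;\;1;p\right)},\quad k=0,1,2,\cdots, \quad \mathrm{where}\quad {}_1H_{\alpha -1}\left(\nu;\;1;p\right)={\displaystyle \sum_{k=0}^{\infty}\frac{{\left(\nu \right)}_k}{{\left(k!\right)}^{\alpha }}{p}^k}. $$
The distribution is defined in the parameter space
$$ \left\{\nu >0,\;p>0,\;\alpha >1\right\}\cup \left\{\nu >0,\;0<p<1,\;\alpha =1\right\}. $$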
When α is a positive integer, \( {}_1H_{\alpha -1}\left(\nu;\;1;p\right) \) can be expressed as a particular case of the generalized hypergeometric series \( {}_mF_n\left({a}_1,{a}_2,\cdots, {a}_m;{b}_1,{b}_2,\cdots, {b}_n;z\right)={\displaystyle \sum_{k=0}^{\infty}\frac{{\left({a}_1\right)}_k{\left({a}_2\right)}_k\cdots {\left({a}_m\right)}_k}{\;{\left({b}_1\right)}_k{\left({b}_2\right)}_k\cdots {\left({b}_n\right)}_k}\;\frac{z^k}{k\;!}} \) as \( {}_1F_{\alpha -1}\left(\nu;\;1,1,\cdots, 1;p\right) \).
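Generalized COM-Poisson distribution: Imoto (2014) proposed another generalization where an rv X is said to follow the GCOM-Poisson distribution with parameters (ν, p, β), that is GCOMP (ν, p, β), if its pmf is given by
$$ P\left(X=k\right)=\frac{{\left\{\Gamma \left(\nu +k\right)\right\}}^{\beta }\;{p}^k}{k!\;C\left(\beta,\;\nu,\;p\right)},\quad k=0,1,2,\cdots, \quad \mathrm{where}\quad C\left(\beta,\;\nu,\;p\right)={\displaystyle \sum_{k=0}^{\infty}\frac{{\left\{\Gamma \left(\nu +k\right)\right\}}^{\beta }}{k!}{p}^k}. $$
The distribution is defined in the parameter space
$$ \left\{\nu >0,\;p>0,\;\beta <1\right\}\cup \left\{\nu >0,\;0<p<1,\;\beta =1\right\}. $$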
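A hypergeometric type series: We introduce the series
$$ {}_mS_{\alpha}^{\beta}\left({a}_1,{a}_2,\cdots, {a}_m;\;b;\;p\right)={\displaystyle \sum_{k=0}^{\infty}\frac{{\left\{{\left({a}_1\right)}_k\right\}}^{\beta }{\left({a}_2\right)}_k\cdots {\left({a}_m\right)}_k}{{\left\{{\left(b\right)}_k\right\}}^{\alpha }}\;\frac{p^k}{k!}} $$
where (a)_{ k } = a(a + 1) ⋯ (a + k − 1) = Γ(a + k)/Γa is the Pochhammer notation (see Johnson et al. 2005, chapter 1, page 2). The series converges if (i) for any finite p, β + m − 2 < α or (ii) p < 1, β + m − 2 = α. For α, β and m all positive integers, it reduces to a particular case of the generalized hypergeometric function _{ β + m − 1} F _{ α }(a _{1}, a _{1}, ⋯, a _{1}, a _{2}, ⋯, a _{ m }; b, b, ⋯, b; p). With this notation we have
$$ {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)={\displaystyle \sum_{k=0}^{\infty}\frac{{\left\{{\left(\nu \right)}_k\right\}}^{\beta }}{{\left(k!\right)}^{\alpha }}{p}^k}. $$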
Some important special cases of \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) are

i. \( {}_1S_{\alpha -1}^1\left(\nu;\;1;p\right)={}_1H_{\alpha -1}\left(\nu;\;1;p\right) \) [Chakraborty and Ong 2014]

ii. \( {}_1S_0^{\beta}\left(\nu;\;1;p\right)=C\left(\beta,\;\nu,\;p\right)/{\left(\Gamma \nu \right)}^{\beta } \) [Imoto 2014]

iii. \( {}_1S_0^1\left(\nu;\;1;p\right)={\left(1-p\right)}^{-\nu } \) [geometric series]

iv. \( {}_1S_{\alpha -1}^{\beta}\left(1;\;1;p\right)=Z\left(p,\alpha -\beta \right) \) [Conway and Maxwell 1962]

v. \( {}_1S_{\gamma}^{\gamma}\left(1;\;1;p\right)= \exp (p) \)
Some important limiting cases of \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) are

vi. \( \underset{\alpha \to \infty }{ \lim}\kern0.24em {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)=1+{\nu}^{\beta }p. \)

vii. \( \underset{\nu \to \infty }{ \lim}\kern0.24em {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)={\displaystyle \sum_{k=0}^{\infty }{\lambda}^k/{\left(k!\right)}^{\alpha }}=Z\left(\lambda, \alpha \right) \), where ν ^{β} p = λ is finite and positive.
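The special cases above can be checked numerically. The following is a minimal sketch, assuming the series has k-th term {(ν)_k}^β p^k/(k!)^α (the form that appears in the steady-state solution of Appendix B.1); the function names `log_term` and `series_S` and the stopping rule are illustrative choices, not part of the original paper.

```python
import math

def log_term(k, nu, p, alpha, beta):
    """log of {(nu)_k}^beta * p^k / (k!)^alpha, the assumed k-th series term."""
    log_poch = math.lgamma(nu + k) - math.lgamma(nu)  # log of the Pochhammer symbol (nu)_k
    return beta * log_poch + k * math.log(p) - alpha * math.lgamma(k + 1)

def series_S(nu, p, alpha, beta, tol=1e-15, max_terms=100_000):
    """Sum terms until they are decreasing and negligible relative to the partial sum.

    Assumption: the terms have a single hump, so a tiny decreasing term means
    the remaining tail is negligible (needs alpha > beta, or alpha == beta and p < 1).
    """
    total = 0.0
    prev = float("inf")
    for k in range(max_terms):
        term = math.exp(log_term(k, nu, p, alpha, beta))
        total += term
        if term < prev and term < tol * total:
            break
        prev = term
    return total

# Case iii: alpha = beta = 1 gives (1 - p)^(-nu)
print(series_S(2.5, 0.3, 1.0, 1.0), 0.7 ** -2.5)
# Case v: nu = 1, alpha = gamma + 1, beta = gamma gives exp(p)
print(series_S(1.0, 2.0, 3.0, 2.0), math.exp(2.0))
# Case vi: a very large alpha approaches 1 + nu^beta * p
print(series_S(2.0, 0.5, 200.0, 1.5), 1 + 2.0 ** 1.5 * 0.5)
```

Working on the log scale with `math.lgamma` avoids overflow in (ν)_k and k! for moderately large k.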
In the present article we propose a natural four-parameter extension of the COM-Poisson distribution which includes the recently introduced COMNB and GCOM-Poisson distributions as special cases. This new distribution with additional parameters is more flexible in terms of tail length and dispersion index. The definition of the proposed distribution along with some of its important distributional properties is presented in Section 2. Reliability and stochastic ordering results are discussed in Section 3. In Section 4 we present applications of the proposed distribution by considering three real-life data sets. Concluding remarks are provided in Section 5, which is followed by an appendix containing the proofs of the results and propositions in the article.
2 Extended COM-Poisson (ECOMP) distribution
Here we introduce a new distribution that unifies both the COMNB and GCOMP distributions.
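Definition 1. An rv X is said to follow the extended COM-Poisson distribution with parameters (ν, p, α, β) [ECOMP (ν, p, α, β)] iff its pmf is given by
$$ P\left(X=k\right)=\frac{{\left\{{\left(\nu \right)}_k\right\}}^{\beta }\;{p}^k}{{\left(k!\right)}^{\alpha }\;{}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)},\quad k=0,1,2,\cdots \qquad (6) $$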
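The distribution is defined in the parameter space
$$ \left\{\nu >0,\;p>0,\;\alpha >\beta \right\}\cup \left\{\nu >0,\;0<p<1,\;\alpha =\beta \right\}. $$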
It may be noted that unlike in the COMNB distribution where the parameter α ≥ 1 and in the GCOMP distribution where the parameter β ≤ 1, in the ECOMP distribution these two parameters can be either positive or negative with the restriction of α ≥ β.
Particular cases: The ECOMP (ν, p, α, β) distribution reduces to COMNB (ν, p, α) for β = 1, to GCOMP (ν, p, β) for α = 1, to COMP (p, α − β) for ν = 1, to COMP (p, α) for β = 0, to Poisson (p) for ν = 1, α = β + 1, also to Poisson (p) for β = 0, α = 1, to NB (ν, p) for α = β = 1 and to a new generalization of the NB (NGNB) distribution when α = β = γ, with pmf
$$ P\left(X=k\right)={\left\{\frac{{\left(\nu \right)}_k}{k!}\right\}}^{\gamma}\frac{p^k}{{}_1S_{\gamma -1}^{\gamma}\left(\nu;\;1;p\right)},\quad k=0,1,2,\cdots \qquad (7) $$
For 0 < ν ≤ 1, the distribution in (7) is log-convex, as will be seen in Proposition 4 in Section 2.7.
2.1 Shape of the pmf
It is observed from the plots of the pmf of the ECOMP (ν, p, α, β) distribution for different values of the parameters in Fig. 1 that the distribution is very flexible: it can be non-increasing with mode at zero, have a unique non-zero mode, have two modes, or be bimodal with one mode always at zero.
2.2 Approximations of the normalizing constant
2.2.1 Approximation using truncation of the series
The normalizing constant \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) of the ECOMP (ν, p, α, β) distribution cannot be expressed in closed form and involves the summation of an infinite series. Therefore, we need approximations of this constant to compute the pmf and moments of the distribution numerically.
A simple approximation is to truncate the series, that is
where m is an integer chosen such that ε _{ m } = (ν + m − 1)^{β} p/m ^{α} < 1. The relative truncation error is then given by the expression R _{ m }(ν, p, α, β) \( =\left\{{}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)-{}_1S_{\alpha -1,m}^{\beta}\left(\nu;\;1;p\right)\;\right\}/{}_1S_{\alpha -1,m}^{\beta}\left(\nu;\;1;p\right)\;. \) Then the relative error of the pmf is given by {P _{ m }(k) − P(k)}/P(k), where P(k) is given by the right hand side (r.h.s.) of equation (6) in Section 2 and P _{ m }(k) is given by the r.h.s. of (6) with \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) substituted by \( {}_1S_{\alpha -1,m}^{\beta}\left(\nu;\;1;p\right) \). The upper bound of the relative truncation error is then found to be
For α − β ≥ 1, this truncated approximation is good because ε _{ m } = O(1/m) and thus the truncation point m is not large. However, for 0 < α − β < 1 and p > 1, the truncation point becomes too large to compute the approximation. For example, when ν = 1.5, p = 3, α = 3.1, β = 3, m has to be over 50,000. This is not practicable. To avoid this difficulty it is useful to restrict the parameter p such that p < 1 when α − β → 0. For example, with the restriction p < 10^{α − β}, we see the relative truncation error R _{50}(1.5, 3, 3.1, 3) < 0.001.
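The size of the truncation point quoted above can be reproduced with a short search. This is a sketch only, assuming that ε_m is the ratio of the m-th to the (m − 1)-th term of the series, (ν + m − 1)^β p/m^α, and that ν ≥ 1 and α > β so that this ratio decreases in m; the function names are hypothetical.

```python
import math

def eps(m, nu, p, alpha, beta):
    """Assumed ratio of consecutive series terms: (nu + m - 1)^beta * p / m^alpha."""
    return math.exp(beta * math.log(nu + m - 1) + math.log(p) - alpha * math.log(m))

def first_m_with_eps_below_one(nu, p, alpha, beta, m_max=10**6):
    """Linear scan for the first m with eps(m) < 1; returns None if not found below m_max."""
    for m in range(1, m_max + 1):
        if eps(m, nu, p, alpha, beta) < 1.0:
            return m
    return None

# alpha - beta = 0.1 with p = 3: the truncation point is in the tens of thousands
print(first_m_with_eps_below_one(1.5, 3.0, 3.1, 3.0))
# alpha - beta = 1 with the same nu and p: a handful of terms suffices
print(first_m_with_eps_below_one(1.5, 3.0, 4.0, 3.0))
```

The contrast between the two calls illustrates why truncation is practical for α − β ≥ 1 but not for small α − β with p > 1.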
2.2.2 Asymptotic approximation of the normalizing constant using Laplace’s method
It is also useful to consider an asymptotic approximation formula for the normalizing constant \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \). The approximation formula obtained by Laplace’s method (Bleistein and Handelsman 1986, Ch 8.3, pages 331–340) is given by
This formula reduces to the asymptotic formula by Minka et al. (2003) when ν = 1 or β = 0 and that by Imoto (2014) when α = 1. The proof and numerical investigation about the formula (9) are given in Appendix A.1.
2.3 Recurrence relation for probabilities
The ECOMP (ν, p, α, β) pmf has a simple recurrence relation given by
$$ P\left(X=k+1\right)=\frac{{\left(\nu +k\right)}^{\beta }\;p}{{\left(k+1\right)}^{\alpha }}\;P\left(X=k\right),\quad k=0,1,2,\cdots \qquad (10) $$
with \( P\left(X=0\right)={\left[{}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)\;\right]}^{-1} \). This is useful for computing the probabilities. Further, using (10) we can see that the ECOMP (ν, p, α, β) distribution has a longer (shorter) tail than the COMNB (ν, p, α) for β > (<) 1 and a longer (shorter) tail than the GCOMP (ν, p, β) for α < (>) 1, in agreement with the likelihood ratio orderings of Section 3.2.
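As an illustration of how the recurrence can be used in practice, the sketch below builds the probabilities from the ratio P(X = k + 1)/P(X = k) = (ν + k)^β p/(k + 1)^α implied by the steady-state solution in Appendix B.1. Truncating the support at `k_max` and the function name `ecomp_pmf` are assumptions made for the example.

```python
import math

def ecomp_pmf(nu, p, alpha, beta, k_max=200):
    """Probabilities P(X = 0..k_max), normalised over the truncated support."""
    weights = [1.0]  # unnormalised weight of k = 0
    for k in range(k_max):
        ratio = (nu + k) ** beta * p / (k + 1) ** alpha  # P(k + 1) / P(k)
        weights.append(weights[-1] * ratio)
    total = sum(weights)
    return [w / total for w in weights]

# alpha = beta = 1 is NB(nu, p): P(0) = (1 - p)^nu and P(1) = nu * p * (1 - p)^nu
nb = ecomp_pmf(2.0, 0.4, 1.0, 1.0)
print(nb[0], nb[1])
# nu = 1 and alpha = beta + 1 is Poisson(p): P(0) = exp(-p)
po = ecomp_pmf(1.0, 2.0, 3.0, 2.0)
print(po[0], math.exp(-2.0))
```

Normalising the accumulated weights avoids having to evaluate the normalizing constant separately, at the cost of a truncation error that is negligible once the weights have decayed.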
2.4 Exponential family
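The pmf in (6) can also be expressed as
$$ P\left(X=k\right)= \exp \left\{k \log p+\beta \log {\left(\nu \right)}_k-\alpha \log k!- \log {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)\right\}, $$
which immediately implies that the ECOMP (ν, p, α, β) distribution belongs to the exponential family with parameters (log p, α, β) when ν is a nuisance parameter or when its value is given.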
2.5 Index of dispersion
The pmf of the ECOMP (ν, p, α, β) distribution in (6) can be seen as a weighted Poisson (p) distribution with weight function w(x) = {Γ(ν + x)}^{β}/(Γ(1 + x))^{α − 1}. As such it will be over (under) dispersed if w(x) is log-convex (log-concave), that is, if \( \frac{d^2}{d{x}^2} \log \left[w(x)\right]\ge \left(\le \right)\;0 \). [See theorem 4 of Kokonendji et al. 2008]
[On using result 6.4.10 page 260 from Abramowitz and Stegun, 1970].
Hence, ECOMP (ν, p, α, β) is over dispersed (i) if α < 1, β ≥ 0 for all ν, (ii) if {α ≥ 1, β > 0} or {α < 1, β < 0} when {0 < ν ≤ 1, β ≤ α ≤ β + 1} or {ν > 1, α ≤ 1}; and under dispersed (i) if α ≥ 1, β < 0 for all ν, (ii) for {α ≥ 1, β > 0} or {α < 1, β < 0} if {0 < ν ≤ 1, α ≥ β + 1} or {ν > 1, α ≥ 1}.
As particular cases of the above result, when β = 1, we can see that the COMNB (ν, p, α) distribution is always over dispersed for {0 < ν ≤ 1, 1 ≤ α < 2} or {ν > 1, α = 1} and under dispersed compared to the COMP distribution for {0 < ν ≤ 1, α ≥ 2}. Similarly, when α = 1, the GCOMP (ν, p, β) distribution is seen to be over dispersed for 0 < β ≤ 1 and under dispersed for β < 0. When ν = 1, we derive that COMP (p, α − β) is over dispersed for α − β < 1, under dispersed for α − β > 1 and equidispersed when α − β = 1. Finally, the new generalized NB distribution with pmf (7) is over dispersed when γ = 1 (which is when it reduces to the negative binomial) and under dispersed if γ > 1.
It can also be checked that ECOMP (ν, p, α, β) is over (under) dispersed for α ≥ β > (≤) 0 w.r.t. COMNB (ν, p, α), and w.r.t. GCOM-Poisson (ν, p, β) it is over (under) dispersed for β ≤ α < 1 (1 < β ≤ α).
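The index of dispersion for any parameter choice can also be evaluated numerically. The sketch below assumes the consecutive-term ratio (ν + k)^β p/(k + 1)^α implied by Appendix B.1 and a truncated support; `dispersion_index` is a hypothetical helper, checked here only against the two reductions whose dispersion is known in closed form (negative binomial and Poisson).

```python
def dispersion_index(nu, p, alpha, beta, k_max=500):
    """Variance-to-mean ratio of the pmf proportional to {(nu)_k}^beta p^k / (k!)^alpha."""
    weights = [1.0]
    for k in range(k_max):
        # ratio of consecutive unnormalised probabilities
        weights.append(weights[-1] * (nu + k) ** beta * p / (k + 1) ** alpha)
    total = sum(weights)
    mean = sum(k * w for k, w in enumerate(weights)) / total
    second = sum(k * k * w for k, w in enumerate(weights)) / total
    return (second - mean ** 2) / mean

# NB(nu = 2, p = 0.4), i.e. alpha = beta = 1: index is 1 / (1 - p) > 1 (over dispersed)
print(dispersion_index(2.0, 0.4, 1.0, 1.0))
# Poisson(2), i.e. beta = 0 and alpha = 1: index is 1 (equidispersed)
print(dispersion_index(1.0, 2.0, 1.0, 0.0))
```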
2.6 Different formulations of ECOMP (ν, p, α, β)
Two different formulations of the proposed distribution are presented in this section.
2.6.1 ECOMP (ν, p, α, β) as a distribution from a queuing set up
Like the COM-Poisson distribution, the ECOMP (ν, p, α, β) distribution can also be derived as the probability of the system being in the k ^{th} state for a queuing system with state-dependent service and arrival rates.
Consider a single server queuing system with state-dependent arrival rate λ _{ k } = (ν + k)^{β} λ and state-dependent service rate μ _{ k } = k ^{α} μ (the k ^{th} state meaning that there are k units in the system), where 1/μ and 1/λ are respectively the normal mean service and mean arrival times for a unit when that unit is the only one in the system; α and β are the pressure coefficients, reflecting the degree to which the service and arrival rates of the system are affected by the system state.
Proposition 1. Under the above set up, where the arrival rate and the service rate increase exponentially as the queue lengthens (i.e. as k increases), the probability of the system being in the k ^{th} state is ECOMP (ν, p, α, β).
Proof: See Appendix B.1
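Proposition 1 can be illustrated numerically: for a birth-death queue the steady-state probabilities satisfy the balance relation π_{k+1} μ_{k+1} = π_k λ_k, and the result should coincide with the closed form {(ν)_k}^β p^k/(k!)^α with p = λ/μ given in Appendix B.1. The sketch below is illustrative only; the function names and the truncation at `k_max` states are assumptions.

```python
import math

def stationary_birth_death(lam, mu, nu, alpha, beta, k_max):
    """Steady state of the queue with arrival rate (nu + k)^beta * lam and service rate k^alpha * mu."""
    pi = [1.0]
    for k in range(k_max):
        arrival = (nu + k) ** beta * lam   # lambda_k
        service = (k + 1) ** alpha * mu    # mu_{k+1}
        pi.append(pi[-1] * arrival / service)
    total = sum(pi)
    return [x / total for x in pi]

def ecomp_closed_form(nu, p, alpha, beta, k_max):
    """Normalised {(nu)_k}^beta p^k / (k!)^alpha on 0..k_max, computed on the log scale."""
    weights = [math.exp(beta * (math.lgamma(nu + k) - math.lgamma(nu))
                        + k * math.log(p) - alpha * math.lgamma(k + 1))
               for k in range(k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]

lam, mu, nu, alpha, beta = 3.0, 2.0, 1.5, 2.0, 1.2
queue = stationary_birth_death(lam, mu, nu, alpha, beta, 60)
closed = ecomp_closed_form(nu, lam / mu, alpha, beta, 60)
print(max(abs(a - b) for a, b in zip(queue, closed)))  # largest pointwise difference
```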
2.6.2 ECOMP (ν, p, α, β) as an exponential combination formulation
The general form of the exponential combination of two pmfs, say f _{1}(x; θ _{1}) and f _{2}(x; θ _{2}), is given by (Atkinson 1970)
$$ f(x)=\frac{{\left\{{f}_1\left(x;{\theta}_1\right)\right\}}^{\beta }{\left\{{f}_2\left(x;{\theta}_2\right)\right\}}^{1-\beta }}{{\displaystyle \sum_x{\left\{{f}_1\left(x;{\theta}_1\right)\right\}}^{\beta }{\left\{{f}_2\left(x;{\theta}_2\right)\right\}}^{1-\beta }}}. $$
This combination of pmfs was suggested by Cox (1961, 1962) for combining two hypotheses (β = 1, i.e. the distribution is f _{1}, and β = 0, i.e. the distribution is f _{2}) in a general model of which they would both be special cases. Inferences about β are made in the usual way, and testing the hypothesis that the value of β is zero or one is equivalent to testing for departures from one model in the direction of the other.
Proposition 2. The ECOMP (ν, p, α, β) distribution is an exponential combination of the NB (ν, λ) and COM-Poisson (μ, θ) distributions, with λ ^{β} μ ^{1 − β} = p and α = θ(1 − β) + β.
Proof: See Appendix B.2.
From the above formulations it is clear that for ECOMP (ν, p, α, β), β close to zero will indicate departure from COM-Poisson towards NB, while β close to one will indicate the reverse. Thus ECOMP (ν, p, α, β) can also be regarded as a natural extension of the COM-Poisson and negative binomial distributions.
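The parameter mapping in Proposition 2 can be verified numerically. The sketch below assumes the NB (ν, λ) pmf is proportional to (ν)_k λ^k/k! and the COM-Poisson (μ, θ) pmf to μ^k/(k!)^θ, raises them to the powers β and 1 − β, normalises, and compares the result with the pmf proportional to {(ν)_k}^β p^k/(k!)^α. Function names and the truncation point are illustrative.

```python
import math

def normalise(log_weights):
    """Turn log-weights into probabilities, subtracting the maximum for stability."""
    m = max(log_weights)
    w = [math.exp(v - m) for v in log_weights]
    s = sum(w)
    return [x / s for x in w]

def exp_combination(nu, lam, mu, theta, beta, k_max):
    """Normalised NB(nu, lam)^beta * COM-Poisson(mu, theta)^(1 - beta) on 0..k_max."""
    logs = []
    for k in range(k_max + 1):
        log_nb = (math.lgamma(nu + k) - math.lgamma(nu)
                  - math.lgamma(k + 1) + k * math.log(lam))
        log_comp = k * math.log(mu) - theta * math.lgamma(k + 1)
        logs.append(beta * log_nb + (1 - beta) * log_comp)
    return normalise(logs)

def ecomp(nu, p, alpha, beta, k_max):
    """Normalised {(nu)_k}^beta p^k / (k!)^alpha on 0..k_max."""
    logs = [beta * (math.lgamma(nu + k) - math.lgamma(nu))
            + k * math.log(p) - alpha * math.lgamma(k + 1)
            for k in range(k_max + 1)]
    return normalise(logs)

nu, lam, mu, theta, beta = 1.5, 0.6, 2.0, 1.8, 0.3
p = lam ** beta * mu ** (1 - beta)     # p = lambda^beta * mu^(1 - beta)
alpha = theta * (1 - beta) + beta      # alpha = theta (1 - beta) + beta
combined = exp_combination(nu, lam, mu, theta, beta, 80)
direct = ecomp(nu, p, alpha, beta, 80)
max_gap = max(abs(x - y) for x, y in zip(combined, direct))
print(max_gap)
```

The normalising constants of the two component pmfs are omitted because they cancel when the combination is renormalised.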
2.7 Log-concavity and modality
Proposition 3. The ECOMP (ν, p, α, β) has a log-concave pmf when {ν > 1, p > 0, α ≥ β}.
Proof: See Appendix B.3.
From the above result the corresponding results for COMNB (ν, p, α) and GCOMP (ν, p, β) can be obtained as particular cases. That is, COMNB (ν, p, α) is log-concave when {ν > 1, p > 0, α ≥ 1} and GCOMP (ν, p, β) is log-concave when {ν > 1, p > 0, β ≤ 1}.
The following two important results follow as a consequence of log-concavity:
If {ν ≥ 1, p > 0, α > β} the ECOMP (ν, p, α, β) distribution

➢ is a strongly unimodal distribution

➢ has an increasing failure rate function
Using the recurrence relation of the probabilities in (10) it is observed that the ECOMP (ν, p, α, β) has

(i) a non-increasing pmf with a unique mode at X = 0 if ν ^{β} p < 1, e.g. for ν = 2, α = 3, β = 2, p should be less than 0.25 to have a unique mode at X = 0.

(ii) a unique mode at X = k if k ^{α}/(ν + k − 1)^{β} < p < (k + 1)^{α}/(ν + k)^{β}, e.g. for ν = 2, α = 3, β = 2, p should be between 1.6875 and 2.560 to have a unique mode at X = 3.

(iii) two modes at X = k and X = k − 1 if (ν + k − 1)^{β} p = k ^{α}. In particular the two modes are at X = 0 and X = 1 if ν ^{β} p = 1, e.g. for ν = 2, α = 3, β = 2, p should be equal to 4.408 to have two modes at X = 5 and X = 6.
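The numerical thresholds in the three examples follow directly from the stated inequalities and can be reproduced as below. The helper names `mode_interval` and `two_mode_p` are hypothetical; the formulas are those given in (ii) and (iii).

```python
def mode_interval(k, nu, alpha, beta):
    """Open interval of p for which X = k (k >= 1) is the unique mode, from condition (ii)."""
    lower = k ** alpha / (nu + k - 1) ** beta
    upper = (k + 1) ** alpha / (nu + k) ** beta
    return lower, upper

def two_mode_p(k, nu, alpha, beta):
    """Value of p giving equal probabilities at X = k - 1 and X = k, from condition (iii)."""
    return k ** alpha / (nu + k - 1) ** beta

nu, alpha, beta = 2, 3, 2
print(1 / nu ** beta)                       # (i): mode at zero when p is below this value
print(mode_interval(3, nu, alpha, beta))    # (ii): range of p for a unique mode at X = 3
print(two_mode_p(6, nu, alpha, beta))       # (iii): p for modes at X = 5 and X = 6
```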
Graphical illustrations of the above three examples are presented in the first plots of Fig. 1. It is interesting to note that the distribution may be bimodal with one of the modes always at zero, as shown in the last two plots in Fig. 1.
Proposition 4. ECOMP (ν, p, α, β) has a log-convex pmf for {0 < ν ≤ 1, α = β}.
Proof: See Appendix B.4.
The following important results follow as a consequence of log-convexity:
If {ν ≤ 1, p > 0, α = β} the ECOMP (ν, p, α, β) distribution with pmf in (7)

➢ is an infinitely divisible distribution (see Warde and Katti 1971), hence a discrete compound Poisson distribution (see page 409 of Gómez-Déniz et al. 2011)

➢ has a decreasing failure rate function, hence an increasing mean residual life function

➢ has an upper bound for the variance of p ν ^{β} (using the result on page 410 of Gómez-Déniz et al. 2011)
2.8 Moments
The r ^{th} factorial moment E(X ^{[r]}) = μ ^{[r]} of the ECOMP (ν, p, α, β) distribution is given by
where the second expression in terms of the hypergeometric function is for the case when α, β are both positive integers.
Since the ECOMP (ν, p, α, β) distribution is a member of the exponential family (see Section 2.4), the mean is given by differentiating the logarithm of the normalizing constant with respect to log p. Hence an asymptotic approximation for the mean is obtained by differentiating the logarithm of the function (9) as
This function approximates the mean of the ECOMP (ν, p, α, β) distribution for large p and small α − β, where it is difficult to compute the approximation by truncation. A numerical illustration of this asymptotic approximation is presented in Appendix A.2.
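The exponential family identity used above, E(X) = p · d/dp log S with S the normalizing constant, can be checked against the directly computed mean. This sketch uses a truncated sum for S (with terms {(ν)_k}^β p^k/(k!)^α as in Appendix B.1) and a central finite difference; it does not implement the asymptotic formula of the paper, and all function names are illustrative.

```python
import math

def log_terms(nu, p, alpha, beta, k_max=400):
    """Logs of the assumed series terms {(nu)_k}^beta p^k / (k!)^alpha for k = 0..k_max."""
    return [beta * (math.lgamma(nu + k) - math.lgamma(nu))
            + k * math.log(p) - alpha * math.lgamma(k + 1)
            for k in range(k_max + 1)]

def log_S(nu, p, alpha, beta):
    """Log of the truncated normalizing constant via a log-sum-exp."""
    logs = log_terms(nu, p, alpha, beta)
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def mean_direct(nu, p, alpha, beta):
    """Mean computed as sum of k * P(X = k)."""
    logs = log_terms(nu, p, alpha, beta)
    m = max(logs)
    w = [math.exp(v - m) for v in logs]
    return sum(k * x for k, x in enumerate(w)) / sum(w)

def mean_from_derivative(nu, p, alpha, beta, h=1e-5):
    """Mean computed as p * d/dp log S, with a central difference for the derivative."""
    slope = (log_S(nu, p + h, alpha, beta) - log_S(nu, p - h, alpha, beta)) / (2 * h)
    return p * slope

print(mean_direct(1.5, 2.0, 2.0, 1.2), mean_from_derivative(1.5, 2.0, 2.0, 1.2))
```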
3 Reliability characteristics and stochastic ordering
3.1 Survival and failure rate functions
The survival function is given by
Alternatively, S(t) can also be expressed as
The failure rate function is given by
where the second expression in terms of hypergeometric function is for the case when α, β are positive integers.
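For numerical work the survival and failure rate functions can be obtained from the pmf by cumulative summation. The sketch below assumes the conventions S(t) = P(X ≥ t) and r(t) = P(X = t)/P(X ≥ t), which may differ from the exact form of the expressions in the paper, and a truncated support; function names are hypothetical.

```python
def ecomp_pmf(nu, p, alpha, beta, k_max=500):
    """Probabilities on 0..k_max from the ratio (nu + k)^beta * p / (k + 1)^alpha."""
    weights = [1.0]
    for k in range(k_max):
        weights.append(weights[-1] * (nu + k) ** beta * p / (k + 1) ** alpha)
    total = sum(weights)
    return [w / total for w in weights]

def survival_and_failure_rate(pmf):
    """Return lists S(t) = P(X >= t) and r(t) = P(X = t) / P(X >= t) for t = 0, 1, ..."""
    survival = []
    tail = 0.0
    for prob in reversed(pmf):   # accumulate the upper tail from the right
        tail += prob
        survival.append(tail)
    survival.reverse()
    failure = [prob / s for prob, s in zip(pmf, survival) if s > 0]
    return survival, failure

# nu = 1, alpha = beta = 1 is the geometric distribution, whose failure rate is constant (1 - p)
surv, rate = survival_and_failure_rate(ecomp_pmf(1.0, 0.4, 1.0, 1.0))
print(rate[:5])
```

The geometric case is a convenient check because it sits on the boundary between the increasing and decreasing failure rate regimes discussed in Section 2.7.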
3.2 Stochastic orderings
An rv X with pmf P(X = n) is said to be smaller than another rv Y with pmf P(Y = n) in the likelihood ratio order, that is X ≤ _{ lr } Y, if P(Y = n)/P(X = n) increases in n over the union of the supports of X and Y. Further, X ≤ _{ lr } Y implies that X is smaller than Y in the hazard rate order and subsequently in the mean residual life (MRL) order (see Gupta et al. 2014).
Theorem 1. X ~ ECOMP (ν, p, α, β) is smaller than Y ~ COMNB (ν, p, α) in the likelihood ratio order i.e. X ≤ _{ lr } Y when β < 1.
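Proof: If X ~ ECOMP (ν, p, α, β) and Y ~ COMNB (ν, p, α), then
$$ \frac{P\left(Y=n\right)}{P\left(X=n\right)}=\frac{{}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)}{{}_1H_{\alpha -1}\left(\nu;\;1;p\right)}\;{\left\{{\left(\nu \right)}_n\right\}}^{1-\beta }. $$
This is clearly increasing in n as β < 1 (Definition 1.C.1 of Chapter 1, Shaked and Shanthikumar 2007 and Gupta et al. 2014). Hence the result is proved.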
As an implication of theorem 1, we get X ≤ _{ hr } Y ⇒ X ≤ _{ MRL } Y, for β < 1.
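Theorem 2. X ~ ECOMP (ν, p, α, β) is smaller than Y ~ GCOMP (ν, p, β) in the likelihood ratio order, i.e. X ≤_{ lr } Y, when α > 1.
Proof: If X ~ ECOMP (ν, p, α, β) and Y ~ GCOMP (ν, p, β), then
$$ \frac{P\left(Y=n\right)}{P\left(X=n\right)}=\frac{{}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)}{{}_1S_0^{\beta}\left(\nu;\;1;p\right)}\;{\left(n!\right)}^{\alpha -1}. $$
This is clearly increasing in n as α > 1 (Definition 1.C.1 of Chapter 1, Shaked and Shanthikumar 2007 and Gupta et al. 2014). Hence the result is proved.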
As an implication of theorem 2, we get X ≤ _{ hr } Y ⇒ X ≤ _{ MRL } Y, for α > 1.
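The likelihood ratio ordering of Theorem 1 can be illustrated by checking that P(Y = n)/P(X = n) is non-decreasing over a range of n. The sketch below assumes pmfs proportional to {(ν)_k}^β p^k/(k!)^α (COMNB being the case β = 1), works on the log scale, and uses ν > 1 so that the Pochhammer symbol increases from n = 0; function names and the finite range checked are assumptions.

```python
import math

def log_pmf(nu, p, alpha, beta, k_max=300):
    """Log-probabilities on 0..k_max of the pmf proportional to {(nu)_k}^beta p^k / (k!)^alpha."""
    logs = [beta * (math.lgamma(nu + k) - math.lgamma(nu))
            + k * math.log(p) - alpha * math.lgamma(k + 1)
            for k in range(k_max + 1)]
    m = max(logs)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logs))
    return [v - log_norm for v in logs]

def is_lr_smaller(log_px, log_py, n_max=30):
    """True if P(Y = n) / P(X = n) is non-decreasing for n = 0..n_max."""
    log_ratio = [log_py[n] - log_px[n] for n in range(n_max + 1)]
    return all(b >= a for a, b in zip(log_ratio, log_ratio[1:]))

nu, p, alpha, beta = 1.5, 0.8, 2.0, 0.5
x = log_pmf(nu, p, alpha, beta)   # ECOMP(nu, p, alpha, beta) with beta < 1
y = log_pmf(nu, p, alpha, 1.0)    # COMNB(nu, p, alpha)
print(is_lr_smaller(x, y))        # ECOMP is lr-smaller than COMNB
print(is_lr_smaller(y, x))        # the reverse ordering does not hold
```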
4 Numerical examples
To fit the proposed distribution, we have to estimate the parameters (ν, p, α, β) in (6). Maximum likelihood (ML) estimation is often used for fitting to real data, but the log-likelihood function of the proposed distribution
where f _{ i } is the observed frequency of the i ^{th} observed value (event), \( N={\displaystyle \sum_{i=1}^k{f}_i} \), and k is the highest observed value, has some local maximum points for some datasets; that is, the likelihood equations do not always have a unique solution. Therefore, we use profile likelihood estimation. We first consider maximum likelihood estimation with the parameter ν fixed, finding the maximum point \( \left({\widehat{p}}_{\nu },{\widehat{\alpha}}_{\nu },{\widehat{\beta}}_{\nu}\right) \) of the function (13). The maximum point \( \left({\widehat{p}}_{\nu },{\widehat{\alpha}}_{\nu },{\widehat{\beta}}_{\nu}\right) \) is uniquely determined because the proposed distribution belongs to the exponential family when ν is fixed. For finding \( \left({\widehat{p}}_{\nu },{\widehat{\alpha}}_{\nu },{\widehat{\beta}}_{\nu}\right) \) computationally, it is convenient to use some initial values. Simple initial values can be obtained as follows. Putting c _{ x } = P(X = x + 1)/P(X = x) and d _{ x } = log(c _{ x + 1}/c _{ x }), where X is an rv following the ECOMP (ν, p, α, β) distribution, we have the equation
For given ν, we choose the integer k such that A _{ k }(ν) ≠ 0 and put
where P(X = x) is substituted with f _{ x } in d _{ x }. Then we can obtain the initial values \( \left({\overset{\sim }{p}}_k\left(\nu \right),{\overset{\sim }{\alpha}}_k\left(\nu \right),{\overset{\sim }{\beta}}_k\left(\nu \right)\right) \) for (p, α, β) as
where l is the lowest observed value (e.g. l = 0 for data that are neither censored nor truncated). These values are available even for the truncated version of the ECOMP (ν, p, α, β) distribution. Then, by studying the behavior of \( \mathrm{L}\left(\nu, {\widehat{p}}_{\nu },{\widehat{\alpha}}_{\nu },{\widehat{\beta}}_{\nu}\right) \) as ν varies, we find the range of ν where the function attains its global maximum. Over this range, the maximum point of the function (13) gives the ML estimates \( \left(\widehat{\nu},\widehat{p},\widehat{\alpha},\widehat{\beta}\right) \).
By using this method, we fit the proposed distribution to three datasets and compare it with NB (r, p), COMP (θ, p), COMNB (ν, p, α) and GCOMP (ν, p, β). We also fit the Delaporte distribution, which is derived from the convolution of an NB (r, p) rv and a Poisson (λ) rv, and some mixed Poisson distributions: mixing with the generalized gamma distribution of Agarwal and Kalla (1996) with parameters (δ, m, α, n), mixing with the generalized inverse Gaussian distribution of Jorgensen (1982) with parameters (χ, η, ω, λ), and mixing with the generalized exponential distribution of Ong and Lee (1986) with parameters (v, a, d, β). These distributions are derived as generalized negative binomial distributions and used for long-tailed count data. Detailed studies are given in Gupta and Ong (2005). Here we show only the best fitting distribution among those considered in Gupta and Ong (2005). The performances of the various distributions are compared using the χ ^{2} goodness of fit and the Akaike Information Criterion (AIC). Following Burnham and Anderson (2004) we look at the difference Δ_{ i } = AIC_{ i } − AIC_{min}, where AIC_{min} is the minimum of the AIC values of all the fitted models and AIC_{ i } is that of the i ^{th} model. According to Burnham and Anderson (2004), models having Δ_{ i } ≤ 2 have substantial support (evidence) and those with Δ_{ i } ≥ 4 have considerably less support. For computing the χ ^{2} goodness of fit statistics we group the cells whose expected number is less than 5 such that the expected number in each grouped cell is not less than 5.
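The AIC comparison described above amounts to a few lines of arithmetic. In the sketch below, AIC is the usual 2 × (number of parameters) − 2 × (maximized log-likelihood); the model names and AIC values are made-up placeholders rather than results from the paper, and the label "intermediate" for 2 < Δ < 4 is an assumption added for completeness.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion for a fitted model."""
    return 2 * n_params - 2 * log_likelihood

def aic_differences(aic_by_model):
    """Delta_i = AIC_i - AIC_min for every model."""
    best = min(aic_by_model.values())
    return {name: value - best for name, value in aic_by_model.items()}

def support_label(delta):
    """Thresholds quoted from Burnham and Anderson (2004) in the text."""
    if delta <= 2:
        return "substantial"
    if delta >= 4:
        return "considerably less"
    return "intermediate"  # label chosen here for the gap between the two thresholds

# Placeholder AIC values for three hypothetical fitted models
example = {"model_A": 100.0, "model_B": 101.5, "model_C": 107.0}
for name, delta in aic_differences(example).items():
    print(name, delta, support_label(delta))
```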
4.1 Corbet’s Malayan butterflies
The first example is the frequency distribution of Corbet’s Malayan butterflies with zeros (Corbet 1942). Corbet caught altogether 620 species, but he also estimated that the total butterfly fauna of the area contained 924 species, so that 304 species were missing from the collection and are treated as count zero. In this dataset, counts of more than 24 are grouped as 25+, so we use the log-likelihood function of the form
where X is the rv of the fitted distribution.
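A log-likelihood with a grouped upper tail can be sketched as follows: exact counts contribute log P(X = i) and the grouped cell contributes log P(X ≥ c), with c = 25 in the dataset described above. This is an illustrative form under the assumption that the pmf is proportional to {(ν)_k}^β p^k/(k!)^α with a truncated support; it is not claimed to be the exact expression used by the authors, and the function name is hypothetical.

```python
import math

def censored_log_likelihood(freq, tail_count, nu, p, alpha, beta, k_max=500):
    """Log-likelihood with freq[i] observations equal to i (i = 0..c-1) and tail_count observations >= c."""
    weights = [1.0]
    for k in range(k_max):
        weights.append(weights[-1] * (nu + k) ** beta * p / (k + 1) ** alpha)
    total = sum(weights)
    c = len(freq)  # first value that falls into the grouped tail cell
    loglik = sum(f * math.log(weights[i] / total) for i, f in enumerate(freq) if f > 0)
    if tail_count > 0:
        loglik += tail_count * math.log(sum(weights[c:]) / total)
    return loglik

# Poisson(2) check (nu = 1, alpha = 1, beta = 0): one observation recorded only as "3 or more"
print(censored_log_likelihood([0, 0, 0], 1, 1.0, 2.0, 1.0, 0.0))
print(math.log(1 - 5 * math.exp(-2.0)))  # log P(X >= 3) for Poisson(2)
```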
Comparing the performance of the distributions presented in Table 1, we see that the Delaporte distribution gives the best fit, marginally better than the ECOMP distribution in terms of AIC and χ ^{2} goodness of fit, but the value of Δ_{ i } suggests that the ECOMP distribution also has substantial support (evidence) for the data. Both of these distributions give much better fits for the counts 0 and 1 and for the tail part 25+ compared to the rest.
Moreover, it can be observed that the ML estimate \( \widehat{\alpha} \) of the COMNB distribution and the ML estimate \( \widehat{\beta} \) of the GCOMP distribution show that these two distributions reduce to the negative binomial distribution, while the proposed ECOMP distribution does not seem to reduce to the negative binomial distribution. Indeed, the likelihood ratio test for H _{0}: negative binomial distribution (α = β = 1) vs H _{1}: ECOMP distribution (α ≠ 1 or β ≠ 1) rejects the negative binomial distribution (p-value is 0.001). So the ECOMP distribution brings a substantial improvement in fitting this data set over both the COMNB and GCOMP distributions.
4.2 The spots in southern pine beetle
The second example is the frequency data of the number of spots (k) in southern pine beetle, Dendroctonus frontalis Zimmermann (Coleoptera: Scolytidae), in Southeast Texas (Lin 1985). Table 2 shows the fitting results, where PoiGE denotes the mixed Poisson distribution with the generalized exponential distribution. From the χ ^{2} goodness of fit and AIC, the GCOMP distribution gives the best fit among the fitted distributions. However, the value of Δ _{ i } suggests that the ECOMP distribution gives an equally good fit to the data. From the estimated parameters of the ECOMP distribution, we can see that the fitted ECOMP distribution reduces to the new generalization of the NB distribution given in equation (7) with estimated parameters \( \widehat{\nu}=0.002,\;\widehat{p}=0.69,\;\widehat{\gamma}=0.28 \). Further, by virtue of Proposition 2 in Section 2.6.2, we can conclude that the fitted ECOMP distribution reduces to an exponential combination of NB (0.003, λ) and Geometric (μ) in the ratio 0.28:0.72, where λ and μ can be calculated using the formula given in Section 2.6.2.
4.3 Borrowing library books
The third example shows the number of books that were borrowed k times (k ≥ 1) from the long loan collection at Sussex University over the period of a year (Burrell and Cane 1982). For fitting to this dataset, we consider the zero-truncation of each distribution. Table 3 shows the fitting results. From the χ ^{2} goodness of fit and AIC, the zero-truncation of the ECOMP distribution gives the best fit among the fitted distributions. The values of Δ_{ i } suggest that the COMNB distribution also has good support (evidence), while the rest of the models have considerably less support for the data. Here we interpret the size of the queue in Section 2.6.1 as the popularity of books. Then, from the estimated parameters of the ECOMP distribution, we see that new interest is hard to increase but the popularity is hard to decrease for a book which is borrowed many times. This might be because, as a book is borrowed more times, there are fewer opportunities to borrow it.
5 Concluding remarks
The extended Conway-Maxwell-Poisson distribution proposed here unifies the COMNB and GCOMP distributions, which were recently introduced to add more flexibility to the COM-Poisson distribution. The proposed distribution, with its additional parameter, has more flexibility in terms of its tail behavior and dispersion level. Further, it also arises from a queuing theory set up and as an exponential combination of the negative binomial and COM-Poisson distributions, and has many interesting properties. It is therefore envisaged that the ECOMP distribution has potential for modeling a variety of count data.
References
Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. 9^{th} Print. Dover, New York (1970)
Agarwal, S.K., Kalla, S.L.: A generalized gamma distribution and its application in reliability. Commun. Stat. Theory Methods 25, 1, 201–210 (1996)
Atkinson, A.C.: A method for discriminating between models. J. R. Stat. Soc. Series B (Methodological) 32, 3, 323–353 (1970)
Bleistein, N., Handelsman, R.A.: Asymptotic expansions of integrals. Dover, New York (1986)
Burnham, K.P., Anderson, D.R.: Multimodel Inference: Understanding AIC and BIC in Model Selection. Sociol. Methods Res. 33, 2, 261–304 (2004)
Burrell, Q.L., Cane, V.R.: The analysis of library data. J. R. Stat. Soc., Series A 145, 439–471 (1982)
Chakraborty, S., Ong, S.H.: A COM-type generalization of the negative binomial distribution. Accepted in April 2014 (available online since 07 November 2015), to appear in Communications in Statistics-Theory and Methods
Conway, R.W., Maxwell, W.L.: A queueing model with state dependent service rates. J Industrl Engng 12, 132–136 (1962)
Corbet, A.S.: The distribution of butterflies in the Malay peninsula. Proc. R. Entomol. Soc. London, Series A, General Entomology 16, 101–116 (1942)
Cox, D.R.: Tests of separate families of hypotheses. Proc. 4th Berkeley Symp. 1, 105–123 (1961)
Cox, D.R.: Further results on tests of separate families of hypotheses. J. R. Statist. Soc. B 24, 406–424 (1962)
Gómez-Déniz, E., María Sarabia, J., Calderín-Ojeda, E.: A new discrete distribution with actuarial applications. Insur. Math. Econ. 48, 406–412 (2011)
Gupta, R.C., Ong, S.H.: Analysis of long-tailed count data by Poisson mixtures. Commun. Stat. Theory Methods 34, 557–574 (2005)
Gupta, P.L., Gupta, R.C., Tripathi, R.C.: On the monotonic properties of discrete failure rates. J. Stat. Plan. Inference 65, 255–268 (1997)
Gupta, R.C., Sim, S.Z., Ong, S.H.: Analysis of discrete data by Conway-Maxwell Poisson distribution. AStA Adv. Stat. Anal. 98, 327–343 (2014)
Imoto, T.: A generalized Conway-Maxwell-Poisson distribution which includes the negative binomial distribution. Appl. Math. Comput. 247, 824–834 (2014)
Johnson, N.L., Kemp, A.W., Kotz, S.: Univariate discrete distributions. Wiley, New York (2005)
Jorgensen, B.: Statistical properties of the generalized inverse Gaussian distribution. Lecture Notes in Statistics, Springer-Verlag, New York (1982)
Kokonendji, C.C., Mizère, D., Balakrishnan, N.: Connections of the Poisson weight function to overdispersion and underdispersion. J. Stat. Plan. Inference 138, 1287–1296 (2008)
Lin, S.-K.: Characterization of lightning as a disturbance to the forest ecosystem in East Texas. M.Sc. thesis. Texas A & M University, College Station (1985)
Minka, T.P., Shmueli, G., Kadane, J.B., Borle, S., Boatwright, P.: Computing with the COM-Poisson distribution. Technical Report 776, Department of Statistics, Carnegie Mellon University, http://repository.cmu.edu/cgi/viewcontent.cgi?article=1174&context=statistics (2003)
Ong, S.H., Lee, P.A.: On a generalized noncentral negative binomial distribution. Commun. Stat. Theory Methods 15, 1065–1079 (1986)
Shaked, M., Shanthikumar, J.G.: Stochastic orders. Springer Verlag, New York (2007)
Warde, W.D., Katti, S.K.: Infinite divisibility of discrete distributions II. Ann. Math. Stat. 42, 3, 1088–1090 (1971)
Acknowledgments
The corresponding author Prof. Subrata Chakraborty would like to thank the Editors-in-Chief, Prof. Felix Famoye and Prof. Carl Lee, for the invitation to write a paper for this esteemed Journal. Both authors acknowledge the comments and suggestions of the editor and both reviewers, which led to substantial improvement in the presentation of the work.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
SC conceptually developed the proposed distribution with related mathematical results of the paper and drafted the manuscript. TI developed the Sections 2.2, 2.8 and 4 of the manuscript. Both authors read and approved the final manuscript.
Appendix
A. Approximations of the normalizing constant \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) and the mean
A.1 Asymptotic approximation of the normalizing constant \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) using Laplace’s method
Defining \( i=\sqrt{-1} \), we have the identity for nonnegative integers n and k
This leads to the identities
From these two identities, we get the formula for integer values α ≥ 0 and β ≤ 0
Changing the variables iz _{ l } = ix _{ l } + log p/(α − β) and then applying Laplace’s method for the approximation of multiple integrals, we obtain the formula (9).
The formula (9) has been derived for integer values α ≥ 1 and − β ≥ 0, but numerical studies suggest that it also holds for 0 < α − β < 1 and p > 1, where it is difficult to compute \( {}_1S_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) by the truncated approximation (8). Table 4 gives the percentage errors \( 100\left\{{}_1\overset{\sim }{S}_{\alpha -1}^{\beta}\left(\nu;\;1;p\right)-{}_1S_{\alpha -1,m}^{\beta}\left(\nu;\;1;p\right)\;\right\}/{}_1S_{\alpha -1,m}^{\beta}\left(\nu;\;1;p\right) \), with m = 18000 such that R _{ m }(ν, p, α, β) < 10^{− 28}, where \( {}_1\overset{\sim }{S}_{\alpha -1}^{\beta}\left(\nu;\;1;p\right) \) is the r.h.s. of the formula (9).
A.2 Asymptotic approximation of the mean
A numerical illustration of the performance of the asymptotic approximation formula of mean in equation (12) is provided in Table 5.
B. Proof of the propositions
B.1 Proof of proposition 1
Following Conway and Maxwell (1962), the system differential difference equations are given by
and
Let λ/μ = p. Then from (14) and (15) we get
Now as Δ → 0 we get
Assuming a steady state (i.e. \( {P}_k^{\prime}(t)=0 \) for all k) we get
Putting k = 1 we get
Similarly, for k = 2 we get
In general, \( {P}_k(t)=\frac{{\left\{{\left(\nu \right)}_k\right\}}^{\beta }}{{\left(k!\right)}^{\alpha }}{p}^k\;{P}_0(t) \), where \( {P}_0(t)=1/{\displaystyle \sum_{i=0}^{\infty}\left\{\frac{{\left\{{\left(\nu \right)}_i\right\}}^{\beta }}{{\left(i!\right)}^{\alpha }}{p}^i\right\}} \).
Since we have assumed a steady state (i.e. \( {P}_k^{\prime}(t)=0 \) for all k), P _{ k }(t) can be replaced by P _{ k }.
B.2 Proof of proposition 2
The probability function resulting from the exponential combination of NB (ν, λ) and COM-Poisson (μ, θ) is given by
substituting λ ^{β} μ ^{1 − β} = p and α = θ(1 − β) + β
This is the pmf of ECOMP (ν, p, α, β).
B.3 Proof of proposition 3
For a distribution to be logconcave we must have (see Gupta et al. 1997)
For ECOMP (ν, p, α, β), \( \Delta\;\eta (t)=p\frac{{\left(\nu +t\right)}^{\beta }{\left(t+2\right)}^{\alpha }-{\left(\nu +t+1\right)}^{\beta }{\left(t+1\right)}^{\alpha }}{{\left(t+1\right)}^{\alpha }{\left(t+2\right)}^{\alpha }} \)
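The sign of Δη(t) can be scanned numerically for given parameter values. The sketch below assumes Δη(t) equals the difference of consecutive probability ratios, P(t + 1)/P(t) − P(t + 2)/P(t + 1), with P(t + 1)/P(t) = (ν + t)^β p/(t + 1)^α, so that Δη(t) ≥ 0 for all t corresponds to log-concavity and Δη(t) ≤ 0 to log-convexity; only a finite range of t is checked, and the function names are hypothetical.

```python
def delta_eta(t, nu, p, alpha, beta):
    """P(t+1)/P(t) - P(t+2)/P(t+1) under the assumed ratio (nu + t)^beta * p / (t + 1)^alpha."""
    return p * ((nu + t) ** beta / (t + 1) ** alpha
                - (nu + t + 1) ** beta / (t + 2) ** alpha)

def is_log_concave(nu, p, alpha, beta, t_max=200):
    """Check delta_eta(t) >= 0 for t = 0..t_max (a finite check, not a proof)."""
    return all(delta_eta(t, nu, p, alpha, beta) >= 0 for t in range(t_max + 1))

def is_log_convex(nu, p, alpha, beta, t_max=200):
    """Check delta_eta(t) <= 0 for t = 0..t_max (a finite check, not a proof)."""
    return all(delta_eta(t, nu, p, alpha, beta) <= 0 for t in range(t_max + 1))

print(is_log_concave(2.0, 1.5, 2.0, 1.0))  # nu > 1 and alpha >= beta
print(is_log_convex(0.5, 0.5, 1.0, 1.0))   # 0 < nu <= 1 and alpha = beta
```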
B.4 Proof of proposition 4
ECOMP (ν, p, α, β) has a log-convex probability mass function if Δ η(t) ≤ 0. That is
Since α ≥ β the inequality in (16) cannot hold for ν > 1.
Now for 0 < ν ≤ 1 the inequality in (16) implies
⇒ α/β ≤ 1. Which implies α = β since α ≥ β.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 COM-Poisson
 COM-Negative binomial
 Generalized COM-Poisson
 State dependent service and arrival rate queues
 Laplace method
Mathematics Subject Classification (2010)
 62E15
 60K25
 62N05