The Kumaraswamy-geometric distribution
Journal of Statistical Distributions and Applications volume 1, Article number: 17 (2014)
Abstract
In this paper, the Kumaraswamy-geometric distribution, which is a member of the T-geometric family of discrete distributions, is defined and studied. Some properties of the distribution, such as moments, probability generating function, hazard and quantile functions, are studied. The method of maximum likelihood estimation is proposed for estimating the model parameters. Two real data sets are used to illustrate the applications of the Kumaraswamy-geometric distribution.
AMS 2010 Subject Classification: 60E05; 62E15; 62F10; 62P20
Introduction
Eugene et al. ([2002]) introduced the beta-generated family of univariate continuous distributions. If X is a random variable with cumulative distribution function (CDF) F(x), the CDF for the beta-generated family is obtained by applying the inverse probability transformation to the beta density function. The CDF for the beta-generated family of distributions is given by
$$G(x)=\frac{1}{B(\alpha,\beta)}\int_{0}^{F(x)}t^{\alpha-1}(1-t)^{\beta-1}\,dt,\quad 0<\alpha,\beta<\infty, \tag{1}$$
where B(α,β)=Γ(α)Γ(β)/Γ(α+β). The corresponding probability density function (PDF) is given by
$$g(x)=\frac{1}{B(\alpha,\beta)}F^{\alpha-1}(x)\,[1-F(x)]^{\beta-1}f(x), \tag{2}$$
where f(x) is the PDF of X.
Eugene et al. ([2002]) used a normal random variable X to define and study the beta-normal distribution. Following the paper by Eugene et al. ([2002]), many other authors have defined and studied a number of beta-generated distributions, using various forms of known F(x). See, for example, the beta-Gumbel distribution by Nadarajah and Kotz ([2004]), beta-Weibull distribution by Famoye et al. ([2005]), beta-exponential distribution by Nadarajah and Kotz ([2006]), beta-gamma distribution by Kong et al. ([2007]), beta-Pareto distribution by Akinsete et al. ([2008]), beta-Laplace distribution by Cordeiro and Lemonte ([2011]), beta-generalized Weibull distribution by Singla et al. ([2012]), and beta-Cauchy distribution by Alshawarbeh et al. ([2013]), amongst others. After the paper by Jones ([2009]) on the tractability properties of the Kumaraswamy's distribution (Kumaraswamy [1980]), Cordeiro and de Castro ([2011]) replaced the classical beta generator distribution with the Kumaraswamy's distribution and introduced the Kumaraswamy generated family. Kumaraswamy generated distributions whose statistical properties have been studied in detail include the Kumaraswamy generalized gamma distribution by de Pascoa et al. ([2011]), Kumaraswamy log-logistic distribution by de Santana et al. ([2012]) and Kumaraswamy Gumbel distribution by Cordeiro et al. ([2012]). Alexander et al. ([2012]) replaced the beta generator distribution with the generalized beta type I distribution. The authors referred to this form as the generalized beta-generated distributions (GBGD), and the generator has three shape parameters.
The above technique of generating distributions is possible only when the generator distributions are continuous and the random variable of the generator lies between 0 and 1. In a recent work, Alzaatreh et al. ([2013b]) proposed a new method for generating families of distributions, referred to by the authors as the T-X family, where a continuous random variable T is the transformed, and any random variable X is the transformer. See also Alzaatreh et al. ([2012a], [2013a]). These works opened a wide range of techniques for generating distributions of random variables with supports on $\mathbb{R}$. The T-X family enables one to easily generate not only continuous distributions but discrete distributions as well. As a result, Alzaatreh et al. ([2012b]) defined and studied the T-geometric family, whose members are the discrete analogues of the distribution of the random variable T.
Suppose F(x) denotes the CDF of any random variable X and r(t) denotes the PDF of a continuous random variable T with support [a, b]. Alzaatreh et al. ([2013b]) gave the CDF of the T-X family of distributions as
$$G(x)=\int_{a}^{W(F(x))}r(t)\,dt=R\{W(F(x))\}, \tag{3}$$
where R(t) is the CDF of the random variable T, and W(F(x)) ∈ [a, b] is a non-decreasing and absolutely continuous function. Common supports [a, b] are [0, 1], (0, ∞), and (−∞, ∞). Alzaatreh et al. ([2013b]) studied in some detail the case of a non-negative continuous random variable T with support (0, ∞). With this technique, it is much easier to generate any discrete distribution. If X is a discrete random variable, the T-X family is a family of discrete distributions, transformed from the non-negative continuous random variable T. The probability mass function (PMF) of the T-X family of discrete distributions may now be written as
$$g(x)=G(x)-G(x-1)=R\{W(F(x))\}-R\{W(F(x-1))\}. \tag{4}$$
The T-geometric family studied in Alzaatreh et al. ([2012b]) is a special case of (4), obtained by defining W(F(x)) = −ln(1−F(x)). The rest of the paper is outlined as follows: Section 2 defines the Kumaraswamy-geometric distribution (KGD). In Section 3, we discuss some properties of the distribution. In Section 4, the moments of KGD are provided, while Section 5 contains the hazard function and the Shannon entropy. In Section 6, we discuss the maximum likelihood method for estimating the parameters of the distribution. A simulation study is also discussed. Section 7 details the results of applications of the distribution to two real data sets with comparison to other distributions, and Section 8 contains some concluding remarks.
The Kumaraswamy-geometric distribution
Following the T-X generalization technique by Alzaatreh et al. ([2013b]), we allow the transformed random variable T to have the Kumaraswamy's distribution, the transformer random variable X to have the geometric distribution, and W(F(x))=F(x).
Kumaraswamy ([1980]) proposed and discussed a probability distribution for handling double-bounded random processes with varied hydrological applications. Let T be a random variable with the Kumaraswamy's distribution. The PDF and CDF are defined, respectively, as
$$r(t)=\alpha\beta\,t^{\alpha-1}(1-t^{\alpha})^{\beta-1},\quad 0<t<1, \tag{5}$$
$$R(t)=1-(1-t^{\alpha})^{\beta},\quad 0<t<1, \tag{6}$$
where both α>0 and β>0 are the shape parameters. The beta and Kumaraswamy distributions share similar properties. For example, the Kumaraswamy's distribution, also referred to as the minimax distribution, is unimodal, uniantimodal, increasing, decreasing or constant depending on the values of its parameters. A more detailed description, background and genesis, and properties of Kumaraswamy's distribution are outlined in Jones ([2009]). The author highlighted several advantages of the Kumaraswamy's distribution over the beta distribution, namely its simple normalizing constant, simple explicit formulas for the distribution and quantile functions, and simple random variate generation procedure.
The geometric distribution, also referred to as the Pascal distribution, is a special case of the negative binomial distribution. It is thought of as the discrete analogue of the continuous exponential distribution (Johnson et al. [2005]). Many characterizations of the geometric distribution are analogous to the characterization of the exponential distribution. The geometric distribution has been used extensively in the literature in modeling the distribution of the lengths of waiting times. If X is a random variable having the geometric distribution with parameter p, the PMF of X may be written as
$$P(X=x)=p\,q^{x},\quad x=0,1,2,\dots,\quad q=1-p, \tag{7}$$
where p is the probability of success in a single Bernoulli trial. The CDF of the geometric distribution is given by
$$F(x)=1-q^{x+1},\quad x=0,1,2,\dots \tag{8}$$
The Kumaraswamy-geometric distribution (KGD) is defined by using Equation (3) with a=0, where the random variable T has the Kumaraswamy's distribution with the CDF (6) and the random variable X has the geometric distribution with the CDF (8). Since the random variable T is defined on (0,1), we use the function W(F(x))=F(x) in (3) to obtain the CDF of KGD as
$$G(x)=1-\left[1-(1-q^{x+1})^{\alpha}\right]^{\beta},\quad x=0,1,2,\dots \tag{9}$$
The corresponding PMF for the KGD now becomes
$$g(x)=\left[1-(1-q^{x})^{\alpha}\right]^{\beta}-\left[1-(1-q^{x+1})^{\alpha}\right]^{\beta},\quad x=0,1,2,\dots, \tag{10}$$
by using Equation (4). Thus, a random variable X having the PMF expressed in Equation (10) is said to follow the Kumaraswamy-geometric distribution with parameters α, β and q, or simply X ~ KGD(α, β, q). One can show that the PMF in Equation (10) satisfies $\sum_{x=0}^{\infty}g(x)=1$ by telescopic cancellation.
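The definition can be checked numerically. The following Python sketch assumes the closed forms G(x) = 1 − [1 − (1 − q^{x+1})^α]^β and g(x) = G(x) − G(x − 1), which follow from the Kumaraswamy CDF evaluated at the geometric CDF. The function names `kgd_cdf` and `kgd_pmf` and the parameter values are illustrative choices, not part of the paper.

```python
def kgd_cdf(x, alpha, beta, q):
    """CDF of the KGD at integer x, assuming G(x) = 1 - [1 - (1 - q**(x+1))**alpha]**beta."""
    if x < 0:
        return 0.0
    return 1.0 - (1.0 - (1.0 - q ** (x + 1)) ** alpha) ** beta


def kgd_pmf(x, alpha, beta, q):
    """PMF of the KGD, computed as the CDF difference G(x) - G(x - 1)."""
    return kgd_cdf(x, alpha, beta, q) - kgd_cdf(x - 1, alpha, beta, q)


# Illustrative parameter values (not taken from the paper's tables).
alpha, beta, q = 2.0, 1.5, 0.6
partial_sum = sum(kgd_pmf(x, alpha, beta, q) for x in range(200))
print(f"g(0) = {kgd_pmf(0, alpha, beta, q):.6f}")
print(f"sum of g(x) for x = 0..199: {partial_sum:.12f}")
```

Because the PMF is a difference of consecutive CDF values, the partial sum telescopes to G(199), which is why it is numerically indistinguishable from 1.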
It is interesting to note that the KGD can be generated from a different random variable T and a different W(F(x)) function. Suppose a random variable Y follows the Kumaraswamy's distribution in (5), then its PDF is
$$f(y)=\alpha\beta\,y^{\alpha-1}(1-y^{\alpha})^{\beta-1},\quad 0<y<1.$$
Suppose we define a new random variable as T = −ln(1−Y). By using the transformation technique, the PDF of T is given by
$$r(t)=\alpha\beta\,e^{-t}(1-e^{-t})^{\alpha-1}\left[1-(1-e^{-t})^{\alpha}\right]^{\beta-1},\quad t>0. \tag{11}$$
The corresponding CDF is given by
$$R(t)=1-\left[1-(1-e^{-t})^{\alpha}\right]^{\beta},\quad t>0. \tag{12}$$
A random variable T with the CDF in (12) will be said to follow the log-Kumaraswamy distribution (LKD). We are unable to find any reference to this distribution in the literature. However, it is a special case of the log-exponentiated Kumaraswamy distribution studied by Lemonte et al. ([2013]). By using the LKD and the T-X distribution by Alzaatreh et al. ([2013b]), we can define the log-Kumaraswamy-geometric distribution (LKGD) by using Equation (3), where T follows the LKD, X follows the geometric distribution and W(F(x)) = −ln(1−F(x)). By using 1−F(x)=q^{x+1} and −ln(1−F(x)) = −ln q^{x+1}, the probability mass function of LKGD can be obtained as
$$g(x)=\left[1-(1-q^{x})^{\alpha}\right]^{\beta}-\left[1-(1-q^{x+1})^{\alpha}\right]^{\beta},\quad x=0,1,2,\dots, \tag{13}$$
which is the same as the KGD in (10) defined by using Kumaraswamy's and geometric distributions. The LKGD, and hence the KGD, is the discrete analogue of the log-Kumaraswamy distribution.
Special cases of KGD
The following are special cases of KGD:

(a) When α=β=1, the KGD in (10) reduces to the geometric distribution in (7) with parameter p.
(b) When α=1, the KGD with parameters α, β and q reduces to the geometric distribution with parameter p_{*}, where p_{*}=1−q^{β}.
(c) When β=1, the KGD reduces to the exponentiated-exponential-geometric distribution (EEGD) discussed in Alzaatreh et al. ([2012b]).
It is easy to verify that $\lim_{x\to\infty}G(x)=1$. The plots of the PMF of the KGD for various values of α, β and q are given in Figure 1.
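Special cases (a) and (b) can be confirmed numerically. The sketch below assumes the PMF g(x) = [1 − (1 − q^x)^α]^β − [1 − (1 − q^{x+1})^α]^β and compares it with the geometric PMF p q^x. The helper names and the values q = 0.7 and β = 2.5 are hypothetical choices made only for illustration.

```python
def kgd_pmf(x, alpha, beta, q):
    """KGD PMF, assuming g(x) = [1-(1-q**x)**alpha]**beta - [1-(1-q**(x+1))**alpha]**beta."""
    upper = (1.0 - (1.0 - q ** x) ** alpha) ** beta
    lower = (1.0 - (1.0 - q ** (x + 1)) ** alpha) ** beta
    return upper - lower


def geometric_pmf(x, p):
    """Geometric PMF p * q**x on x = 0, 1, 2, ... with q = 1 - p."""
    return p * (1.0 - p) ** x


q, beta = 0.7, 2.5                 # illustrative values only
p_star = 1.0 - q ** beta           # parameter of the geometric special case (b)

# Case (a): alpha = beta = 1 should give the geometric PMF with p = 1 - q.
max_gap_a = max(abs(kgd_pmf(x, 1.0, 1.0, q) - geometric_pmf(x, 1.0 - q)) for x in range(30))
# Case (b): alpha = 1 should give the geometric PMF with p* = 1 - q**beta.
max_gap_b = max(abs(kgd_pmf(x, 1.0, beta, q) - geometric_pmf(x, p_star)) for x in range(30))
print(max_gap_a, max_gap_b)
```

Both gaps are at the level of floating-point rounding, in line with g(x) = q^{βx}(1 − q^{β}) when α = 1.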
Some properties of Kumaraswamy-geometric distribution
Suppose X follows the KGD with CDF G(x) in (9). The quantile function X_{*}(=Q(U),0<U<1) of KGD is the inverse of the cumulative distribution. That is,
where U has a uniform distribution with support on (0,1). Equation (14) can be used to simulate the Kumaraswamy-geometric random variable. First, simulate a random variable U and compute the value of X_{*} in (14), which is not necessarily an integer. The Kumaraswamy-geometric random variate X is the largest integer ≤ X_{*}, which can be denoted by [X_{*}].
Transformation: The relationships between the KGD and the Kumaraswamy's, exponential, exponentiated-exponential, Pareto, Weibull, Rayleigh, and logistic distributions are given in the following lemma.
Lemma 1.
Suppose [v] denotes the largest integer less than or equal to the quantity v.
(a) If Y has Kumaraswamy's distribution with parameters α and β, then the distribution of X=[log_{q}(1−Y)] is KGD.
(b) If Y is standard exponential, then X=[log_{q}{1−(1−e^{−Y/β})^{1/α}}] has KGD.
(c) If Y follows an exponentiated-exponential distribution with scale parameter λ and index parameter c, then X=[log_{q}{1−[1−(1−e^{−λY})^{c/β}]^{1/α}}] has KGD.
(d) If the random variable Y has a Pareto distribution with parameters θ, k and CDF $F(y)=1-\left(\frac{\theta}{\theta+y}\right)^{k}$, then $X=\left[\log_{q}\left\{1-\left(1-\left(\frac{\theta}{\theta+Y}\right)^{k/\beta}\right)^{1/\alpha}\right\}\right]$ has KGD.
(e) If the random variable Y has a Weibull distribution with F(y)=1−exp{−(y/γ)^{c}} as CDF, then X=[log_{q}{1−(1−exp[−(Y/γ)^{c}/β])^{1/α}}] has KGD.
(f) If the random variable Y has a Rayleigh distribution with $F(y)=1-\exp\left[-\frac{y^{2}}{2b^{2}}\right]$ as CDF, then $X=\left[\log_{q}\left\{1-\left(1-\exp\left[-\frac{Y^{2}}{2b^{2}\beta}\right]\right)^{1/\alpha}\right\}\right]$ has KGD.
(g) If Y is a logistic random variable with F(y)=[1+exp{−(y−a)/b}]^{−1} as CDF, then X=[log_{q}{1−(1−[1+exp{−(Y−a)/b}]^{−1/β})^{1/α}}] has KGD.
Proof.
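By using the transformation technique, it is easy to show that the random variable X has KGD as given in (10). We will show the result for part (a). Let R be the CDF of the Kumaraswamy's distribution. Since 0<q<1, the function log_{q} is decreasing, so for x=0,1,2,…,
$$P(X=x)=P\left(x\le \log_{q}(1-Y)<x+1\right)=P\left(1-q^{x}\le Y<1-q^{x+1}\right)=R(1-q^{x+1})-R(1-q^{x})=\left[1-(1-q^{x})^{\alpha}\right]^{\beta}-\left[1-(1-q^{x+1})^{\alpha}\right]^{\beta},$$
which is the PMF of the KGD in (10).
In general, if we have a continuous random variable Y and its CDF is F(y), then X=[log_{q}{1−(1−F^{1/β}(Y))^{1/α}}] has KGD.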
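Lemma 1(a) suggests a simple way to draw KGD variates. The Python sketch below first generates a Kumaraswamy variate Y by inverting its CDF R(y) = 1 − (1 − y^α)^β and then returns the largest integer not exceeding log_q(1 − Y). The function name `kgd_variate`, the seed, and the parameter values are assumptions made for illustration; the underflow guard is an implementation detail that is not in the paper.

```python
import math
import random
import sys


def kgd_variate(alpha, beta, q, rng=random):
    """Draw one KGD(alpha, beta, q) variate as floor(log_q(1 - Y)), Y ~ Kumaraswamy(alpha, beta)."""
    v = 1.0 - rng.random()            # uniform on (0, 1]
    w = v ** (1.0 / beta)             # equals 1 - Y**alpha by inverting the Kumaraswamy CDF
    if w >= 1.0:                      # Y = 0, hence log_q(1 - Y) = 0
        return 0
    # 1 - Y = 1 - (1 - w)**(1/alpha), evaluated stably when w is tiny
    one_minus_y = -math.expm1(math.log1p(-w) / alpha)
    one_minus_y = max(one_minus_y, sys.float_info.min)   # guard: avoid log(0) on underflow
    return math.floor(math.log(one_minus_y) / math.log(q))


rng = random.Random(2014)             # fixed seed so the run is reproducible
alpha, beta, q = 2.0, 1.5, 0.6        # illustrative values only
sample = [kgd_variate(alpha, beta, q, rng) for _ in range(20000)]
share_zero = sample.count(0) / len(sample)
g0 = 1.0 - (1.0 - (1.0 - q) ** alpha) ** beta    # P(X = 0) under the KGD
print(f"empirical P(X=0) = {share_zero:.4f}, theoretical = {g0:.4f}")
```

Working with 1 − Y through `expm1` and `log1p` avoids the loss of precision that occurs when Y is very close to 1, which matters for small β.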
Limiting behavior: As x→∞, $\lim_{x\to\infty}g(x)=0$. Also, as x→0, $\lim_{x\to 0}g(x)=1-[1-(1-q)^{\alpha}]^{\beta}$. This limit becomes 0 if q→1 and/or α→∞. Thus, the distribution starts with probability zero or a constant probability, as is evident from Figure 1.
Mode of the KGD: Since the KGD is also LKGD, a T-geometric distribution, we use Lemma 2 in Alzaatreh et al. ([2012b]), which states that a T-geometric distribution has a reversed J-shape if the distribution of the random variable T has a reversed J-shape. We only need to show when the log-Kumaraswamy distribution has a reversed J-shape.
On taking the first derivative of (11) with respect to t, we obtain
$$r'(t)=V(t)\,Q(t),\qquad Q(t)=\left(\alpha e^{-t}-1\right)\left[1-(1-e^{-t})^{\alpha}\right]-\alpha(\beta-1)\,e^{-t}(1-e^{-t})^{\alpha}, \tag{15}$$
where V(t)=αβ e^{−t}(1−e^{−t})^{α−2}[1−(1−e^{−t})^{α}]^{β−2} is positive. For β≥1 and α≤1, it is straightforward to show that Q(t)≤0, since both terms of Q(t) are then non-positive. For β<1 and α≤1, the function Q(t) is an increasing function of t. It is not difficult to show that $\lim_{t\to 0}Q(t)=\alpha-1\le 0$ and $\lim_{t\to\infty}Q(t)=0$. Thus, for α≤1 and any value of β and q, Q(t)≤0 and so the PDF of the log-Kumaraswamy distribution is monotonically decreasing or has a reversed J-shape. Hence, the KGD has a reversed J-shape and a unique mode at x=0 when α≤1.
When α>1, it is not easy to show that the KGD is unimodal. However, through numerical analysis of the behavior of the PMF, and its plots in Figure 1 for various values of β and q, we observe that for values of α>1, the KGD is concave down or has a reversed J-shape with a unique mode.
Moments
Using Equation (10), the r^{th} raw moment is given by
The two inner summations terminate at α, if α is a positive integer. When β=1 in the above, we have,
In particular, let r=1 and α=1, the expression for the first moment, or the mean of the KGD may be written as,
which is the mean of the geometric distribution, a special case of KGD.
We discuss in what follows, an alternative approach of expressing the PMF of the KGD in Equation (10).
Using Equation (16), it is now easy to write the expressions for the moment, moment generating function, and probability generating function for the KGD respectively as follows:
Equation (17) is equivalent to the series $\sum _{x=0}^{\infty}{x}^{r}g\left(x\right)$, where g(x) is given by (10). Observe that the series is absolutely convergent by using the ratio test and hence the series in (17) is absolutely convergent. Thus, interchanging the order of summation has no effect. Using Equation (17), the r^{th} moment may be written as
where β^{(i)}=β(β−1)(β−2)…(β−i+1), and similarly for (α i)^{(j)}. Also,
is the polylogarithm function, (http://mathworld.wolfram.com/Polylogarithm.html).
Expressions for the first few moments are thus:
The expression for the variance may be written as
Expressions for the skewness and kurtosis for the KGD may be obtained by combining appropriate expressions in Equations (20), (21), (22), and (23). In the particular case for which α=1=β, the expressions for the central moments of the geometric distribution are as follows:
The results for this special case may be found in standard textbooks on probability. See for example, Zwillinger and Kokoska ([2000]).
Both the moment generating function (M(t)) and the probability generating function (φ(t)) can be simplified further. In the case of φ(t), we have
After further simplification, the above reduces to,
By letting
the first two factorial moments may be expressed as
In general,
which reduces to the result in Alzaatreh et al. ([2012b]) when β=1.
Through numerical computation, we obtain the mode, the mean, the standard deviation (SD), the skewness and the kurtosis of the KGD. The values of α and β for the numerical computation are from 0.2 to 10 at an increment of 0.1, while the values of q are from 0.2 to 0.9 at an increment of 0.1. For brevity, we report the mode, the mean and the standard deviation in Table 1 and the skewness and kurtosis in Table 2 for some values of q, β and α. From the numerical computation, the mean, mode and standard deviation are increasing functions of q. From Table 1, the mean, mode and standard deviation are decreasing functions of β but increasing functions of α. For α≤ 1, the skewness and kurtosis are decreasing functions of q but increasing functions of β. For α>1, the skewness and kurtosis first decrease and then increase as both q and β increase. The skewness and kurtosis are decreasing functions of α. Some of these observations can be seen in Table 2 while others are from the numerical computation. Instead of Tables 1 and 2, contour plots may be used to present the results in the tables. However, it becomes difficult to see the patterns described above.
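A numerical computation of this kind can be sketched in Python by truncated summation. The code below assumes the survival function P(X > x) = [1 − (1 − q^{x+1})^α]^β implied by the KGD CDF, and stops once the remaining tail mass falls below a tolerance. The names `kgd_sf` and `kgd_summary`, the tolerance, and the parameter values are illustrative assumptions, and the code is not claimed to be the procedure behind Tables 1 and 2.

```python
import math


def kgd_sf(x, alpha, beta, q):
    """Survival function P(X > x) = [1 - (1 - q**(x+1))**alpha]**beta, with P(X > -1) = 1."""
    if x < 0:
        return 1.0
    return (1.0 - (1.0 - q ** (x + 1)) ** alpha) ** beta


def kgd_summary(alpha, beta, q, tol=1e-14, max_terms=100000):
    """Mode, mean and standard deviation of the KGD by truncated summation."""
    mean = second = 0.0
    mode, best = 0, -1.0
    for x in range(max_terms):
        g = kgd_sf(x - 1, alpha, beta, q) - kgd_sf(x, alpha, beta, q)   # PMF at x
        mean += x * g
        second += x * x * g
        if g > best:
            mode, best = x, g
        if kgd_sf(x, alpha, beta, q) < tol:      # remaining tail mass is negligible
            break
    variance = max(second - mean ** 2, 0.0)
    return mode, mean, math.sqrt(variance)


# Geometric special case alpha = beta = 1: mean q/p and variance q/p**2.
print(kgd_summary(1.0, 1.0, 0.6))
# An illustrative case with alpha > 1, where the mode moves away from zero.
print(kgd_summary(3.0, 1.0, 0.8))
```

For α = β = 1 and q = 0.6 the output should agree with the geometric values 1.5 for the mean and √3.75 for the standard deviation, which gives a quick sanity check on the truncation.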
Hazard rate and Shannon entropy
The hazard rate function is defined as
$$h(x)=\frac{g(x)}{1-G(x)},$$
where $G(x)=\sum_{y=0}^{x}g(y)$. For the KGD, we have, after substituting expressions for the PMF and CDF (Equations (10) and (9)),
$$h(x)=\left[\frac{1-(1-q^{x})^{\alpha}}{1-(1-q^{x+1})^{\alpha}}\right]^{\beta}-1.$$
The asymptotic behaviors of the hazard function are such that
$$\lim_{x\to 0}h(x)=\left[1-(1-q)^{\alpha}\right]^{-\beta}-1=L_{1},$$
and in particular, $\lim_{x\to 0}h(x;\alpha=1=\beta)=p/q=1/\text{E}(X)$. Also, $\lim_{x\to\infty}h(x)=q^{-\beta}-1=L_{2}$, after using L'Hôpital's rule. This result generalizes the limiting behavior of the hazard rate function for the EEGD discussed in Alzaatreh et al. ([2012b]). Observe that L_{1}>L_{2} when α<1. For α<1, we check the behavior of h(x). The function h(x) is monotonically decreasing when α<1 if h(x)≥h(x+1) for all x. When α=1, observe that h(x) is a constant. For values of α<1, we numerically evaluate d(x)=h(x)−h(x+1) for α and q from 0.1 to 0.9 at an increment of 0.1. All the values are positive, which indicates that the function h(x) is monotonically decreasing. Similarly, we analytically evaluate d(x) for small values of α>1 and the difference d(x) is always negative. Numerically, we use the values of q from 0.1 to 0.9 at an increment of 0.1 with values of α from 1.5 to 10.0 at an increment of 0.5. All the d(x) values are negative, which indicates that the function h(x) is monotonically increasing. Thus, we have a decreasing hazard rate when α<1 and an increasing hazard rate when α>1. For α=1, L_{1}=L_{2} and we have a constant hazard rate. The graphs of the hazard rate function defined in Equation (24) are shown in Figure 2 for various values of the parameters. We see in Figure 2 that the hazard rate decreases for values of α<1 and increases for α>1.
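The monotonicity check described above can be reproduced with a short Python sketch. It assumes the hazard rate h(x) = g(x)/(1 − G(x)), a form chosen because it reproduces the stated limits p/q at x = 0 for α = β = 1 and q^{−β} − 1 as x → ∞. The function names and the parameter values are hypothetical.

```python
import math


def kgd_sf(x, alpha, beta, q):
    """P(X > x) = [1 - (1 - q**(x+1))**alpha]**beta, evaluated stably; P(X > -1) = 1."""
    if x < 0:
        return 1.0
    t = q ** (x + 1)
    return (-math.expm1(alpha * math.log1p(-t))) ** beta


def kgd_hazard(x, alpha, beta, q):
    """Assumed hazard h(x) = g(x) / (1 - G(x)) = P(X > x - 1) / P(X > x) - 1."""
    tail = kgd_sf(x, alpha, beta, q)
    if tail == 0.0:                   # underflow for very large x: use the limiting value
        return q ** (-beta) - 1.0
    return kgd_sf(x - 1, alpha, beta, q) / tail - 1.0


beta, q = 1.5, 0.5                    # illustrative values only
for alpha in (0.5, 1.0, 3.0):
    h = [kgd_hazard(x, alpha, beta, q) for x in range(16)]
    d = [h[x] - h[x + 1] for x in range(15)]     # d(x) = h(x) - h(x + 1)
    print(alpha, round(h[0], 4), round(h[-1], 4), min(d), max(d))
```

With these values the differences d(x) are all positive for α = 0.5, essentially zero for α = 1, and all negative for α = 3, in agreement with the pattern reported in the text.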
The entropy of a random variable is a measure of variation of uncertainty. For a discrete random variable X with probability mass function g(x), the Shannon entropy is defined as
$$\mathcal{S}(x)=-\sum_{x}g(x)\log_{2}g(x). \tag{25}$$
In a probabilistic context, $\mathcal{S}(x)$ is a measure of the information carried by g(x), with higher entropy corresponding to less information. Substituting Equation (10) in Equation (25), we have
Suppose we write the PMF as
Let α=1 for simplicity. We may now write the entropy as
After some algebra, Equation (26) becomes,
On setting β=1 in Equation (27), we have, for the geometric distribution,
Note that when β=1 and p=q=1/2, $\mathcal{S}(x)=2$. It is not difficult to show that $\mathcal{S}(x)$ is an increasing function of q for any given β. This is consistent with the pattern of the standard deviation. We also note that $\lim_{\beta\to\infty}\mathcal{S}(x)=0$, with the proviso that 0 log 0 = 0. This indicates that smaller values of β increase the uncertainty in the distribution, while higher values of β increase the amount of information measured in terms of the probability. Actually, a zero entropy indicates that all information needed is measured solely in terms of the probability. In a way, the KGD has smaller entropy (more probabilistic information) than the geometric distribution for values of β>1.
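The entropy values quoted above can be checked numerically. The Python sketch below assumes base-2 logarithms, which is the base that yields the value 2 for the geometric distribution with p = q = 1/2, and sums −g(x) log₂ g(x) over a finite range. The function names and the truncation point are illustrative assumptions.

```python
import math


def kgd_pmf(x, alpha, beta, q):
    """KGD PMF, assuming g(x) = [1-(1-q**x)**alpha]**beta - [1-(1-q**(x+1))**alpha]**beta."""
    upper = (1.0 - (1.0 - q ** x) ** alpha) ** beta
    lower = (1.0 - (1.0 - q ** (x + 1)) ** alpha) ** beta
    return upper - lower


def kgd_entropy_bits(alpha, beta, q, max_terms=2000):
    """Shannon entropy in bits by truncated summation (adequate when q**max_terms is negligible)."""
    total = 0.0
    for x in range(max_terms):
        g = kgd_pmf(x, alpha, beta, q)
        if g > 0.0:                   # convention 0 * log 0 = 0; also skips rounding noise
            total -= g * math.log2(g)
    return total


print(kgd_entropy_bits(1.0, 1.0, 0.5))          # geometric case with p = q = 1/2
for beta in (0.5, 1.0, 2.0, 4.0):               # alpha = 1: entropy shrinks as beta grows
    print(beta, round(kgd_entropy_bits(1.0, beta, 0.6), 4))
```

For α = 1 the KGD is geometric with parameter p_{*} = 1 − q^{β}, so the output can also be compared with the standard geometric entropy [−q_{*} log₂ q_{*} − p_{*} log₂ p_{*}]/p_{*}, where q_{*} = q^{β}.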
Maximum likelihood estimation
We discuss the maximum likelihood estimation of the parameters of the KGD in subsection 6.1. Subsection 6.2 contains the results of a simulation that is conducted to evaluate the performance of the maximum likelihood estimation method.
Estimation
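Let a random sample of size n be taken from KGD, with observed frequencies n_{x}, x=0,1,2,…,k, where $\sum_{x=0}^{k}n_{x}=n$. From Equation (10), the likelihood function for a random sample of size n may be expressed as
$$L(\alpha,\beta,q)=\prod_{x=0}^{k}\left\{\left[1-(1-q^{x})^{\alpha}\right]^{\beta}-\left[1-(1-q^{x+1})^{\alpha}\right]^{\beta}\right\}^{n_{x}}.$$
The log-likelihood function is
$$l=\ln L(\alpha,\beta,q)=\sum_{x=0}^{k}n_{x}\ln\left\{\left[1-(1-q^{x})^{\alpha}\right]^{\beta}-\left[1-(1-q^{x+1})^{\alpha}\right]^{\beta}\right\}.$$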
Differentiating the loglikelihood function with respect to the parameters, we obtain
where,
Setting the nonlinear Equations (30), (31) and (32) to zero and solving them iteratively, we get the estimates $\widehat{\mathbf{\theta}}={(\widehat{\alpha},\widehat{\beta},\widehat{q})}^{T}$ for the parameter vector θ=(α,β,q)^{T}. The initial values of parameters α and β can be set to 1 and that of parameter q can be set to 0.5.
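The estimation step can be illustrated with a minimal Python sketch. Instead of solving the score equations, it maximizes the log-likelihood Σ n_x ln g(x) directly with a crude compass search, starting from α = β = 1 and q = 0.5 as suggested above. The search method, the function names, and the frequency table are all assumptions made for illustration. The frequencies are made-up numbers, not data from the paper.

```python
import math


def kgd_pmf(x, alpha, beta, q):
    """KGD PMF, assuming g(x) = [1-(1-q**x)**alpha]**beta - [1-(1-q**(x+1))**alpha]**beta."""
    upper = (1.0 - (1.0 - q ** x) ** alpha) ** beta
    lower = (1.0 - (1.0 - q ** (x + 1)) ** alpha) ** beta
    return upper - lower


def kgd_loglik(freqs, alpha, beta, q):
    """Log-likelihood sum_x n_x * ln g(x) for a dict {x: n_x} of observed frequencies."""
    total = 0.0
    for x, n_x in freqs.items():
        g = kgd_pmf(x, alpha, beta, q)
        if g <= 0.0:
            return -math.inf
        total += n_x * math.log(g)
    return total


def _unpack(z):
    """Map unconstrained coordinates to (alpha, beta, q) with alpha, beta > 0 and 0 < q < 1."""
    z = [max(-30.0, min(30.0, v)) for v in z]
    return math.exp(z[0]), math.exp(z[1]), 1.0 / (1.0 + math.exp(-z[2]))


def fit_kgd(freqs, start=(1.0, 1.0, 0.5), step=0.5, tol=1e-7, max_iter=5000):
    """Compass search on (log alpha, log beta, logit q); a stand-in for an iterative solver."""
    a0, b0, q0 = start
    z = [math.log(a0), math.log(b0), math.log(q0 / (1.0 - q0))]
    best = kgd_loglik(freqs, *_unpack(z))
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(3):
            for delta in (step, -step):
                cand = list(z)
                cand[i] += delta
                value = kgd_loglik(freqs, *_unpack(cand))
                if value > best:
                    z, best, improved = cand, value, True
        if not improved:
            step /= 2.0               # shrink the step when no neighbor improves
    return _unpack(z), best


# Hypothetical frequency table {x: n_x}, for illustration only.
freqs = {0: 50, 1: 25, 2: 13, 3: 6, 4: 3, 5: 3}
start_loglik = kgd_loglik(freqs, 1.0, 1.0, 0.5)
(alpha_hat, beta_hat, q_hat), fitted_loglik = fit_kgd(freqs)
print(f"start log-likelihood  = {start_loglik:.4f}")
print(f"fitted log-likelihood = {fitted_loglik:.4f}")
print(f"estimates: alpha={alpha_hat:.4f}, beta={beta_hat:.4f}, q={q_hat:.4f}")
```

A derivative-free search keeps the sketch short and dependency-free. A Newton-type iteration on the score equations would normally converge faster and also yields the second derivatives needed for the information matrix.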
For interval estimation and hypothesis tests on the parameters, we require the information matrix $\mathcal{I}(\mathbf{\theta})$, with elements ∂^{2}l/(∂i ∂j)=l_{ij}, where i,j∈{α,β,q}. Under conditions that are fulfilled for parameters in the interior of the parameter space but not on the boundary, the asymptotic distribution of $\sqrt{n}(\widehat{\mathbf{\theta}}-\mathbf{\theta})$ is $N_{3}(0,\mathcal{I}^{-1}(\mathbf{\theta}))$. The asymptotic multivariate normal distribution $N_{3}(0,\mathcal{I}^{-1}(\widehat{\mathbf{\theta}}))$ of $\widehat{\mathbf{\theta}}$ can be used to construct approximate confidence intervals for the parameters. For example, the 100(1−ξ)% asymptotic confidence interval for the i^{th} parameter θ_{i} is given by
$$\left(\widehat{\theta}_{i}-z_{\xi/2}\sqrt{\mathcal{I}_{i,i}},\;\widehat{\theta}_{i}+z_{\xi/2}\sqrt{\mathcal{I}_{i,i}}\right),$$
where $\mathcal{I}_{i,i}$ is the i^{th} diagonal element of $\mathcal{I}^{-1}(\mathbf{\theta})$ for i=1,2,3, and z_{ξ/2} is the upper ξ/2 point of the standard normal distribution. See, for example, Mahmoudi ([2011]).
The information matrix $\mathcal{I}\left(\mathbf{\theta}\right)$ is of the form,
The expressions for the elements l_{ ij } are given in the Appendix.
Simulation
A simulation study is conducted to evaluate the performance of the maximum likelihood estimation method. Equation (14) is used to generate a random sample from the KGD with parameters α, β and q. The different sample sizes considered in the simulation are n=250, 500 and 750. The parameter combinations for the simulation study are shown in Table 3. The combinations were chosen to reflect the following cases of the distribution: underdispersion (α=6, α=4, q=0.8), overdispersion (all other cases), monotonically decreasing (α=1.6, β=2.0, q=0.6), and unimodal with mode greater than 0 (all other cases). For each parameter combination and each sample size, the simulation process is repeated 100 times. The average bias (actual − estimate) and standard deviation of the parameter estimates are reported in Table 3. The biases are relatively small when compared to the standard deviations. In most cases, as the sample size increases, the standard deviations of the estimators decrease.
Applications of KGD
We apply the KGD to two data sets. The first data set is the observed frequencies of the distribution of purchases of a brand X breakfast cereal by consumers over a period of time (Consul [1989]). The other data set is the number of absences among shift-workers in a steel industry (Gupta and Ong [2004]). Comparisons are made with the generalized negative binomial distribution (GNBD) defined by Jain and Consul ([1971]) and the exponentiated-exponential geometric distribution (EEGD) defined by Alzaatreh et al. ([2012b]).
Purchases by consumers
Consul ([1989]), p. 128 stated that, "The number of units of different commodities purchased by consumers over a period of time", appears to follow the generalized Poisson distribution (GPD). In support of this assertion, the author analyzed some relevant data sets and observed that the GPD model provided adequate fits. One of the data sets considered by Consul ([1989]) consists of observed frequencies of the distribution of purchases of a brand X breakfast cereals. The original data in Table 4 was taken from Chatfield ([1975]).
The data contains the frequency of consumers who bought r units of brand X over a number of weeks. The data is fitted to the KGD, EEGD, and the GNBD. We are not sure how the frequencies for the classes (10-11), (12-15) and (16-35) were handled in previous applications of the data set. In our analysis, the probabilities for the classes (10-11) and (12-15) were obtained by adding the corresponding individual probabilities in each class. When finding the maximum likelihood estimates, the probability for the last class was obtained by subtracting the sum of all previous probabilities from 1. The results in Table 4 show that all three distributions provide adequate fit to the data. Since the parameter β in KGD is not significantly different from 1, it may be more appropriate to apply the two-parameter EEGD to fit the data. Also, the likelihood ratio test to compare the EEGD with the KGD is not significant at the 5% level.
Number of absences by shift-workers
The KGD is also applied to a data set from Gupta and Ong ([2004]), which represents the observed frequencies of the number of absences among shift-workers in a steel industry. The data in Table 5 was originally studied by Arbous and Sichel ([1954]) in an attempt to create a model that can describe the distribution of absences to a group of people in single- and double-exposure periods. The original data contains the number of absences, x-value, of 248 shift-workers in the years 1947 and 1948. Arbous and Sichel ([1954]) used the negative binomial distribution (NBD) to fit the data. Gupta and Ong ([2004]) proposed a four-parameter generalized negative binomial distribution to fit the data and compared it to the NBD and the GPD. The chi-square value for their distribution was 8.27 with 15 degrees of freedom. The chi-square value obtained by Gupta and Ong for the GPD was 27.79 with 17 degrees of freedom (DF). The DF = k−s−1, where k is the number of classes and s is the number of estimated parameters.
We reanalyzed the data for the EEGD, the GNBD, and the GPD. We obtained a chi-square of 9.62 for the GPD, which is much smaller than the 27.79 provided by Gupta and Ong. Thus, our estimates from the GPD (not reported in Table 5) differ significantly from the results in Gupta and Ong ([2004]). When finding the maximum likelihood estimates, the probability for the last class was obtained by subtracting the sum of all previous probabilities from 1. In view of this, the results obtained from the EEGD are slightly different from those of Alzaatreh et al. ([2012b]), who applied the EEGD to fit the data. We apply the KGD to model the data in Table 5, and the results from the table indicate that the KGD, EEGD and GNBD provide good fit to the data.
If m→∞, the GNBD with parameters θ, m and β goes to the GPD with parameters α and λ, where α=mθ and λ=θβ (see page 218 of Consul and Famoye ([2006])). We fitted the GPD to the data and obtained the same log-likelihood with $\widehat{\alpha}=3.2250$ and $\widehat{\lambda}=0.6597$. From the GNBD, we obtained $\widehat{m}\widehat{\theta}=1242.96\times 0.002591=3.22\approx 3.2250=\widehat{\alpha}$ and $\widehat{\beta}\widehat{\theta}=254.77\times 0.002591=0.66\approx 0.6597=\widehat{\lambda}$.
We observe that the parameter β in the KGD is significantly different from 1. This makes the KGD a more appropriate distribution than the EEGD. The likelihood ratio statistic for testing the EEGD against the KGD is $\chi_{1}^{2}=-2(754.62-756.88)=4.52$ with a p-value of 0.0335. Thus, we reject the null hypothesis that the data follows the EEGD at the 5% level. The likelihood ratio test supports the claim that the parameter β is significantly different from 1, and hence the KGD appears to be superior to the EEGD.
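The arithmetic of this likelihood ratio test can be reproduced in a few lines of Python. The sketch assumes that the two quoted values are the magnitudes of the maximized log-likelihoods, −754.62 for the KGD and −756.88 for the EEGD, and uses the identity P(χ²₁ > s) = erfc(√(s/2)) for one degree of freedom. The variable names are illustrative.

```python
import math

# Assumed maximized log-likelihoods, inferred from the magnitudes quoted in the text.
loglik_kgd = -754.62      # full model (alpha, beta, q)
loglik_eegd = -756.88     # restricted model with beta = 1

lr_stat = 2.0 * (loglik_kgd - loglik_eegd)

# For one degree of freedom, the chi-square upper tail equals erfc(sqrt(s / 2)).
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

print(f"likelihood ratio statistic = {lr_stat:.2f}")
print(f"p-value = {p_value:.4f}")
```

The closed-form tail expression holds only for one degree of freedom, which matches this test because the EEGD fixes the single parameter β at 1.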
Conclusion
Discrete distributions are often derived by using the Lagrange expansions framework (see, for example, Consul and Famoye [2006]) or using difference equations (see, for example, Johnson et al. [2005]). Recently, Alzaatreh et al. ([2012b], [2013b]) developed a general method for generating distributions, and these distributions are members of the T-X family. The method can be applied to derive both discrete and continuous distributions. This article used the T-X family framework to define a new discrete distribution named the Kumaraswamy-geometric distribution (KGD).
Some special cases and properties of the KGD are discussed, which include moments, hazard rate and entropy. The method of maximum likelihood estimation is used in estimating the parameters of the KGD. The distribution is applied to model two real-life data sets: one consisting of the observed frequencies of the distribution of purchases of a brand X breakfast cereal, and the other, the observed frequencies of the number of absences among shift-workers in a steel industry. Two other distributions, the EEGD and the GNBD, are compared with the KGD. It is found that the KGD performed as well as the EEGD in modeling the observed numbers of consumers. The results also show that the KGD outperformed the EEGD in modeling the number of absences among shift-workers. It is expected that the additional parameter offered by the Kumaraswamy's distribution will enable the use of the KGD in modeling events where the EEGD or the geometric distribution may not provide adequate fits.
Appendix
Elements of the information matrix
where,
The values of A_{ x }, B_{ x }, C_{ x }, D_{ x }, E_{ x }, and F_{ x } are given in Section 6.
Authors' contributions
The authors, viz AA, FF and CL with the consultation of each other carried out this work and drafted the manuscript together. All authors read and approved the final manuscript.
References
 1.
Akinsete A, Famoye F, Lee C: The beta-Pareto distributions. Statistics 2008, 42(6):547–563. 10.1080/02331880801983876
 2.
Alexander C, Cordeiro GM, Ortega EMM, Sarabia JM: Generalized beta-generated distributions. Comput. Stat. Data Anal 2012, 56: 1880–1897. 10.1016/j.csda.2011.11.015
 3.
Alshawarbeh E, Famoye F, Lee C: Beta-Cauchy distribution: some properties and its applications. J. Stat. Theory Appl 2013, 12: 378–391. 10.2991/jsta.2013.12.4.5
 4.
Alzaatreh A, Famoye F, Lee C: Gamma-Pareto distribution and its applications. J. Mod. Appl. Stat. Meth 2012a, 11(1):78–94.
 5.
Alzaatreh A, Famoye F, Lee C: Weibull-Pareto distribution and its applications. Comm. Stat. Theor. Meth 2013a, 42(7):1673–1691. 10.1080/03610926.2011.599002
 6.
Alzaatreh A, Lee C, Famoye F: On the discrete analogues of continuous distributions. Stat. Meth 2012b, 9: 589–603. 10.1016/j.stamet.2012.03.003
 7.
Alzaatreh A, Lee C, Famoye F: A new method for generating families of continuous distributions. Metron 2013b, 71: 63–79. 10.1007/s40300-013-0007-y
 8.
Arbous AG, Sichel HS: New techniques for the analysis of absenteeism data. Biometrika 1954, 41: 77–90. 10.1093/biomet/41.1-2.77
 9.
Chatfield C: A marketing application of characterization theorem. In: Patil GP, Kotz S, Ord JK (eds.) Statistical Distributions in Scientific Work, volume 2, pp. 175-185. D. Reidel Publishing Company, Boston (1975).
 10.
Consul PC: Generalized Poisson Distributions: Properties and Applications. Marcel Dekker, Inc., New York; 1989.
 11.
Consul PC, Famoye F: Lagrangian Probability Distributions. Birkhäuser, Boston; 2006.
 12.
Cordeiro GM, de Castro M: A new family of generalized distributions. J. Stat. Comput. Simulat 2011, 81(7):883–898. 10.1080/00949650903530745
 13.
Cordeiro GM, Lemonte AJ: The beta Laplace distribution. Stat. Probability Lett 2011, 81: 973–982. 10.1016/j.spl.2011.01.017
 14.
Cordeiro GM, Nadarajah S, Ortega EMM: The Kumaraswamy Gumbel distribution. Stat. Methods Appl 2012, 21: 139–168. 10.1007/s10260-011-0183-y
 15.
de Pascoa MAR, Ortega EMM, Cordeiro GM: The Kumaraswamy generalized gamma distribution with application in survival analysis. Stat. Meth 2011, 8: 411–433. 10.1016/j.stamet.2011.04.001
 16.
de Santana TV, Ortega EMM, Cordeiro GM, Silva GO: The Kumaraswamy-log-logistic distribution. Stat. Theory Appl 2012, 3: 265–291.
 17.
Eugene N, Lee C, Famoye F: The beta-normal distribution and its applications. Comm. Stat. Theor. Meth 2002, 31(4):497–512. 10.1081/STA-120003130
 18.
Famoye F, Lee C, Olumolade O: The beta-Weibull distribution. J. Stat. Theory Appl 2005, 4(2):121–136.
 19.
Gupta RC, Ong SD: A new generalization of the negative binomial distribution. Comput. Stat. Data Anal 2004, 45: 287–300. 10.1016/S0167-9473(02)00301-8
 20.
Jain GC, Consul PC: A generalized negative binomial distribution. SIAM J. Appl. Math 1971, 21: 501–513. 10.1137/0121056
 21.
Johnson NL, Kemp AW, Kotz S: Univariate Discrete Distributions. John Wiley & Sons, New York; 2005.
 22.
Jones MC: Kumaraswamy's distribution: a beta-type distribution with some tractability advantages. Stat. Methodologies 2009, 6: 70–81. 10.1016/j.stamet.2008.04.001
 23.
Kong L, Lee C, Sepanski JH: On the properties of beta-gamma distribution. J. Mod. Appl. Stat. Meth 2007, 6(1):187–211.
 24.
Kumaraswamy P: A generalized probability density function for double-bounded random processes. Hydrology 1980, 46: 79–88. 10.1016/0022-1694(80)90036-0
 25.
Lemonte AJ, Barreto-Souza W, Cordeiro GM: The exponentiated Kumaraswamy distribution and its log-transform. Braz. J. Probability Stat 2013, 27: 31–53. 10.1214/11-BJPS149
 26.
Mahmoudi E: The beta generalized Pareto distribution with application to lifetime data. Math. Comput. Simulations 2011, 81: 2414–2430. 10.1016/j.matcom.2011.03.006
 27.
Nadarajah S, Kotz S: The beta Gumbel distribution. Math. Probl. Eng 2004, 2004(4):323–332. 10.1155/S1024123X04403068
 28.
Nadarajah S, Kotz S: The beta exponential distribution. Reliability Eng. Syst. Saf 2006, 91: 689–697. 10.1016/j.ress.2005.05.008
 29.
Singla N, Jain K, Sharma SK: The beta generalized Weibull distribution: properties and applications. Reliability Eng. Syst. Saf 2012, 102: 5–15. 10.1016/j.ress.2012.02.003
 30.
Zwillinger D, Kokoska S: Standard Probability and Statistics Tables and Formulae. Chapman and Hall/CRC, Boca Raton; 2000.
Acknowledgment
The authors are grateful to the Associate Editor and the anonymous referees for many constructive comments and suggestions that have greatly improved the paper.
Additional information
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Transformation
 Moments
 Entropy
 Estimation