Mittag-Leffler function distribution - A new generalization of the hyper-Poisson distribution

In this paper a new generalization of the hyper-Poisson distribution is proposed using the Mittag-Leffler function. The hyper-Poisson, displaced Poisson, Poisson and geometric distributions, among others, are seen as particular cases. This Mittag-Leffler function distribution (MLFD) belongs to the generalized hypergeometric and generalized power series families and also arises as a weighted Poisson distribution. MLFD is a flexible distribution with varying shapes: it can be non-increasing with a unique mode at zero, or unimodal with one or two non-zero modes. It can be under-, equi- or over-dispersed. Various distributional properties are discussed, including a recurrence relation for the pmf, the cumulative distribution function, generating functions, formulae for different types of moments and their recurrence relations, the index of dispersion and its classification, log-concavity, reliability properties such as survival, increasing failure rate and unimodality, and stochastic ordering with respect to the hyper-Poisson distribution. The distribution is found to fare well when compared with the hyper-Poisson distribution in the empirical modeling of differently dispersed count data. It is therefore expected that the proposed MLFD, with its interesting features and flexibility, will be a useful addition as a model for count data.


Introduction
The Poisson distribution is a popular model for count data. However, its use is restricted by the equality of its mean and variance (equi-dispersion). Many models able to represent under-, equi- and over-dispersion have been proposed in the literature to overcome this restriction. Notable among these are the hyper-Poisson (HP) of Bardwell and Crow [5], the generalized Poisson of Consul [8], the double-Poisson of Efron et al. [12], the Poisson polynomial of Cameron and Johansson [6], the weighted Poisson of Castillo and Casany [7] and the COM-Poisson of Conway and Maxwell [9] (see also Shmueli et al. [59]).
Of these models, the HP distribution, which can handle both over- and under-dispersion, was first proposed by Bardwell and Crow [5] and Crow and Bardwell [10]. The probability mass function (pmf) of the HP distribution is given by
$$P(X=k)=\frac{\Gamma(\beta)}{{}_1F_1(1;\beta;\lambda)}\,\frac{\lambda^{k}}{\Gamma(\beta+k)},\qquad k=0,1,2,\ldots \qquad (1)$$
where ${}_1F_1(1;\beta;\lambda)$ is the confluent hypergeometric function. Staff [62] studied the displaced Poisson distribution, which is the HP distribution with the parameter $\beta$ restricted to be a positive integer. The case when $\beta$ is negative was investigated later by Staff [63]. The HP distribution has drawn the attention of many researchers of late. Kemp [29] dealt with a q-analogue of the distribution and Ahmad [1] proposed a Conway-Maxwell-HP distribution. Roohi and Ahmad [53,54] investigated moments of the HP distribution. Kumar and Nair [34-37] studied various extensions and alternatives of the HP distribution. Castillo et al. [7] studied an HP regression model for over-dispersed and under-dispersed count data. Best [3] and Antic et al. [2] considered the HP distribution in word length and text length research. Khazraee et al. [31] investigated the application of the HP generalized linear model for analyzing motor vehicle crashes.
In this paper we propose a new generalization of the HP distribution by replacing $\Gamma(\beta+k)$ in (1) with $\Gamma(\alpha k+\beta)$, so that the normalization constant becomes $E_{\alpha,\beta}(\lambda)$, the generalized Mittag-Leffler function defined by
$$E_{\alpha,\beta}(z)=\sum_{k=0}^{\infty}\frac{z^{k}}{\Gamma(\alpha k+\beta)}. \qquad (2)$$
Consequently the proposed distribution is called the Mittag-Leffler function distribution (MLFD). The extra parameter $\alpha$ adds flexibility to the model while retaining computational tractability, since computing $E_{\alpha,\beta}(\lambda)$ does not pose a problem: many software packages (for example MATLAB) offer routines for its quick computation. The MLFD is shown to be log-concave, which confers a number of attractive properties for modeling and inference; see Walther [66] for a good review of statistical modeling and inference with log-concave distributions. The proposed MLFD should not be confused with the class of discrete Mittag-Leffler distributions proposed by Pillai and Jayakumar [51]. It is pertinent to give a brief review of some developments in statistical models involving the Mittag-Leffler function.
The Mittag-Leffler function (obtained with $\beta=1$ in (2) and denoted by $E_{\alpha}(z)$) was first introduced by the Swedish mathematician Gösta Mittag-Leffler [43,44] and arises as the solution of a fractional differential equation. This function and its many extended versions have been studied by many mathematicians over the years. Haubold, Mathai and Saxena [21] have given a good survey on the Mittag-Leffler function. Recently, this function has also been explored for applications in statistics. Pillai [50] showed that $F_{\alpha}(x)=1-E_{\alpha}(-x^{\alpha})$, $x>0$, $0<\alpha\le 1$, are valid cumulative distribution functions (cdf) and named the corresponding law the Mittag-Leffler distribution, with pdf and cdf respectively given by
$$f_{\alpha}(x)=x^{\alpha-1}E_{\alpha,\alpha}(-x^{\alpha}),\qquad F_{\alpha}(x)=1-E_{\alpha}(-x^{\alpha}).$$
Since for $\alpha=1$ this distribution reduces to the exponential distribution with mean 1, it can be treated as a generalization of the exponential distribution. He studied different properties of this distribution. Jayakumar and Pillai [23], Jose and Pillai [26], Lin [39], Jayakumar [24] and Jose et al. [27] studied different aspects of this distribution.
Pillai and Jayakumar [51] proposed a class of discrete Mittag-Leffler (DML) distributions having probability generating function (pgf) $P(s)=1/\{1+c(1-s)^{\alpha}\}$, $0<\alpha\le 1$, $c>0$. The DML distribution arises as a Poisson mixture in which the Poisson parameter is a constant multiple of a random variable following the Mittag-Leffler distribution in (2). They studied different properties of the DML distribution and gave a probabilistic derivation and an application in a first-order autoregressive discrete process. The DML distribution is also a particular case of the discrete Linnik distribution [11].
Jose and Abraham [28] introduced another discrete distribution based on the Mittag-Leffler function. Their distribution is a kind of generalization of the Poisson distribution in the sense that it arises when the exponential waiting time distribution in the usual Poisson process is replaced by the Mittag-Leffler distribution. The pmf of their distribution is given in Jose and Abraham [28]. In this article we take a completely different route to propose a discrete distribution based on the Mittag-Leffler function. The proposed distribution, which also arises from a queuing theory setup, is simple and extremely flexible in its shape and modality; it can model under-dispersed, equi-dispersed and over-dispersed count data. Section 2 defines the Mittag-Leffler function distribution (MLFD) and gives its basic structural properties. MLFD as a distribution in a queuing system is given in section 3. Reliability and stochastic ordering properties are discussed in section 4. Section 5 considers the computation of the generalized Mittag-Leffler function, while section 6 gives examples of applications of MLFD. The conclusion is given in section 7.

Mittag-Leffler Function Distribution: Definition and Properties
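Definition 1. A discrete random variable X is said to follow the Mittag-Leffler function distribution (MLFD) if its pmf is given by
$$P(X=k)=\frac{\lambda^{k}}{\Gamma(\alpha k+\beta)\,E_{\alpha,\beta}(\lambda)},\qquad k=0,1,2,\ldots \qquad (4)$$
where $E_{\alpha,\beta}(\lambda)$ is the generalized Mittag-Leffler function. The distribution will henceforth be denoted by MLFD$(\lambda,\alpha,\beta)$.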
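As a minimal sketch of Definition 1 (an illustration, not the authors' implementation), the pmf $\lambda^{k}/\{\Gamma(\alpha k+\beta)E_{\alpha,\beta}(\lambda)\}$ can be evaluated in Python from the log-weights $k\log\lambda-\log\Gamma(\alpha k+\beta)$, normalised by a truncated sum that stands in for $E_{\alpha,\beta}(\lambda)$. The function name `mlfd_pmf`, the truncation point `k_max` and the restriction to $\lambda,\alpha,\beta>0$ are assumptions made for this example.

```python
import math

def mlfd_pmf(lam, alpha, beta, k_max=300):
    """Return [P(X=0), ..., P(X=k_max)] for MLFD(lam, alpha, beta).

    Assumes lam > 0, alpha > 0, beta > 0 and that the probability mass
    beyond k_max is negligible (an assumption of this sketch).
    """
    # log of the unnormalised weight lam^k / Gamma(alpha*k + beta)
    log_w = [k * math.log(lam) - math.lgamma(alpha * k + beta)
             for k in range(k_max + 1)]
    shift = max(log_w)                       # guards against overflow in exp
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)                           # proportional to a truncated E_{alpha,beta}(lam)
    return [v / total for v in w]
```

With `alpha = beta = 1` the weights become $\lambda^{k}/k!$, so the output should agree with the Poisson pmf, which gives a quick sanity check.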

Probability recurrence relation
The MLFD$(\lambda,\alpha,\beta)$ pmf in (4) has a simple recurrence relation given by
$$P(X=k+1)=\frac{\lambda\,\Gamma(\alpha k+\beta)}{\Gamma(\alpha k+\alpha+\beta)}\,P(X=k),\qquad k=0,1,2,\ldots \qquad (5)$$
When $\alpha$ is a positive integer, (5) can be expressed as
$$P(X=k+1)=\frac{\lambda}{(\alpha k+\beta)(\alpha k+\beta+1)\cdots(\alpha k+\beta+\alpha-1)}\,P(X=k). \qquad (6)$$
The distribution exhibits a longish tail for $0<\alpha<1$, as the ratio of successive probabilities varies slowly as k tends to infinity (this corresponds to over-dispersion), while for $\alpha\ge 1$ this ratio tends to zero faster, implying the presence of a Poisson-type tail.
The above recurrence relation facilitates easy computation of the probabilities. The computation of the normalizing constant $E_{\alpha,\beta}(\lambda)$ is only required for $P(X=0)$. Note that the recurrence relation (or difference equation) in (5) reduces to that of the hyper-Poisson distribution [5] when $\alpha=1$, and to that of the displaced Poisson distribution [62] when, in addition, $\beta$ is a positive integer (further discussed in section 2.4).
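The recurrence can be sketched in a few lines of Python. The ratio used below is $P(X=k+1)/P(X=k)=\lambda\,\Gamma(\alpha k+\beta)/\Gamma(\alpha k+\alpha+\beta)$, which follows from the pmf $\lambda^{k}/\{\Gamma(\alpha k+\beta)E_{\alpha,\beta}(\lambda)\}$. As a shortcut of this sketch, the probabilities are normalised by the truncated sum of the weights instead of a separate evaluation of $E_{\alpha,\beta}(\lambda)$; the function name and `k_max` are hypothetical.

```python
import math

def mlfd_pmf_recurrence(lam, alpha, beta, k_max=200):
    """Probabilities P(X=0), ..., P(X=k_max) built from successive ratios.

    Assumes lam, alpha, beta > 0 and negligible mass beyond k_max.
    """
    weights = [1.0]                          # unnormalised P(X=0)
    for k in range(k_max):
        # log of P(X=k+1)/P(X=k)
        log_ratio = (math.log(lam) + math.lgamma(alpha * k + beta)
                     - math.lgamma(alpha * k + alpha + beta))
        weights.append(weights[-1] * math.exp(log_ratio))
    total = sum(weights)                     # truncated normaliser (shortcut of this sketch)
    return [w / total for w in weights]
```

For example, with `alpha = 2` and `beta = 1` the first ratio is $\lambda\,\Gamma(1)/\Gamma(3)=\lambda/2$.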

Shapes of probability mass function
The pmf of MLFD$(\lambda,\alpha,\beta)$ is plotted for a number of combinations of parameters to study the different shapes of the distribution. From the plots of the pmf it is seen that the distribution can be unimodal with a nonzero mode (see Fig. 1), can have two nonzero modes (see Fig. 8), or can be non-increasing with the mode at 0 (see Fig. 7); see section 4 for further discussion on modes.

Dependence of the pmf on the parameters:
The proportion of zeros given by Whereas when   0  the proportion zero decreases (can be seen in the Fig. 2) with the pmf proportional to

Cumulative distribution function and generating functions
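The cumulative distribution function (cdf) of MLFD$(\lambda,\alpha,\beta)$ can be written in terms of the generalized Mittag-Leffler function (Haubold et al. [21]) as
$$F(m)=P(X\le m)=1-\frac{\lambda^{m+1}\,E_{\alpha,\alpha(m+1)+\beta}(\lambda)}{E_{\alpha,\beta}(\lambda)},\qquad m=0,1,2,\ldots$$
The moment generating function (mgf) $M_X(s)$ and the factorial moment generating function (fmgf) are respectively given by
$$M_X(s)=\frac{E_{\alpha,\beta}(\lambda e^{s})}{E_{\alpha,\beta}(\lambda)},\qquad E\big[(1+s)^{X}\big]=\frac{E_{\alpha,\beta}(\lambda(1+s))}{E_{\alpha,\beta}(\lambda)}.$$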

Related distributions and connections with other families of distributions

Particular cases of MLFD
includes a number of well-known distributions as particular case: Since for reduces to the HP ) , (   distribution (Bardwell and Crow [5], p. 200 Johnson et al. [25]) is the confluent hypergeometric function. Hence An alternative form of the above pmf can be seen as reduces to the displaced Poisson distribution (see Staff [62], p. 200 of Johnson et al., [25]) with parameter  and t.
In addition, the following new distributions involving hyperbolic and error functions are also seen as particular cases.
reduces to a new distribution with parameter  and pmf
MLFD$(\lambda,\alpha,\beta)$ can be viewed as a continuous bridge between the geometric ($\alpha=0$) and hyper-Poisson ($\alpha=1$) distributions, and MLFD$(\lambda,\alpha,1)$ as a continuous bridge between the geometric ($\alpha=0$) and Poisson ($\alpha=1$) distributions, in the range of the parameter $\alpha$. This property is also shared by the COM-Poisson distribution [9].
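The end points of this bridge can be checked directly from the pmf $P(X=k)=\lambda^{k}/\{\Gamma(\alpha k+\beta)E_{\alpha,\beta}(\lambda)\}$. The short derivation below is a sketch added for illustration; it assumes $0<\lambda<1$ in the case $\alpha=0$ so that the series converges.

```latex
\begin{align*}
\alpha = 0:\quad
  & E_{0,\beta}(\lambda) = \sum_{k\ge 0}\frac{\lambda^{k}}{\Gamma(\beta)}
    = \frac{1}{\Gamma(\beta)(1-\lambda)}
  \;\Rightarrow\; P(X=k) = (1-\lambda)\lambda^{k}
  \quad\text{(geometric)},\\
\alpha = 1:\quad
  & E_{1,\beta}(\lambda) = \sum_{k\ge 0}\frac{\lambda^{k}}{\Gamma(k+\beta)}
    = \frac{{}_1F_1(1;\beta;\lambda)}{\Gamma(\beta)}
  \;\Rightarrow\; P(X=k) = \frac{\Gamma(\beta)\,\lambda^{k}}{\Gamma(\beta+k)\,{}_1F_1(1;\beta;\lambda)}
  \quad\text{(hyper-Poisson)},\\
\alpha = \beta = 1:\quad
  & E_{1,1}(\lambda) = e^{\lambda}
  \;\Rightarrow\; P(X=k) = \frac{e^{-\lambda}\lambda^{k}}{k!}
  \quad\text{(Poisson)}.
\end{align*}
```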

MLFD as weighted distributions
then for integer  it can be shown that weighted distribution with weight function , it can be shown that weighted distribution with weight function Then for integer and  it can be shown that weighted distribution with weight function

MLFD as member of some families of discrete distributions
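i. MLFD$(\lambda,\alpha,\beta)$ is a member of the generalized hypergeometric family (Kemp [30]; see Johnson et al. [25] for details). This is clear from equation (6).
ii. MLFD$(\lambda,\alpha,\beta)$ is a member of the generalized power series distribution (GPSD) family (Patil [46,47]) when $\lambda$ is the primary parameter. The GPSD has a pmf of the form $P(X=k)=a(k)\,\theta^{k}/f(\theta)$, which the MLFD satisfies with $\theta=\lambda$, $a(k)=1/\Gamma(\alpha k+\beta)$ and $f(\theta)=E_{\alpha,\beta}(\theta)$.
iii. For fixed values of the parameters $\alpha$ and $\beta$, the MLFD is also a member of the exponential family of distributions. To see this, consider a set of n independent and identically distributed observations $x_1,\ldots,x_n$; the likelihood function is given by
$$L(\lambda)=\frac{\lambda^{\sum_{i=1}^{n}x_i}}{\left[E_{\alpha,\beta}(\lambda)\right]^{n}\prod_{i=1}^{n}\Gamma(\alpha x_i+\beta)}.$$
iv. When the parameter $\alpha$ is an integer, the pgf of MLFD can be represented in terms of the confluent hypergeometric function, and hence the MLFD belongs to the generalized hypergeometric family of Kemp [30].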

Moments and related results
Denoting with respect to  given by the following formulae can be derived: The variance can also be expressed as The above results can alternatively be derived easily by first deriving 2 , and then simplifying.
Caution: In all the above expressions there is restriction on the values of  . This situation may be overcome by using the following relation repeatedly till the conditions are satisfied.

Alternative formulae for moments:
To avoid the complications that might arise in the above formulations, alternative simple formulae for moments in terms of the generalized Mittag-Leffler function of Prabhakar [52] may be presented as follows: In general,

General formulae for the r th factorial moment:
Following equations (6.5) and (6.8) of Janardan [22] the following results can be written easily is the forward difference operator with interval of differencing being equal to '1'and

Recurrence relations of moments:
The following recurrence relations hold: The relations (i) to (iii) can be proved using the general relations for GPSD or by direct manipulation while (iv) follows from difference equation in (5).

Monotonicity of the mean:
Since $dE(X)/d\lambda=\mathrm{Var}(X)/\lambda>0$, the mean is a monotonically increasing function of $\lambda$.

A useful moment identity:
The following useful moment identity for MLFD ) , , can be derived following Kattumannil and Tibilet [32]: respectively.

Index of dispersion
The index of dispersion (ID) is given by ID = Variance / Mean. Contour plots of ID for different choices of the parameters, with the other parameter values held fixed, are presented in Fig. 9 and 10. Labels on a given line indicate the value of ID on that line. In these figures, the ID of the region on the left (right) of a given line is more (less) than the ID value on that line. From Fig. 9 and 10, it is clear that the MLFD$(\lambda,\alpha,\beta)$ is very flexible with respect to the ID and is able to cater for under-, equi- and over-dispersion in count data. Interestingly, this family includes a non-Poisson distribution with equi-dispersion. We next state a theorem giving the conditions under which the distribution is equi-, under- or over-dispersed.
where a and b are arbitrary constants and 0 the ID is given by That is, Thus it follows that ,  , c (> 0) and b are arbitrary constants.
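The index of dispersion can also be explored numerically. The sketch below is an illustration only: it assumes the pmf is proportional to $\lambda^{k}/\Gamma(\alpha k+\beta)$ with $\lambda,\alpha,\beta>0$, truncates the support at a hypothetical `k_max`, and computes ID = Variance / Mean from the first two raw moments.

```python
import math

def mlfd_index_of_dispersion(lam, alpha, beta, k_max=500):
    """Variance / mean of MLFD(lam, alpha, beta) from a truncated pmf."""
    log_w = [k * math.log(lam) - math.lgamma(alpha * k + beta)
             for k in range(k_max + 1)]
    shift = max(log_w)                       # guards against overflow in exp
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    mean = sum(k * v for k, v in enumerate(w)) / total
    second = sum(k * k * v for k, v in enumerate(w)) / total
    return (second - mean ** 2) / mean
```

With $\lambda=1$ and $\beta=1$, this sketch returns an ID above 1 for $\alpha=0.5$ and below 1 for $\alpha=2$, while $\alpha=\beta=1$ (the Poisson case) gives an ID of 1 up to rounding.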

Numerical verification of Theorem 1:
We have verified the results of the above theorem by plotting against  for some pairs of values of  and  taken from line of equidispersion in the contour plots in Fig. 9 and 10. Here few representative plots are presented in Fig. 11 , like the COM-Poisson distribution [9], can be derived as the probability of the system being in the k-th state for a queuing system with state dependent service rate. Consider a queuing system with Poisson inter arrival times with parameter  , first come first serve policy, and exponential service times that depend on the system state (nth state means n number of units in the system). The mean service time in the nth state is 1 , , where,  / 1 is the normal mean service time for a unit when that unit is the only one in the system and  is the pressure coefficient, a constant reflecting the degree to which the service rate of the system is affected by the system. For the sake of completeness, the proof that the probability is the MLFD ) 1 , , (   pmf is given as follows: Following Conway and Maxwell (p. 134-35 [9]), the system differential difference equations are ) . This implies, Assuming a steady state (i.e. 0 ) ( . Similarly, from (9), we get

Reliability and stochastic ordering
Discrete lifetime models have of late been a favourite subject of many studies, since the life of a system is often measured by counting, and even when life is measured on a continuous scale the actual observations may be recorded in a way that makes a discrete model more appropriate. It is therefore important to study the reliability properties of the proposed discrete distribution. Stochastic ordering is a closely related and important area that has found applications in many diverse fields such as economics, reliability, survival analysis, insurance, finance, actuarial and management sciences (see Shaked and Shanthikumar [58]). In this section we study the reliability properties and stochastic ordering of the MLFD$(\lambda,\alpha,\beta)$.

Failure rate function:
has a log-concave probability mass function since for this distribution (Gupta et al., 1997) The following results are direct consequence of log-concavity (Mark, 1996): is a strongly unimodal distribution due to the log-concavity of its pmf (see Steutel [64]).
, the result follows.
has a non increasing pmf with a unique mode at 0 (See the pmf plots in Fig. 7 for some choices of ) , , Fig. 8 for some choices has at most an exponential tail. iv. MLFD ) , , with any other discrete distribution will also result in log concave distribution.
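Log-concavity of the pmf, that is $p_k^{2}\ge p_{k-1}\,p_{k+1}$, can be checked numerically on the log scale. The sketch below is an illustration under the assumption that the pmf is proportional to $\lambda^{k}/\Gamma(\alpha k+\beta)$ with $\lambda,\alpha,\beta>0$; the function name, the range `k_max` and the tolerance `tol` are hypothetical choices.

```python
import math

def is_log_concave(lam, alpha, beta, k_max=200, tol=1e-9):
    """Check p_k^2 >= p_{k-1} * p_{k+1} for k = 1, ..., k_max - 1."""
    log_w = [k * math.log(lam) - math.lgamma(alpha * k + beta)
             for k in range(k_max + 1)]
    # The normalising constant cancels, so unnormalised log-weights suffice.
    return all(2 * log_w[k] + tol >= log_w[k - 1] + log_w[k + 1]
               for k in range(1, k_max))
```

Because the $k\log\lambda$ terms cancel in the comparison, the check reduces to the convexity of $\log\Gamma$ along the points $\alpha k+\beta$.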

Stochastic ordering with HP:
The following result stochastically compares the MLFD ) , , with the HP ) , (   by using the likelihood ratio order.  [58] and Gupta et al., [19]). Hence the result is proved.
distribution in the likelihood ratio order i.e. X Y lr  .

Corollary:
Again Y X lr  implies X is smaller than Y in the hazard rate order and subsequently in the mean residual (MRL) life order (see Gupta et al.[19]). Symbolically, implies Y is smaller than X in the hazard rate order and subsequently in the mean residual (MRL) life order (see Gupta et al. [19]). Symbolically,

Computation of the generalized Mittag-Leffler function
Numerical computation of the generalized Mittag-Leffler function is well researched. Seybold and Hilfer [57] gave a numerical algorithm for calculating the generalized Mittag-Leffler function for arbitrary complex argument z and real parameters $\alpha$ and $\beta$ based on a Taylor series, exponentially improved asymptotics and an integral representation. If $|z|<1$, a simple way to compute the function is to sum its series term by term (see Lee et al. [38]). This avoids the computation of the gamma function. The summation is terminated when the term $a_k$ is very small. The error estimate is given by Theorem 4.1 of Seybold and Hilfer [57], which determines the number of terms N needed to keep the truncation error below a prescribed tolerance. For other values of z, asymptotic series and integral representations (Seybold and Hilfer [57], equations (2.3), (2.4) and (2.7) respectively) are employed. Error estimates are also given for these cases. Routines for computing the Mittag-Leffler function are available in many software packages, such as MATLAB (MLF(alpha, Z, P)) and Mathematica (MittagLefflerE[a, b, z]); see also Gorenflo et al. [17].
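A truncated-series evaluation for small $|z|$ can be sketched as follows. This is an illustration only: it sums $a_k=z^{k}/\Gamma(\alpha k+\beta)$ using `math.lgamma` on the log scale, which is a swapped-in convenience and not the gamma-free scheme of Lee et al. [38] or the algorithm of Seybold and Hilfer [57]. The stopping rule (term below `tol` and past its peak) and the restriction to real $z$ with $\alpha,\beta>0$ are assumptions of this sketch.

```python
import math

def mittag_leffler_series(z, alpha, beta, tol=1e-16, max_terms=5000):
    """Partial sum of E_{alpha,beta}(z) = sum_k z^k / Gamma(alpha*k + beta).

    Intended for real z of small magnitude, with alpha > 0 and beta > 0.
    """
    if z == 0.0:
        return 1.0 / math.gamma(beta)
    total, prev = 0.0, math.inf
    for k in range(max_terms):
        # |a_k| computed on the log scale to avoid overflow in Gamma
        size = math.exp(k * math.log(abs(z)) - math.lgamma(alpha * k + beta))
        sign = -1.0 if (z < 0 and k % 2) else 1.0
        total += sign * size
        if size < tol and size < prev:       # a_k is tiny and past its peak
            break
        prev = size
    return total
```

Known closed forms give a quick check: $E_{1,1}(z)=e^{z}$, $E_{2,1}(z)=\cosh\sqrt{z}$ for $z>0$, and $E_{1,2}(z)=(e^{z}-1)/z$.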

Data analysis with
A numerical optimization method is used to obtain the maximum likelihood estimates (MLEs) of the parameters required for the data fitting experiments and the likelihood ratio test. Estimates of the variances and covariances of the MLEs can be obtained from the inverse of the Fisher information matrix.
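The objective function for such an optimization can be sketched as below. This is an illustration, not the authors' code: it assumes the pmf $\lambda^{k}/\{\Gamma(\alpha k+\beta)E_{\alpha,\beta}(\lambda)\}$ with $\lambda,\alpha,\beta>0$, takes grouped frequency data as a dictionary, and approximates $\log E_{\alpha,\beta}(\lambda)$ by a series truncated at a hypothetical `k_max`. The frequencies shown are made up for the example and are not taken from the paper's tables.

```python
import math

def mlfd_log_likelihood(lam, alpha, beta, freq, k_max=500):
    """Log-likelihood of MLFD(lam, alpha, beta) for grouped counts.

    freq maps each observed count k (0 <= k <= k_max) to its frequency.
    """
    log_w = [k * math.log(lam) - math.lgamma(alpha * k + beta)
             for k in range(k_max + 1)]
    shift = max(log_w)
    # log of the truncated normalising series, via log-sum-exp
    log_norm = shift + math.log(sum(math.exp(v - shift) for v in log_w))
    return sum(f * (log_w[k] - log_norm) for k, f in freq.items())

# Made-up frequencies for illustration only (not from the paper's tables).
toy_freq = {0: 10, 1: 7, 2: 3}
```

The negative of this function could be handed to any general-purpose numerical optimizer to obtain the MLEs; for `alpha = beta = 1` it coincides with the Poisson log-likelihood.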

Likelihood ratio test
The

Numerical examples
Three data sets from the literature are considered here. The first set (Table 1.1), frequency data on insurance claims and incapacity caused by sickness or accident (Lundberg [40], also used by Phang et al. [48]), is over-dispersed with index of dispersion 2.28255; the second set (Table 2.1), frequency data on the number of sickness absences (1955-1964) from Taylor [65], is over-dispersed with index of dispersion 5.38588; and the third set ( distributions. Asymptotic variances and covariances of the MLEs are estimated from the inverse of the information matrix. The performances of the various distributions are compared using the log-likelihood value and the AIC (Akaike Information Criterion), defined as AIC = -2 ln L + 2k, where k is the number of parameters and ln L is the maximized log-likelihood for a given data set (Burnham and Anderson [4]).
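The AIC comparison amounts to a one-line computation, sketched below. The log-likelihood values are hypothetical placeholders chosen only to show how a three-parameter and a two-parameter fit would be compared; they are not results from the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical maximised log-likelihoods, for illustration only.
aic_three_param = aic(-100.0, 3)   # e.g. a three-parameter model
aic_two_param = aic(-101.5, 2)     # e.g. a two-parameter model
```

The model with the smaller AIC is preferred, so the extra parameter is justified only if it raises the maximized log-likelihood by more than one unit.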

Conclusion
A new generalization of the hyper-Poisson distribution, which is a continuous bridge between the geometric and hyper-Poisson distributions, is derived using the generalized Mittag-Leffler function. Some known and new distributions are seen as particular cases of this distribution. This new generalization belongs to the generalized power series and generalized hypergeometric families and also arises as a weighted Poisson distribution. Like the hyper-Poisson, COM-Poisson and generalized Poisson distributions, this distribution is able to model under-, equi- and over-dispersion. Although the new generalization of the hyper-Poisson distribution has an extra parameter, it is computationally no more complicated than the hyper-Poisson, since it retains the two-term probability recurrence formula and the normalizing constant, expressed in terms of the generalized Mittag-Leffler function, is readily computed. It has many interesting probabilistic and reliability properties and is found to be a better empirical model than the hyper-Poisson distribution.