New class of Lindley distributions: properties and applications

This paper introduces a new generalized class of Lindley distributions, called the T-Lindley{Y} class. The class is generated by using the quantile functions of the uniform, exponential, Weibull, log-logistic, logistic and Cauchy distributions. Statistical properties, including the modes, moments and Shannon's entropy, are discussed. Three new generalized Lindley distributions are investigated in more detail. The unknown parameters are estimated by maximum likelihood, and a simulation study is carried out. Lastly, the usefulness of the proposed class in fitting lifetime data is illustrated using four different data sets. The application section demonstrates the strength of members of the T-Lindley{Y} class in modeling both unimodal and bimodal data sets. A member of the T-Lindley{Y} class of distributions outperformed other known distributions in modeling unimodal and bimodal lifetime data sets.


Introduction and motivation
The Lindley distribution was first introduced as a distribution with one scale parameter by Lindley (1958). In recent years, researchers have given the Lindley distribution special attention because of its importance in modelling complex real lifetime data. Some researchers have studied the Lindley distribution and its properties in more detail. Ghitany et al. (2008) studied some properties of the one-parameter Lindley distribution and, in the application part, showed that it is more flexible than the exponential distribution and models lifetime data better. Other researchers have introduced more flexible generalizations by compounding the Lindley distribution with other well-known distributions. Two-parameter extensions of the Lindley distribution were investigated by Ghitany et al. (2011), Nadarajah et al. (2011), Shanker et al. (2013), and Shanker et al. (2017). More recently, another two-parameter Lindley distribution was introduced by Dey et al. (2019); it provides a better fit to skewed real data than the inverse Lindley distribution introduced by Sharma et al. (2015). Arslan et al. (2017) proposed the generalized Lindley distribution introduced by Nadarajah et al. (2011) as an alternative to the Weibull distribution when modeling wind speed data. A notable amount of attention in the literature is given to three-parameter generalizations of the Lindley distribution. Many three-parameter generalizations have been defined, analyzed and presented as competitive models to well-known distributions. These include the generalizations proposed by Zakerzadeh and Dolati (2009) and Elbatal et al. (2013), and the three-parameter Lindley distribution introduced by Ashour and Eltehiwy (2015), which extends the Lindley distribution by exponentiation.
The various three-parameter Lindley generalizations defined over the past decade have shown strength and flexibility in modelling the different shapes of lifetime data. As a result, less interest has been given to studying Lindley generalizations with more than three parameters. One of the few four-parameter generalizations of the Lindley distribution is the beta-generalized Lindley distribution proposed by Oluyede and Yang (2015). Generalizing a distribution mainly means adding flexibility to a known distribution by embedding a baseline distribution in a more capable structure. The literature of distribution theory is full of different techniques for generalizing continuous distributions to enhance their ability to model real-world data. Lee et al. (2013) discussed the different methods for generating distributions in more detail.
In this paper, we use the transform-transformer framework (T-X class) introduced by Alzaatreh et al. (2013) to generalize the one-parameter Lindley distribution, and we name the result the T-Lindley{Y} class of distributions. Alzaatreh et al. (2014) refined the T-X class by defining the T-R{Y} framework. The T-R{Y} method can be briefly defined as follows. Let T, R and Y be random variables with the respective CDFs $F_T(x) = P(T \le x)$, $F_R(x) = P(R \le x)$, and $F_Y(x) = P(Y \le x)$. The PDFs of T, R and Y are $f_T(x)$, $f_R(x)$, and $f_Y(x)$, respectively. Define the quantile function of the random variable Y as $Q_Y(p) = \inf\{y : F_Y(y) \ge p\}$, $0 < p < 1$. The CDF and the PDF of the random variable X, following the T-R{Y} family of distributions, are respectively defined as
$$F_X(x) = F_T\big(Q_Y(F_R(x))\big), \qquad f_X(x) = f_R(x)\,\frac{f_T\big(Q_Y(F_R(x))\big)}{f_Y\big(Q_Y(F_R(x))\big)}. \tag{1}$$
Generalizing distributions using the T-R{Y} framework involves adding more parameters to the generalized distribution. Hence there is more flexibility in modeling lifetime data. In recent years, many new classes of distributions have been introduced using the T-R{Y} framework as generalizations of known distributions. Alzaatreh et al. (2014) used this technique to define the T-normal{Y} family of distributions as a generalization of the normal distribution. Hamed et al. (2018) introduced a generalization of the Pareto distribution using the transform-transformer framework. Alzaghal et al. (2013) proposed an exponentiation of the T-R{Y} family of distributions by adding an extra parameter to the random variable T.
In this paper, the T-R{Y} framework is used to generalize the Lindley distribution. The main motivation for using this framework is to extend the characteristics of the baseline Lindley model so that it fits data of different shapes, including left-skewed, symmetric and bimodal data, and to provide a better fit than other distributions with the same or a larger number of parameters when modeling real-world data sets.
The rest of the paper is structured as follows. In Section 2, the T-Lindley{Y} class of distributions is defined and six different subclasses of the T-Lindley{Y} class are proposed. Some statistical properties of this new class of distributions, such as modes, moments, and Shannon's entropies, are investigated in Section 3. In Section 4, some new members of this class are introduced and studied in more detail. In Section 5, the maximum likelihood method is used to estimate the parameters of the normal-Lindley{Cauchy} distribution and a simulation study is performed. The flexibility of this new class of distributions in fitting four different shapes of data is illustrated in Section 6. Finally, a brief conclusion is given in Section 7.

The T-Lindley{Y} class of distributions
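The cumulative distribution function (CDF) and probability density function (PDF) of the one-parameter Lindley distribution are, respectively, given by
$$F_R(x) = 1 - \frac{\theta + 1 + \theta x}{\theta + 1}\, e^{-\theta x}, \qquad x > 0,\ \theta > 0,$$
and
$$f_R(x) = \frac{\theta^{2}}{\theta + 1}\,(1 + x)\, e^{-\theta x}, \qquad x > 0,\ \theta > 0.$$
Using Eq. (1) with $F_R(x)$ taken to be the Lindley CDF above, the CDF and PDF of the random variable X following the general T-Lindley{Y} class of distributions are, respectively, given as
$$F_X(x) = F_T\big(Q_Y(F_R(x))\big) \tag{4}$$
and
$$f_X(x) = f_R(x)\,\frac{f_T\big(Q_Y(F_R(x))\big)}{f_Y\big(Q_Y(F_R(x))\big)}. \tag{5}$$
Table 1 provides the six different quantile functions that are used in generating six different subclasses of the T-Lindley{Y} class of distributions. The subclasses of the T-Lindley{Y} class introduced in this paper are different generalized classes of the Lindley distribution with a maximum of three parameters.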
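As an illustration of how Eqs. (4) and (5) combine the three ingredients T, R and Y, the following Python sketch evaluates the T-Lindley{Y} CDF and PDF for a user-supplied distribution of T and quantile function of Y. The Lindley CDF and PDF are the standard one-parameter forms. The function names (`lindley_pdf`, `lindley_cdf`, `t_lindley_cdf`, `t_lindley_pdf`) and the choice of a standard normal T with a standard Cauchy Y in the example are assumptions made for this illustration; they are not code from the paper.

```python
import math


def lindley_pdf(x, theta):
    """PDF of the one-parameter Lindley distribution (x > 0, theta > 0)."""
    return theta ** 2 / (theta + 1.0) * (1.0 + x) * math.exp(-theta * x)


def lindley_cdf(x, theta):
    """CDF of the one-parameter Lindley distribution (x > 0, theta > 0)."""
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)


def t_lindley_cdf(x, theta, cdf_t, quantile_y):
    """F_X(x) = F_T(Q_Y(F_R(x))), with R the Lindley distribution."""
    return cdf_t(quantile_y(lindley_cdf(x, theta)))


def t_lindley_pdf(x, theta, pdf_t, pdf_y, quantile_y):
    """f_X(x) = f_R(x) * f_T(Q_Y(F_R(x))) / f_Y(Q_Y(F_R(x)))."""
    q = quantile_y(lindley_cdf(x, theta))
    return lindley_pdf(x, theta) * pdf_t(q) / pdf_y(q)


# Illustrative ingredients (assumptions of this sketch):
# T is standard normal, Y is standard Cauchy.
def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def cauchy_pdf(y):
    return 1.0 / (math.pi * (1.0 + y * y))


def cauchy_quantile(p):
    return math.tan(math.pi * (p - 0.5))


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0):
        F = t_lindley_cdf(x, 1.0, normal_cdf, cauchy_quantile)
        f = t_lindley_pdf(x, 1.0, normal_pdf, cauchy_pdf, cauchy_quantile)
        print(f"x={x:.1f}  F_X={F:.4f}  f_X={f:.4f}")
```

Swapping `cauchy_quantile` and `cauchy_pdf` for another quantile/density pair gives the other subclasses in the same way.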

New T-Lindley{Y} subclasses of distributions
Using the different quantile functions listed in Table 1, six new subclasses of the T-Lindley{Y} class are defined in this subsection.

T-Lindley{uniform} class of distributions
By using the quantile function of the uniform distribution, $Q_Y(p) = p$, the CDF corresponding to (4) is
$$F_X(x) = F_T\big(F_R(x)\big),$$
and the PDF corresponding to (5) is
$$f_X(x) = f_R(x)\, f_T\big(F_R(x)\big).$$

T-Lindley{exponential} class of distributions
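By using the quantile function of the exponential distribution, $Q_Y(p) = -b \log(1 - p)$, the CDF corresponding to (4) is
$$F_X(x) = F_T\big(-b \log(1 - F_R(x))\big) = F_T\big(b\, H_R(x)\big),$$
and the PDF corresponding to (5) is
$$f_X(x) = b\, h_R(x)\, f_T\big(b\, H_R(x)\big),$$
where $h_R(x) = f_R(x)/(1 - F_R(x))$ and $H_R(x) = -\log(1 - F_R(x))$ are the hazard and cumulative hazard functions of the Lindley distribution, respectively. Therefore, the T-Lindley{exponential} class of distributions arises from the hazard function of the Lindley distribution.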

T-Lindley{Weibull} class of distributions
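By using the quantile function of the Weibull distribution, $Q_Y(p) = \beta(-\log(1 - p))^{1/\alpha}$, the CDF corresponding to (4) is
$$F_X(x) = F_T\big(\beta\, (H_R(x))^{1/\alpha}\big),$$
and the PDF corresponding to (5) is
$$f_X(x) = \frac{\beta}{\alpha}\, h_R(x)\, (H_R(x))^{1/\alpha - 1}\, f_T\big(\beta\, (H_R(x))^{1/\alpha}\big), \tag{8}$$
where $h_R(x) = f_R(x)/(1 - F_R(x))$ and $H_R(x) = -\log(1 - F_R(x))$ are the hazard and cumulative hazard functions of the Lindley distribution. Note that, if α = 1 in Eq. (8), then the PDF of the T-Lindley{Weibull} class of distributions reduces to the PDF of the T-Lindley{exponential} class of distributions (with b = β).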

T-Lindley{Log-logistic} class of distributions
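By using the quantile function of the log-logistic distribution, $Q_Y(p) = a(p/(1 - p))^{1/b}$, the CDF corresponding to (4) is
$$F_X(x) = F_T\!\left(a \left(\frac{F_R(x)}{1 - F_R(x)}\right)^{1/b}\right), \tag{9}$$
and the PDF corresponding to (5) is
$$f_X(x) = \frac{a}{b}\, f_R(x)\, (F_R(x))^{1/b - 1}\, (1 - F_R(x))^{-1/b - 1}\, f_T\!\left(a \left(\frac{F_R(x)}{1 - F_R(x)}\right)^{1/b}\right). \tag{10}$$
Note that, if a = b = 1, then the family of distributions in Eq. (10) arises from the odds of the Lindley distribution and is given by
$$f_X(x) = \frac{f_R(x)}{(1 - F_R(x))^{2}}\, f_T\!\left(\frac{F_R(x)}{1 - F_R(x)}\right).$$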

T-Lindley{Logistic} class of distributions
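By using the quantile function of the logistic distribution, $Q_Y(p) = a + b \log(p/(1 - p))$, the CDF corresponding to (4) is
$$F_X(x) = F_T\!\left(a + b \log\frac{F_R(x)}{1 - F_R(x)}\right),$$
and the PDF corresponding to (5) is
$$f_X(x) = \frac{b\, f_R(x)}{F_R(x)\,(1 - F_R(x))}\, f_T\!\left(a + b \log\frac{F_R(x)}{1 - F_R(x)}\right). \tag{11}$$
If a = 0 and b = 1, then the family of distributions in Eq. (11) arises from the logit function of the Lindley distribution and is given by
$$f_X(x) = \frac{f_R(x)}{F_R(x)\,(1 - F_R(x))}\, f_T\!\left(\log\frac{F_R(x)}{1 - F_R(x)}\right).$$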

T-Lindley{Cauchy} class of distributions
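By using the quantile function of the Cauchy distribution, $Q_Y(p) = \tan(\pi(p - 0.5))$, the CDF corresponding to (4) is
$$F_X(x) = F_T\big(\tan(\pi(F_R(x) - 0.5))\big), \tag{12}$$
and the PDF corresponding to (5) is
$$f_X(x) = \pi\, f_R(x)\, \sec^{2}\!\big(\pi(F_R(x) - 0.5)\big)\, f_T\big(\tan(\pi(F_R(x) - 0.5))\big). \tag{13}$$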

Some structural properties of the T-Lindley{Y} class of distributions
In this section, some structural properties of the proposed class of distributions are discussed in detail. Proofs are not provided for obvious results.
Lemma 1 Let T be a random variable with PDF $f_T(x)$. Then the random variable $X = Q_R(F_Y(T))$ follows the T-Lindley{Y} class of distributions, where $Q_R(\cdot)$ is the quantile function of the Lindley distribution. As a result, X can be simplified to $X = K_{W_{-1}}(1 - F_Y(T))$, where $K_{W_{-1}}(u) = -1 - \frac{1}{\theta} - \frac{1}{\theta}\, W_{-1}\!\left(-(\theta + 1)\, u\, e^{-(\theta + 1)}\right)$ and $W_{-1}$ denotes the negative branch of the Lambert W function. For more details about the negative branch of the Lambert W function, see Lazri and Zeghdoudi (2016).
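The Lambert W representation of the Lindley quantile function used in Lemma 1 follows from a short calculation. The steps below are a sketch of that standard derivation, starting from the Lindley CDF $F_R(x) = 1 - \frac{\theta + 1 + \theta x}{\theta + 1} e^{-\theta x}$; the auxiliary variable $u$ is introduced only for this illustration.

```latex
\begin{align*}
1 - p &= \frac{\theta + 1 + \theta x}{\theta + 1}\, e^{-\theta x}
  && \text{(set } F_R(x) = p\text{)} \\
-(\theta + 1)(1 - p)\, e^{-(\theta + 1)} &= -u\, e^{-u},
  \qquad u = \theta + 1 + \theta x
  && \text{(multiply by } (\theta + 1)\, e^{-(\theta + 1)} \text{ and negate)} \\
-u &= W_{-1}\!\left(-(\theta + 1)(1 - p)\, e^{-(\theta + 1)}\right)
  && \text{(since } -u \le -(\theta + 1) < -1\text{)} \\
Q_R(p) &= -1 - \frac{1}{\theta}
  - \frac{1}{\theta}\, W_{-1}\!\left(-(\theta + 1)(1 - p)\, e^{-(\theta + 1)}\right).
\end{align*}
```

Setting $p = F_Y(T)$ gives the transformation $X = Q_R(F_Y(T))$ of Lemma 1. Because $(\theta + 1)e^{-(\theta + 1)} < 1/e$ for every θ > 0, the argument of $W_{-1}$ lies in $(-1/e, 0)$ for all 0 < p < 1, so the negative branch is well defined.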
The importance of Lemma 1 is that it shows the relationship between the random variable X and the random variable T. As an example, we can generate a random variable X that follows the T-Lindley{Cauchy} distribution in Eq. (12) by first simulating the random variable T from the PDF $f_T(x)$ and then computing $X = K_{W_{-1}}(0.5 - (\arctan T)/\pi)$, which has the CDF $F_X(x)$.
Lemma 2 Let $Q_X(p)$, $0 < p < 1$, denote the quantile function of the random variable X. Then the quantile function for the T-Lindley{Y} class is given by $Q_X(p) = Q_R(F_Y(Q_T(p)))$.

Theorem 1 The mode(s) of the T-Lindley{Y} class are the solutions of the equation
The mode(s) of $f_X(x)$ are then obtained by solving the equation R(x) = 0.

Hamed and Alzaghal Journal of Statistical Distributions and Applications (2021) 8:11
In Section 4, the normal-Lindley{Cauchy} distribution is presented as an example of a bimodal distribution, which means that the mode equation in Corollary 3 (vi) can have more than one solution.
The entropy of a random variable X is a measure of variation of uncertainty. Entropy has several applications in information theory, physics, chemistry and engineering. The Shannon's entropy for a continuous random variable X with PDF $f(x)$ is defined as $\eta_X = E\{-\log f(X)\}$ (Shannon 1948).

Theorem 2 The Shannon's entropy for the T-Lindley{Y} class is given by
where $\eta_T$ is the Shannon's entropy for the random variable T and $\mu_X$ is the mean of the random variable X.
Proof By the definition of the Shannon entropy, $\eta_X = E\{-\log f_X(X)\}$. Using the fact that the random variable $T = Q_Y\{F_R(X)\}$ for the T-Lindley{Y} class, $\eta_X$ can be written as
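The entropy expression of Theorem 2 can be recovered in a few lines from the T-R{Y} density $f_X(x) = f_R(x)\, f_T(Q_Y(F_R(x))) / f_Y(Q_Y(F_R(x)))$ and the Lindley PDF $f_R(x) = \frac{\theta^2}{\theta + 1}(1 + x)e^{-\theta x}$. The following is a reconstruction from those definitions, offered as a sketch; the arrangement of terms in the authors' own display may differ.

```latex
\begin{align*}
\eta_X &= E\{-\log f_X(X)\} \\
       &= -E\{\log f_R(X)\} - E\{\log f_T(T)\} + E\{\log f_Y(T)\},
          \qquad T = Q_Y\{F_R(X)\} \\
       &= \eta_T + E\{\log f_Y(T)\} - E\{\log f_R(X)\}, \\
\log f_R(x) &= 2\log\theta - \log(\theta + 1) + \log(1 + x) - \theta x, \\
\eta_X &= \eta_T + E\{\log f_Y(T)\} + \log\frac{\theta + 1}{\theta^{2}}
          - E\{\log(1 + X)\} + \theta\,\mu_X .
\end{align*}
```

The last line involves only $\eta_T$, the mean $\mu_X$, and two expectations, which is consistent with the quantities named in the statement of the theorem.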

Theorem 3 The r th non-central moments for the T-Lindley{Y} class of distributions are given by
, and $c_{n,r} = \binom{r}{j}\,(-(\theta + 1))^{r-j}$.
Proof From Lemma 1, By using the binomial expansion and the series expansion for the Lambert $W_{-1}$ function, whenever |z| < 1/e, $X^r$ can be written as where $c_n$ and $a_n$ are defined in the statement of Theorem 3. Therefore, Gradshteyn and Ryzhik (2007), where $b_{n,r}$ can be obtained from the recurrence relation defined in Theorem 3.

Corollary 5 Based on Theorem 3, the r th non-central moments for the (i) T-Lindley{uniform}, (ii) T-Lindley{exponential}, (iii) T-Lindley{Weibull}, (iv) T-Lindley{log-logistic}, (v) T-Lindley{logistic}, and (vi) T-Lindley{Cauchy} classes of distributions, respectively, are given by
The next theorem gives the mean deviation from the mean, $D(\mu) = E|X - \mu|$, and the mean deviation from the median, $D(M) = E|X - M|$, for the T-Lindley{Y} class of distributions, where μ and M are the mean and median of X.

Theorem 4 The D(μ) and D(M) for the T-Lindley{Y} class of distributions, respectively, are given by
Proof For a nonnegative random variable X, By using the series expansion of the Lambert W function given in Eq. (14), $Q_R(\cdot)$ can be written as where $a_n = \frac{n^{n-2}}{(n-1)!}\,(\theta + 1)^{n}\, e^{-n(\theta + 1)}$. In turn, implies the result in Theorem 4.

Some members of the T-Lindley{Y} class of distributions
In this section, three new distributions of the T-Lindley{Y} class are studied. The first is a member of the T-L{E} subclass, the second is a member of the T-L{LL} subclass, and the last is a member of the T-L{C} subclass.
In Fig. 1, various plots of the Weibull-Lindley{exponential} (W-L{E}) distribution are provided for different values of the parameters θ, α, and β. The graphs show that the W-L{E} can be monotonically decreasing (reversed J-shape) or unimodal, and that it can be skewed to the right, symmetric, or skewed to the left.
Using the general properties of the T-Lindley{Y} class of distributions derived in Section 3, the following properties of the W-L{E} distribution are obtained:
(i) Quantile function: By using Corollary 2 part (ii), the quantile function of the W-L{E} is given by
(ii) Mode: By using Corollary 3 part (ii), the mode of the W-L{E} distribution is the solution of the following equation, which can be evaluated numerically
(iii) The r th non-central moments: By using Corollary 5 part (ii), the r th non-central moments of the W-L{E} distribution are given by
(iv) Mean deviations: By using Theorem 4 and Corollary 6 part (ii), the mean deviation from the mean and the mean deviation from the median of the W-L{E} are given by where $I_q$ is given by

The exponential-Lindley{Log-logistic} distribution
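Let the random variable T follow the exponential distribution with parameter γ and CDF $F_T(x) = 1 - e^{-x/\gamma}$. Using Eq. (9), the CDF of the exponential-Lindley{Log-logistic} (E-L{LL}) distribution is defined as
$$F_X(x) = 1 - \exp\!\left\{-\alpha \left(\frac{F_R(x)}{1 - F_R(x)}\right)^{1/b}\right\},$$
where α = a/γ. Using Eq. (10), the PDF of the E-L{LL} distribution is given by
$$f_X(x) = \frac{\alpha}{b}\, f_R(x)\, (F_R(x))^{1/b - 1}\, (1 - F_R(x))^{-1/b - 1} \exp\!\left\{-\alpha \left(\frac{F_R(x)}{1 - F_R(x)}\right)^{1/b}\right\}.$$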

The Normal-Lindley{Cauchy} distribution
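Let the random variable T follow the normal distribution with parameters μ and σ. Then the CDF and the PDF of T are $F_T(x) = \Phi((x - \mu)/\sigma)$ and $f_T(x) = \sigma^{-1}\phi((x - \mu)/\sigma)$, respectively, where Φ and φ are the standard normal CDF and PDF. Using Eq. (12), the CDF of the normal-Lindley{Cauchy} (N-L{C}) distribution is defined as
$$F_X(x) = \Phi\!\left(\frac{\tan(\pi(F_R(x) - 0.5)) - \mu}{\sigma}\right).$$
Using Eq. (13), the PDF of the N-L{C} distribution is given by
$$f_X(x) = \frac{\pi}{\sigma}\, f_R(x)\, \sec^{2}\!\big(\pi(F_R(x) - 0.5)\big)\, \phi\!\left(\frac{\tan(\pi(F_R(x) - 0.5)) - \mu}{\sigma}\right).$$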

Estimation and simulation for the parameters of the N-L{C} distribution
In this section, the unknown parameters of the N-L{C} distribution are estimated using the maximum likelihood (ML) estimation method. Then, a simulation study to assess the performance of the maximum likelihood estimates (MLEs) is presented.
The derivatives of the log-likelihood function with respect to the parameters θ, μ, and σ, denoted by $U_\theta$, $U_\mu$, and $U_\sigma$, respectively, are derived analytically. By setting $U_\mu = 0$ and $U_\sigma = 0$, the MLEs of μ and σ are obtained in closed form as functions of θ, as given in Eqs. (16) and (17). Hence, we first maximize the log-likelihood function with respect to θ, which gives the MLE θ̂, then substitute θ̂ into Eq. (16) to find the MLE μ̂ for the parameter μ, and substitute θ̂ and μ̂ into Eq. (17) to find the MLE σ̂ for the parameter σ. The SAS software was used to run all the needed analyses. The initial value for the parameter θ is obtained by assuming that the random sample $x_i$, i = 1, 2, ..., n, comes from a Lindley distribution with parameter θ.
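The two-stage procedure described above (maximize over θ, then obtain μ̂ and σ̂ in closed form) amounts to a profile likelihood. The Python sketch below illustrates it under stated assumptions. It uses the N-L{C} density implied by Eq. (13) with a normal T, namely log f_X(x) = log π + log f_R(x) + log(1 + t²) + log f_T(t) with t = tan(π(F_R(x) − 0.5)). It takes μ̂ and σ̂ to be the sample mean and the divisor-n standard deviation of the transformed values t_i, which is what setting U_μ = 0 and U_σ = 0 yields for a normal T. It replaces the SAS optimizer with a simple grid plus golden-section search, started from the value of θ that matches the Lindley mean (θ + 2)/(θ(θ + 1)) to the sample mean. The function names, the search range, and the toy data are hypothetical.

```python
import math


def lindley_cdf(x, theta):
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)


def lindley_logpdf(x, theta):
    return 2.0 * math.log(theta) - math.log(theta + 1.0) + math.log1p(x) - theta * x


def transformed(data, theta):
    """t_i = Q_Y(F_R(x_i)) for the Cauchy quantile Q_Y(p) = tan(pi (p - 0.5))."""
    return [math.tan(math.pi * (lindley_cdf(x, theta) - 0.5)) for x in data]


def loglik(theta, mu, sigma, data):
    """N-L{C} log-likelihood for a normal T with mean mu and std. deviation sigma."""
    total = 0.0
    for x, t in zip(data, transformed(data, theta)):
        z = (t - mu) / sigma
        log_normal = -math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - 0.5 * z * z
        # 1 / f_Y(t) = pi (1 + t^2) for the standard Cauchy density f_Y.
        total += math.log(math.pi) + math.log1p(t * t) + lindley_logpdf(x, theta) + log_normal
    return total


def profile(theta, data):
    """Profile log-likelihood in theta, with the closed-form mu_hat and sigma_hat."""
    t = transformed(data, theta)
    mu = sum(t) / len(t)
    sigma = math.sqrt(sum((ti - mu) ** 2 for ti in t) / len(t))
    return loglik(theta, mu, sigma, data), mu, sigma


def lindley_start(data):
    """Solve (theta + 2) / (theta (theta + 1)) = sample mean for theta > 0."""
    m = sum(data) / len(data)
    return (-(m - 1.0) + math.sqrt((m - 1.0) ** 2 + 8.0 * m)) / (2.0 * m)


def fit_nlc(data):
    """Return (theta_hat, mu_hat, sigma_hat, log-likelihood) from a 1-D search."""
    theta0 = lindley_start(data)
    # Log-spaced grid one decade either side of the starting value (assumed range).
    grid = [theta0 * 10.0 ** (k / 20.0) for k in range(-20, 21)]
    best = max(range(len(grid)), key=lambda i: profile(grid[i], data)[0])
    lo, hi = grid[max(best - 1, 0)], grid[min(best + 1, len(grid) - 1)]
    g = (math.sqrt(5.0) - 1.0) / 2.0  # golden-section ratio
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = profile(c, data)[0], profile(d, data)[0]
    while hi - lo > 1e-8:
        if fc > fd:
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = profile(c, data)[0]
        else:
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = profile(d, data)[0]
    # Keep whichever of the grid optimum and the refined point is better.
    theta = max([grid[best], 0.5 * (lo + hi)], key=lambda th: profile(th, data)[0])
    ll, mu, sigma = profile(theta, data)
    return theta, mu, sigma, ll


if __name__ == "__main__":
    toy = [0.3, 0.5, 0.8, 1.1, 1.3, 1.7, 2.4, 3.0]  # made-up values, not from the paper
    print(fit_nlc(toy))
```

A one-dimensional search suffices here because μ and σ are profiled out; a general-purpose optimizer over all three parameters would be an alternative design, and the search is only local to the bracket around the best grid point.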

Simulation
A simulation is used to investigate the performance of the MLEs for the parameters of the N-L{C} distribution. To generate a random sample from the N-L{C} distribution, we first generate a random sample ($t_i$, i = 1, 2, ..., n) from the normal distribution with parameters μ and σ, and then apply the transformation in Corollary 1 part (vi), $X = K_{W_{-1}}(0.5 - (\arctan T)/\pi)$; the resulting random sample follows the N-L{C} distribution. Five different sample sizes are considered (n = 25, 50, 100, 200, 500) with six different combinations of parameters (θ = 0.5, 1.5, 2, μ = 0, 0.5, 1, 2, σ = 0.5, 1.5, 2). For each parameter combination and each sample size, the simulation process is repeated 500 times. Table 2 gives the average biases (actual − estimated) and the standard deviations of θ̂, μ̂ and σ̂. The table indicates that the efficiency of the ML estimation method increases with the sample size: both the bias and the standard deviation become smaller. Table 2 thus shows that the ML estimation method is an appropriate technique for estimating the parameters of the N-L{C} distribution. Similar estimation analysis was conducted
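The sampling scheme described above can be sketched in Python as follows. The sketch computes the Lindley quantile through the negative Lambert W branch, Q_R(p) = −1 − 1/θ − W_{−1}(−(θ + 1)(1 − p)e^{−(θ + 1)})/θ, and evaluates W_{−1} with a small Newton iteration so that only the standard library is needed. The function names, the seed, and the parameter values are illustrative assumptions, σ is taken to be the standard deviation of T, and this is not the SAS code used in the paper.

```python
import math
import random


def lambert_w_minus1(z, tol=1e-14, max_iter=100):
    """Negative branch W_{-1}(z) for -1/e <= z < 0, by Newton's method.

    Solves w + log(-w) = log(-z) on w <= -1, which is equivalent to w e^w = z.
    """
    if not z < 0.0:
        raise ValueError("W_{-1} requires a negative argument")
    log_neg_z = math.log(-z)
    u = -log_neg_z - 1.0
    if u <= 0.0:
        # At (or, through rounding, just below) the branch point z = -1/e.
        return -1.0
    w = -1.0 - math.sqrt(2.0 * u) - u  # starting value to the left of the root
    for _ in range(max_iter):
        step = (w + math.log(-w) - log_neg_z) * w / (w + 1.0)
        w -= step
        if abs(step) <= tol * abs(w):
            break
    return w


def lindley_cdf(x, theta):
    return 1.0 - (theta + 1.0 + theta * x) / (theta + 1.0) * math.exp(-theta * x)


def lindley_quantile(p, theta):
    """Q_R(p) for the one-parameter Lindley distribution, 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z = -(theta + 1.0) * (1.0 - p) * math.exp(-(theta + 1.0))
    return -1.0 - 1.0 / theta - lambert_w_minus1(z) / theta


def sample_nlc(n, theta, mu, sigma, rng):
    """Draw n values: T ~ N(mu, sigma), p = F_Y(T) for the standard Cauchy, X = Q_R(p)."""
    sample = []
    for _ in range(n):
        t = rng.gauss(mu, sigma)
        p = 0.5 + math.atan(t) / math.pi  # Cauchy CDF evaluated at t
        sample.append(lindley_quantile(p, theta))
    return sample


if __name__ == "__main__":
    rng = random.Random(2021)
    xs = sample_nlc(5, theta=1.5, mu=0.5, sigma=1.0, rng=rng)
    print([round(x, 4) for x in xs])
```

Repeating `sample_nlc` for each sample size and parameter combination, and refitting each replicate, would reproduce the structure of the simulation study; the fitting step is not included in this sketch.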

Applications of some T-Lindley{Y} distributions
In this section, the applicability of the W-L{E}, E-L{LL} and N-L{C} distributions, as members of the T-Lindley{Y} class, in modeling real data sets is presented. Four different data sets, with right-skewed, left-skewed, symmetric and bimodal shapes, are considered. The flexibility of the T-L{Y} members is compared with that of other well-known distributions. The computations and the statistical analysis for the different applications were done using the SAS software. For each application, the ML estimation method is used to estimate the parameters of the fitted distributions. The initial value of the parameter θ for the W-L{E} and E-L{LL} distributions is obtained in the same manner as for the N-L{C} distribution, and the initial value for the rest of their parameters is set to 1.
To compare the different fitted models, the following goodness-of-fit measures were computed: twice the negative log-likelihood (−2 log l), the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Kolmogorov-Smirnov (K-S) statistic with its corresponding p-value. We have also considered the Anderson-Darling (A*) and Cramér-von Mises (W*) statistics; see Chen and Balakrishnan (1995) for details. In general, a smaller value of any of these statistics corresponds to a better fit to the data. The exception is the p-value of the K-S test: the higher the p-value, the better the fit. The first three applications illustrate the different shapes of unimodal data sets: right skewed, symmetric and left skewed. The fourth and last application represents the modeling of a bimodal data set.
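For concreteness, the likelihood-based criteria and the K-S statistic listed above can be computed as in the following Python sketch. It uses the textbook definitions AIC = 2k − 2 log l and BIC = k log n − 2 log l, where k is the number of fitted parameters and n is the sample size, together with the usual empirical-CDF form of the K-S statistic. The function names are illustrative, and the A* and W* statistics of Chen and Balakrishnan (1995) are not covered here.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for a model with k fitted parameters."""
    return 2.0 * k - 2.0 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at the i-th order statistic.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


if __name__ == "__main__":
    # Toy numbers, not taken from the paper's tables.
    print(aic(-100.0, 3), bic(-100.0, 3, 100))
    print(ks_statistic([0.7, 0.1, 0.4], lambda x: x))
```

Because BIC penalizes each extra parameter by log n rather than 2, it favors the three-parameter models more strongly than AIC does whenever n exceeds about 7.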

The United Kingdom quarterly gas consumption between the years 1960-1986
In this first application, the quarterly logged demand for gas in the United Kingdom between the years 1960-1986 is used. The gas consumption data set is heavily skewed to the right. Tahir et al. (2016) fitted this data set using the Weibull-Dagum distribution (WD), which is defined based on the Weibull-G class proposed by Bourguignon et al. (2014). The flexibility of the WD in modeling this data set was compared to that of the Beta-Dagum distribution (BD) introduced by Domma and Condino (2013). The BD distribution is a sub-distribution of the beta-G class presented by Eugene et al. (2002). The five-parameter BD, with its two additional positive shape parameters, placed second to the WD in fitting this data set. The results of Tahir et al. (2016) for the WD and BD are included in Tables 3 and 4. To examine the flexibility of the three three-parameter members of the T-L{Y} class in fitting this data set, two other competitive three-parameter Lindley generalizations were included in the comparison. The first is the Beta Lindley distribution (BL) proposed by Merovci and Sharma (2014), and the second is the Generalized Lindley distribution (GL) due to Zakerzadeh and Dolati (2009). On examining the results in Tables 3 and 4, we observe that all the distributions, and specifically the T-L{Y} members, give an adequate fit to the data. However, the three-parameter N-L{C} distribution provides the smallest −2 log l, AIC, BIC, K-S, A* and W* values and the highest K-S p-value compared to the other six competing distributions. This places the N-L{C} first in fitting this right-skewed data set among all the considered models, including the four-parameter WD and the five-parameter BD distributions.
Figure 6 displays the histogram and the fitted density functions for the UK gas consumption data set.

The annual maximum temperatures at England cities
The second data set is approximately symmetric, has 80 observations, and consists of the annual maximum temperatures recorded in Oxford and Worthing in England between the years 1901-1980. The data were first analyzed by Chandler and Bate (2007), and recently Alzaatreh et al. (2015) used members of the Weibull-gamma{Y} family to fit this data set. The results of the four-parameter Weibull-gamma{exponential} (W-G{E}) and Weibull-gamma{log-logistic} (W-G{LL}) distributions are included in this study. In addition, the flexibility of the W-L{E}, E-L{LL} and N-L{C} distributions in fitting this data set is compared to the performance of the BL and GL distributions (see application one).
With the lowest AIC, BIC and A* values, the three-parameter N-L{C} distribution provides a good fit to this data set compared to the other competing distributions. With the lowest W* and K-S values and the highest p-value for the K-S test, the four-parameter W-G{LL} also provides a good fit to this data set. However, with one fewer parameter and very similar values of the K-S and W* statistics, the N-L{C} is (once again) considered the best in fitting this data set. See Tables 5 and 6.
The histogram of the annual maximum temperature and the fitted PDFs of the W-L{E}, N-L{C}, W-G{E} and W-G{LL} distributions are presented in Fig. 7.

Time to AIDS
The third data set consists of the times in years to infection with AIDS for 295 patients. Those patients were infected with the AIDS virus through a contaminated blood transfusion, and the time in years it took each of them to develop AIDS was measured from the date of infection. This data set is taken from Klein and Moeschberger (1997). Recently, Asgharzadeh et al. (2018) fitted this left-skewed data set using the Weibull Lindley distribution (WL) due to Asgharzadeh et al. (2018) and the extended Lindley distribution (EL) due to Bakouch et al. (2015). Both of these Lindley generalizations provided an adequate fit to this data set under multiple measures. In this comparison, the same T-L{Y} members used in the previous applications, W-L{E}, E-L{LL} and N-L{C}, were fitted to this data set in addition to the BL, GL, WL and EL distributions defined earlier. The parameter estimates and the various goodness-of-fit measures for these seven distributions are recorded in Tables 7 and 8. The goodness-of-fit measures show that the three-parameter T-L{Y} members compete well with the other distributions. Based on all of the goodness-of-fit measures used, the E-L{LL} ranks first in fitting this data set, with the lowest values in Tables 7 and 8. While the WL and EL distributions provided an adequate fit, they ranked second and third in fitting this data set, with the second and third lowest values of the goodness-of-fit statistics. This application illustrates the flexibility of the E-L{LL} distribution in fitting a left-skewed data set compared to other well-known Lindley generalizations. In Fig. 8, the histogram of the time to AIDS data set and the fitted PDFs are presented.
The three previous applications show the flexibility of the W-L{E}, E-L{LL} and N-L{C} members of the T-L{Y} class of distributions in fitting the different shapes taken by unimodal data sets, including left-skewed, right-skewed and symmetric shapes. The fourth and last application illustrates the flexibility of the N-L{C} distribution, a member of the T-L{Y} class of distributions, in fitting a bimodal data set.

Times to death of psychiatric patients
The bimodal data set consists of the times to death of twenty-six psychiatric patients admitted to the University of Iowa hospital during the period 1935-1948. This data set is taken from Klein and Moeschberger (1997). Recently, Alzaghal and Hamed (2019) analyzed this data set using the bimodal normal-Lomax{Cauchy} distribution (N-Lo{C}). Tables 9 and 10 provide the parameter estimates and the goodness-of-fit results for the distributions included in this comparison. The performance of the three-parameter N-L{C} distribution in modeling this data set was compared to that of the following competitive bimodal distributions: the four-parameter beta-normal distribution (BN) defined by Famoye et al. (2004), the four-parameter Weibull-gamma{log-logistic} distribution (W-G{LL}) introduced by Alzaatreh et al. (2015), the three-parameter logistic-normal{logistic} distribution (L-N{L}), a member of the T-normal{Y} family introduced by Alzaatreh et al. (2014), and the four-parameter N-Lo{C}. Finally, the three-parameter WL distribution was also fitted to this data set. The N-L{C} distribution, with only three parameters, fitted this data set well, with the smallest AIC, BIC and K-S statistics and the highest K-S p-value. The four-parameter N-Lo{C} distribution had the smallest A* and W* values and the same −2 log l value as the N-L{C}, making it a strong competitor to the N-L{C} distribution in fitting this data set. However, with one fewer parameter, the N-L{C} is considered to provide the superior fit to this data set. Comparing only the three-parameter distributions, the WL distribution ranks second after the N-L{C} distribution, with the second smallest AIC, BIC, K-S, A* and W* values and the second largest p-value based on the K-S statistic. In Fig. 9, the bimodality of the data set is clearly captured by the N-L{C} distribution, showing the superiority of the N-L{C} in fitting this data set, followed by the N-Lo{C} fit.

Conclusion
In this paper, we proposed a new class of distributions, called the T-Lindley{Y} class of distributions. This new Lindley generalization is based on the T-R{Y} methodology. The T-Lindley{Y} class of distributions has a variety of shapes, ranging from unimodal to bimodal. Therefore, members of this class can effectively be used in analyzing unimodal as well as bimodal real-world data, as presented in the application section. Different statistical properties of the proposed class of distributions are investigated. Six new subclasses based on the quantile functions of the uniform, exponential, Weibull, log-logistic, logistic and Cauchy distributions are introduced. Three members from three different subclasses are studied in more detail. A simulation analysis is carried out to study the performance of the maximum likelihood estimation method in estimating the unknown parameters of the three-parameter N-L{C} distribution. In the application section, the N-L{C} distribution provided the best fit for three of the four data sets in comparison to other known distributions.
Fig. 9 Fitted PDFs for the times to death of psychiatric patients' data set