The Marshall-Olkin extended Weibull family of distributions
Journal of Statistical Distributions and Applications volume 1, Article number: 9 (2014)
Abstract
We introduce a new class of models called the Marshall-Olkin extended Weibull family of distributions based on the work by Marshall and Olkin (Biometrika 84:641–652, 1997). The proposed family includes as special cases several models studied in the literature such as the Marshall-Olkin Weibull, Marshall-Olkin Lomax, Marshall-Olkin Fréchet and Marshall-Olkin Burr XII distributions, among others. It defines at least twenty-one special models, thirteen of which are new. We study some of its structural properties including moments, generating function, mean deviations and entropy. We obtain the density function of the order statistics and their moments. Special distributions are investigated in some detail. We derive two classes of entropy and one class of divergence measures which can be interpreted as new goodness-of-fit quantities. The method of maximum likelihood for estimating the model parameters is discussed for uncensored and multi-censored data. We perform a simulation study using a Markov chain Monte Carlo method in order to establish the accuracy of these estimators. The usefulness of the new family is illustrated by means of two real data sets.
Mathematics Subject Classification (2010)
60E05; 62F03; 62F10; 62P10
1 Introduction
The Weibull distribution has assumed a prominent position as a statistical model for data from reliability, engineering and biological studies (McCool 2012). This model has been exhaustively used for describing hazard rates, an important quantity in survival analysis. In the context of monotone hazard rates, some results from the literature suggest that the Weibull law is a reasonable choice due to its negatively and positively skewed density shapes. However, this distribution is not a good model for describing phenomena with non-monotone failure rates, which can be found in data from applications in reliability and biological studies. Thus, extended forms of the Weibull model have been sought in many applied areas. As a solution to this issue, the inclusion of additional parameters in a well-defined distribution has been indicated as a good methodology for providing more flexible new classes of distributions.
Marshall and Olkin (1997) derived an important method of including an extra shape parameter in a given baseline model, thus defining an extended distribution. The Marshall and Olkin (“MO” for short) transformation furnishes a wide range of behaviors with respect to the baseline distribution. The geometrical and inferential properties associated with the generated distribution depend on the values of the extra parameter. These characteristics provide more flexibility to the MO generated distributions. Considering the proportional odds model, Sankaran and Jayakumar (2008) presented a detailed discussion about the physical interpretation of the MO family.
This family is related to the odds ratio associated with the baseline distribution. Let X be a random variable which describes the lifetime of each individual in a population with a vector of p covariates $z=(z_1,\ldots,z_p)^\top$, where $(\cdot)^\top$ denotes the transposition operator. Then, the survival function of X is given by
$$\bar F(x;z)=\frac{k(z)\,\bar G(x)}{1-[1-k(z)]\,\bar G(x)}, \tag{1}$$
where $k(z)=\lambda_G(x)/\lambda_F(x;z)$ is a non-negative function such that z is independent of the time x, $\lambda_F(x;z)=F(x;z)/\bar F(x;z)$ is the proportional odds model [for a discussion about such modeling, see Sankaran and Jayakumar (2008)] and $\lambda_G(x)=G(x)/\bar G(x)$ represents an arbitrary odds for the baseline distribution with cdf G(x) and survival function $\bar G(x)=1-G(x)$.
In this paper, we consider k(z)=δ. Before proceeding, however, it is important to highlight two properties of the MO transformation: (i) stability and (ii) geometric extreme stability (Marshall and Olkin 1997). In other words, the MO distribution possesses a stability property in the sense that if the method is applied twice, it yields a distribution of the same family. In addition, the following stochastic behavior can also be verified: let {X_1,…,X_N} be a random sample from the population random variable equipped with the survival function (1) at k(z)=δ. Suppose that N has the geometric distribution with probability p and that this quantity is independent of X_i, for i=1,…,N. Then, U=min(X_1,…,X_N) and V=max(X_1,…,X_N) are random variables having survival functions (1) such that k(z) can be equal to p and p^{−1}, respectively, i.e., the MO transform satisfies the geometric extreme stability property.
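The geometric extreme stability property can be checked numerically in its simplest case. The sketch below is only illustrative: it assumes that the X_i follow a unit-rate exponential baseline (not a general MO distribution), that N takes values in {1, 2, …} with P(N=n)=p(1−p)^{n−1}, and it compares a Monte Carlo estimate of P(min X_i > x) with the MO survival function evaluated at δ=p. All function names are hypothetical.

```python
import math
import random

def mo_survival(base_sf, x, delta):
    """MO survival function delta*S / (1 - (1 - delta)*S) for a baseline survival S."""
    s = base_sf(x)
    return delta * s / (1.0 - (1.0 - delta) * s)

def geometric(rng, p):
    """Draw N on {1, 2, ...} with P(N = n) = p * (1 - p)**(n - 1)."""
    n = 1
    while rng.random() >= p:
        n += 1
    return n

def empirical_min_survival(x, p, rate=1.0, reps=20000, seed=1):
    """Monte Carlo estimate of P(min(X_1, ..., X_N) > x) for exponential X_i."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        n = geometric(rng, p)
        smallest = min(rng.expovariate(rate) for _ in range(n))
        hits += smallest > x
    return hits / reps

def exp_sf(x):
    """Survival function of the unit-rate exponential baseline."""
    return math.exp(-x)

estimate = empirical_min_survival(0.5, p=0.3)
theory = mo_survival(exp_sf, 0.5, delta=0.3)
```

The two quantities should agree up to Monte Carlo error; replacing `min` by `max` and δ=p by δ=1/p gives the analogous check for V.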
Due to these advantages, many papers have employed the MO transformation. In Marshall and Olkin's work, the exponential and Weibull distributions were generalized. Subsequently, the MO extension was applied to several well-known distributions: Weibull (Ghitany et al. 2005, Zhang and Xie 2007), Pareto (Ghitany 2005), gamma (Ristić et al. 2007), Lomax (Ghitany et al. 2007) and linear failure-rate (Ghitany and Kotz 2007) distributions. More recently, general results have been addressed by Barreto-Souza et al. (2013) and Cordeiro and Lemonte (2013). In this paper, we aim to apply the MO generator to the extended Weibull (EW) class of distributions to obtain a new, more flexible family to describe reliability data. The proposed family can also be applied to other fields including business, environment, informatics and medicine in the same way as it was originally done with the Birnbaum-Saunders and other lifetime distributions.
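Let $\bar G(x)=1-G(x)$ and g(x)=dG(x)/dx be the survival and density functions of a continuous random variable Y with baseline cdf G(x). Then, the MO extended distribution has survival function given by
$$\bar F(x;\delta)=\frac{\delta\,\bar G(x)}{1-\bar\delta\,\bar G(x)},\qquad \delta>0, \tag{2}$$
where $\bar\delta=1-\delta$.
Clearly, δ=1 implies $\bar F(x;\delta)=\bar G(x)$. The family (2) has probability density function (pdf) given by
$$f(x;\delta)=\frac{\delta\,g(x)}{[1-\bar\delta\,\bar G(x)]^{2}}.$$
Its hazard rate function (hrf) becomes
$$\tau(x;\delta)=\frac{g(x)}{\bar G(x)\,[1-\bar\delta\,\bar G(x)]}.$$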
Further, the class of extended Weibull (EW) distributions pioneered by Gurvich et al. (1997) has achieved a prominent position in lifetime models. Its cdf is given by
$$G(x;\alpha,\xi)=1-\exp[-\alpha H(x;\xi)],\qquad x\in D\subseteq\mathbb{R}_+,\ \alpha>0, \tag{3}$$
where H(x;ξ) is a non-negative monotonically increasing function which depends on the parameter vector ξ. The corresponding pdf is given by
$$g(x;\alpha,\xi)=\alpha\,h(x;\xi)\exp[-\alpha H(x;\xi)], \tag{4}$$
where h(x;ξ) is the derivative of H(x;ξ).
Different expressions for H(x;ξ) in Equation (3) define important models such as:
- (i) H(x;ξ)=x gives the exponential distribution;
- (ii) H(x;ξ)=x^2 leads to the Rayleigh (Burr type-X) distribution;
- (iii) H(x;ξ)=log(x/k) leads to the Pareto distribution;
- (iv) H(x;ξ)=β^{−1}[exp(βx)−1] gives the Gompertz distribution.
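The four choices of H(x;ξ) listed above can be plugged into the EW cdf 1−exp[−αH(x;ξ)] as in the following minimal sketch. The dictionary `H_FUNCS`, the function `ew_cdf` and the default values k=1 and β=0.5 are illustrative assumptions, not part of the original text.

```python
import math

# H(x; xi) for cases (i)-(iv); the default k and beta are illustrative values only.
H_FUNCS = {
    "exponential": lambda x: x,
    "rayleigh": lambda x: x ** 2,
    "pareto": lambda x, k=1.0: math.log(x / k),  # requires x >= k
    "gompertz": lambda x, beta=0.5: (math.exp(beta * x) - 1.0) / beta,
}

def ew_cdf(x, alpha, H):
    """EW cdf G(x) = 1 - exp(-alpha * H(x)) for a given function H."""
    return 1.0 - math.exp(-alpha * H(x))

# With H(x) = log(x/k) the EW cdf becomes the Pareto cdf 1 - (k/x)**alpha.
pareto_value = ew_cdf(2.0, 3.0, H_FUNCS["pareto"])
```

Passing H as a plain callable keeps the cdf code independent of the particular special model.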
In this paper, we derive a new family of distributions by compounding the MO and EW classes. We define a new generated family in order to provide a “better fit” in certain practical situations. The compounding procedure follows by taking the EW class (3) as the baseline model in Equation (2). The Marshall-Olkin extended Weibull (MOEW) family of distributions contains special models such as those listed in Table 1 with the corresponding H(·;·) and h(·;·) functions and the parameter vectors.
The paper unfolds as follows. Section 2 presents the cdf and pdf of the proposed distribution and some expansions for the density function. The main statistical properties of the new family are derived in Section 3 including the moments, moment generating function (mgf) and incomplete moments, quantile function (qf), random number generator, skewness and kurtosis measures, order statistics, mean deviations and average lifetime functions. In Section 4, we derive four measures of information theory: Shannon and Rényi entropies, cross entropy and Kullback-Leibler divergence. The maximum likelihood method to estimate the model parameters is adopted in Section 5. Two special models are studied in some detail in Section 6. We perform a simulation study using Monte Carlo experiments in order to assess the accuracy of the maximum likelihood estimators (MLEs) in Section 7.1 and present two applications to real data in Section 7.2. Conclusions and some future lines of research are addressed in Section 8.
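2 The MOEW family
The cdf of the new family of distributions is given by
$$F(x;\delta,\alpha,\xi)=\frac{1-\exp[-\alpha H(x;\xi)]}{1-\bar\delta\exp[-\alpha H(x;\xi)]},\qquad x\in D, \tag{5}$$
where α>0, δ>0 and $\bar\delta=1-\delta$. Using (5), we can express its survival function as
$$\bar F(x;\delta,\alpha,\xi)=\frac{\delta\exp[-\alpha H(x;\xi)]}{1-\bar\delta\exp[-\alpha H(x;\xi)]} \tag{6}$$
and the associated hrf reduces to
$$\tau(x;\delta,\alpha,\xi)=\frac{\alpha\,h(x;\xi)}{1-\bar\delta\exp[-\alpha H(x;\xi)]}. \tag{7}$$
The corresponding pdf is given by
$$f(x;\delta,\alpha,\xi)=\frac{\delta\,\alpha\,h(x;\xi)\exp[-\alpha H(x;\xi)]}{\{1-\bar\delta\exp[-\alpha H(x;\xi)]\}^{2}}, \tag{8}$$
where H(x;ξ) can be any of the special functions listed in Table 1.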
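The four functions of this section can be sketched in a few lines. The code below follows from substituting the EW baseline 1−exp[−αH(x;ξ)] into the MO survival function δ·S/[1−(1−δ)·S]; the function names and the Weibull choice H(x)=x^γ with γ=1.5 are assumptions made only for illustration.

```python
import math

def moew_cdf(x, delta, alpha, H):
    e = math.exp(-alpha * H(x))
    return (1.0 - e) / (1.0 - (1.0 - delta) * e)

def moew_sf(x, delta, alpha, H):
    e = math.exp(-alpha * H(x))
    return delta * e / (1.0 - (1.0 - delta) * e)

def moew_pdf(x, delta, alpha, H, h):
    e = math.exp(-alpha * H(x))
    return delta * alpha * h(x) * e / (1.0 - (1.0 - delta) * e) ** 2

def moew_hrf(x, delta, alpha, H, h):
    e = math.exp(-alpha * H(x))
    return alpha * h(x) / (1.0 - (1.0 - delta) * e)

# Illustrative Weibull-type choice: H(x) = x**GAMMA and its derivative h(x).
GAMMA = 1.5

def H(x):
    return x ** GAMMA

def h(x):
    return GAMMA * x ** (GAMMA - 1.0)
```

A quick sanity check is that the hrf equals pdf/survival and that δ=1 recovers the baseline cdf 1−exp[−αH(x)].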
Hereafter, let X be a random variable having the pdf (8) with parameters δ, α and ξ, say X∼MOEW(δ,α,ξ). Equation (8) extends several distributions which have been studied in the literature.
The Marshall-Olkin Pareto (Ghitany 2005) is obtained by taking H(x;ξ)=log(x/k) (x≥k). Further, for H(x;ξ)=x^γ we obtain the Marshall-Olkin Weibull (Ghitany et al. 2005, Zhang and Xie 2007). The Marshall-Olkin Lomax (Ghitany et al. 2007) and Marshall-Olkin log-logistic distributions are derived from (8) by taking H(x;ξ)=log(1+x^c) with c=1 and H(x;ξ)=log(1+x^c) with α=1, respectively. For H(x;ξ)=ax+bx^2/2 and α=1, Equation (8) reduces to the Marshall-Olkin linear failure rate distribution (Ghitany and Kotz 2007). In the same way, for H(x;ξ)=log(1+x^c), we have the Marshall-Olkin Burr XII distribution (Jayakumar and Mathew 2008). Finally, we obtain the Marshall-Olkin Fréchet distribution (Krishna et al. 2013) from Equation (8) by setting H(x;ξ)=x^{−γ}. Table 1 displays some useful quantities and corresponding parameter vectors for special distributions.
A general approximate goodness-of-fit test for the null hypothesis H0:X1,…,X n with X i following F(x;θ), where the form of F is known but the p-vector θ=(δ,α,ξ)⊤ is unknown, was proposed by Chen and Balakrishnan (1995). This method is based on the Cramér-von Mises (CM) and Anderson-Darling (AD) statistics and, in general, the smaller the values of these statistics, the better the fit. In this paper, such methodology is applied to provide goodness-of-fit tests for the distributions under study.
Some results in the following sections can be obtained numerically in any software such as MAPLE (Garvan 2002), MATLAB (Sigmon and Davis 2002), MATHEMATICA (Wolfram 2003), Ox (Doornik 2007) and R (R Development Core Team 2009). The Ox (for academic purposes) and R are freely available at http://www.doornik.com and http://www.r-project.org, respectively. The results can be computed by taking in the sums a large positive integer value in place of ∞.
2.1 Expansions for the density function
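For any positive real number a, and for |z|<1, we have the generalized binomial expansion
$$(1-z)^{-a}=\sum_{k=0}^{\infty}\frac{(a)_k}{k!}\,z^{k}, \tag{9}$$
where $(a)_k=\Gamma(a+k)/\Gamma(a)=a(a+1)\cdots(a+k-1)$ is the ascending factorial and Γ(·) is the gamma function. Applying (9) to (8), for 0<δ<1, gives
$$f(x;\delta,\alpha,\xi)=\sum_{j=0}^{\infty}\eta_j\,g(x;(j+1)\alpha,\xi), \tag{10}$$
where $\eta_j=\delta\,\bar\delta^{\,j}$, $\bar\delta=1-\delta$, and g(x;(j+1)α,ξ) denotes the EW density function with parameters (j+1)α and ξ. Otherwise, for δ>1, after some algebra, we can express (8) as
$$f(x;\delta,\alpha,\xi)=\frac{\alpha\,h(x;\xi)\exp[-\alpha H(x;\xi)]}{\delta\,\{1-(1-1/\delta)[1-\exp(-\alpha H(x;\xi))]\}^{2}}. \tag{11}$$
In this case, we can verify that |(1−1/δ)[1−exp(−αH(x;ξ))]|<1. Then, applying twice the expansion (9) in Equation (11), we obtain
$$f(x;\delta,\alpha,\xi)=\sum_{j=0}^{\infty}\nu_j\,g(x;(j+1)\alpha,\xi), \tag{12}$$
where
$$\nu_j=\frac{(-1)^{j}}{\delta\,(j+1)}\sum_{k=j}^{\infty}(k+1)\binom{k}{j}\left(1-\frac{1}{\delta}\right)^{k}.$$
We can verify that $\sum_{j=0}^{\infty}\eta_j=1$ for 0<δ<1. Then, the MOEW density function can be expressed as an infinite linear combination of EW densities. Equations (10) and (12) have the same form except for the coefficients η_j in (10) and ν_j in (12). They depend only on the generator parameter δ. For simplicity, we can write
$$f(x;\delta,\alpha,\xi)=\sum_{j=0}^{\infty}w_j\,g(x;(j+1)\alpha,\xi), \tag{13}$$
where
$$w_j=\begin{cases}\eta_j, & 0<\delta<1,\\ \nu_j, & \delta>1,\end{cases}$$
and η_j and ν_j are given by (10) and (12), respectively. Thus, some mathematical properties of (13) can be obtained directly from those of the EW distribution. For example, the ordinary, incomplete, inverse and factorial moments and the mgf of X follow immediately from the corresponding quantities of the EW distribution.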
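The mixture representation can be checked numerically for 0<δ<1. In the sketch below, the weights δ(1−δ)^j are those obtained by applying the binomial expansion with a=2 to the squared denominator of the MOEW density; the truncation at 200 terms, the function names and the Rayleigh-type choice H(x)=x^2 are assumptions made for illustration.

```python
import math

def ew_pdf(x, alpha, H, h):
    """EW density alpha * h(x) * exp(-alpha * H(x))."""
    return alpha * h(x) * math.exp(-alpha * H(x))

def moew_pdf(x, delta, alpha, H, h):
    """Closed-form MOEW density."""
    e = math.exp(-alpha * H(x))
    return delta * alpha * h(x) * e / (1.0 - (1.0 - delta) * e) ** 2

def moew_pdf_series(x, delta, alpha, H, h, terms=200):
    """Truncated mixture sum_j delta*(1-delta)**j * g(x; (j+1)*alpha), valid for 0 < delta < 1."""
    if not 0.0 < delta < 1.0:
        raise ValueError("this sketch only covers 0 < delta < 1")
    return sum(
        delta * (1.0 - delta) ** j * ew_pdf(x, (j + 1) * alpha, H, h)
        for j in range(terms)
    )

def H(x):
    return x ** 2

def h(x):
    return 2.0 * x

closed_form = moew_pdf(1.0, 0.4, 1.2, H, h)
truncated = moew_pdf_series(1.0, 0.4, 1.2, H, h)
```

Because the weights decay geometrically, a moderate number of terms already reproduces the closed form to machine precision in this range of δ.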
3 General properties
3.1 Moments, generating function and incomplete moments
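The nth ordinary moment of X can be obtained from (13) as
$$E(X^{n})=\sum_{j=0}^{\infty}w_j\,E(Y_j^{n}),$$
where the w_j are the coefficients of the linear combination (13) and, from now on, Y_j∼EW((j+1)α,ξ) denotes a random variable having the density function g(y;(j+1)α,ξ).
The mgf and the kth incomplete moment of X follow from (13) as
$$M(t)=\sum_{j=0}^{\infty}w_j\,M_j(t)$$
and
$$T_k(z)=\sum_{j=0}^{\infty}w_j\,T_k^{(j)}(z), \tag{14}$$
where M_j(t) is the mgf of Y_j and $T_k^{(j)}(z)=\int_0^{z}y^{k}\,g(y;(j+1)\alpha,\xi)\,dy$ comes directly from the EW model.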
3.2 Quantile function and random number generator
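The qf of X follows by inverting (5) and it can be expressed in terms of H^{−1}(·) as
$$Q(u)=H^{-1}\!\left(\frac{1}{\alpha}\log\!\left[\frac{1-\bar\delta\,u}{1-u}\right];\xi\right),\qquad 0<u<1, \tag{15}$$
where $\bar\delta=1-\delta$.
In Table 2, we provide the function H^{−1}(x;ξ) for some special models.
Hence, the generator for X can be given by the algorithm:
- (i) generate U∼U(0,1);
- (ii) specify the function H^{−1}(·;ξ) (see Table 2);
- (iii) obtain an outcome of X as X=Q(U).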
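A minimal inverse-transform sketch of this generator is given below. It solves F(x)=u for the MOEW cdf, which gives H(x)=α^{−1}log[(1−(1−δ)u)/(1−u)]; the function names, the seed and the Weibull-type choice H(x)=x^γ with γ=1.5 are illustrative assumptions.

```python
import math
import random

def moew_cdf(x, delta, alpha, H):
    e = math.exp(-alpha * H(x))
    return (1.0 - e) / (1.0 - (1.0 - delta) * e)

def moew_quantile(u, delta, alpha, H_inv):
    """Quantile function obtained by solving F(x) = u for x."""
    return H_inv(math.log((1.0 - (1.0 - delta) * u) / (1.0 - u)) / alpha)

def moew_sample(n, delta, alpha, H_inv, seed=None):
    """Inverse-transform sampling: X = Q(U) with U uniform on [0, 1)."""
    rng = random.Random(seed)
    return [moew_quantile(rng.random(), delta, alpha, H_inv) for _ in range(n)]

GAMMA = 1.5

def H(x):
    return x ** GAMMA

def H_inv(y):
    return y ** (1.0 / GAMMA)

draws = moew_sample(20000, delta=2.0, alpha=1.0, H_inv=H_inv, seed=7)
median = moew_quantile(0.5, 2.0, 1.0, H_inv)
share_below_median = sum(x <= median for x in draws) / len(draws)
```

About half of the simulated values should fall below the theoretical median, which gives a quick check that the sampler and the quantile function are consistent.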
The MOEW distributions can be very useful in modeling lifetime data and practitioners may be interested in fitting one of these models. We provide a script using the R language to generate the density, distribution function, hrf, qf, random numbers, Anderson-Darling test, Cramér-von Mises test and likelihood ratio (LR) tests. This script can be obtained from the authors upon request.
3.3 Mean deviations
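The mean deviations of X about the mean and the median are given by
$$\delta_1(X)=\int_0^{\infty}|x-\mu|\,f(x;\delta,\alpha,\xi)\,dx\quad\text{and}\quad\delta_2(X)=\int_0^{\infty}|x-M|\,f(x;\delta,\alpha,\xi)\,dx,$$
respectively, where μ=E(X) denotes the mean and M=Median(X) the median. The median follows from the nonlinear equation F(M;δ,α,ξ)=1/2. So, these quantities reduce to
$$\delta_1(X)=2\mu\,F(\mu;\delta,\alpha,\xi)-2\,T_1(\mu)\quad\text{and}\quad\delta_2(X)=\mu-2\,T_1(M),$$
where T_1(z) is the first incomplete moment of X obtained from (14) as
$$T_1(z)=\sum_{j=0}^{\infty}w_j\,T_1^{(j)}(z)$$
and $T_1^{(j)}(z)=\int_0^{z}y\,g(y;(j+1)\alpha,\xi)\,dy$ is the first incomplete moment of Y_j.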
An important application of the mean deviations is related to the Bonferroni and Lorenz curves. These curves are useful in economics, reliability, demography, medicine and other fields. For a given probability p, they are defined by B(p)=T1(q)/(p μ) and L(p)=T1(q)/μ, respectively, where q=Q(p) is the qf of X given by (15) at u=p.
3.4 Average lifetime and mean residual lifetime functions
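The average lifetime is given by
$$t_m=\int_0^{\infty}\bar F(x;\delta,\alpha,\xi)\,dx=\sum_{j=0}^{\infty}w_j\int_0^{\infty}\exp[-(j+1)\alpha H(x;\xi)]\,dx,$$
where the w_j are the coefficients of the linear combination (13).
In fields such as actuarial sciences, survival studies and reliability theory, the mean residual lifetime has been of much interest; see, for a survey, Guess and Proschan (1988). Given that there was no failure prior to x_0, the residual life is the period from time x_0 until the time of failure. The mean residual lifetime is given by
$$m(x_0;\delta,\alpha,\xi)=E(X-x_0\mid X\geq x_0)=\frac{1}{\bar F(x_0;\delta,\alpha,\xi)}\sum_{j=0}^{\infty}w_j\int_{x_0}^{\infty}y\,g(y;(j+1)\alpha,\xi)\,dy-x_0.$$
The last integral can be computed from the baseline EW distribution. Further, m(x_0;δ,α,ξ)→E(X) as x_0→0.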
4 Information theory measures
The seminal idea about information theory was pioneered by Hartley (1928), who defined a logarithmic measure of information for communication. Subsequently, Shannon (1948) formalized this idea by defining the entropy and mutual information concepts. The relative entropy notion (which would later be called divergence) was proposed by Kullback and Leibler (1951). The Kullback-Leibler measure can be understood as a comparison criterion between two distributions. In this section, we derive two classes of entropy measures and one class of divergence measures which can be understood as new goodness-of-fit quantities such as those discussed by Seghouane and Amari (2007). All these measures are defined for one element or between two elements in the MOEW family.
4.1 Rényi entropy
The Rényi entropy of X with pdf (8) is given by
$$H_R^{s}(X)=\frac{1}{1-s}\log\left[\int_D f(x;\delta,\alpha,\xi)^{s}\,dx\right],$$
where s∈(0,1)∪(1,∞).
It is difficult to obtain $H_R^{s}(X)$ in closed form for the MOEW family. So, we derive an expansion for this quantity.
By using (9), $f(x;\delta,\alpha,\xi)^{s}$ can be expanded as
where
The proof of this expansion is given in Appendix 8.
Finally, based on Equation (16), the Rényi entropy can be expressed as
An advantage of this expansion is that it depends only on an integral which has a closed form for some EW distributions.
4.2 Shannon entropy
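The Shannon entropy of X is given by
$$H_S(X)=E\{-\log f(X;\delta,\alpha,\xi)\},$$
where the log-likelihood function corresponding to one observation follows from (8) as
$$\log f(x;\delta,\alpha,\xi)=\log(\delta\alpha)+\log h(x;\xi)-\alpha H(x;\xi)-2\log\{1-\bar\delta\exp[-\alpha H(x;\xi)]\},$$
with $\bar\delta=1-\delta$.
Thus, it can be reduced to
$$H_S(X)=-\log(\delta\alpha)-E\{\log h(X;\xi)\}+\alpha\,E\{H(X;\xi)\}+2\,E\{\log[1-\bar\delta\exp(-\alpha H(X;\xi))]\}.$$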
4.3 Cross entropy and Kullback-Leibler divergence and distance
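Let X and Y be two random variables with common support whose densities are f_X(x;θ_1) and f_Y(y;θ_2), respectively. Cover and Thomas (1991) defined the cross entropy as
$$C_X(Y)=E_X\{-\log f_Y(X;\theta_2)\}=-\int f_X(z;\theta_1)\log f_Y(z;\theta_2)\,dz.$$
We consider that X∼MOEW(δ_1,α_1,ξ_1) and Y∼MOEW(δ_2,α_2,ξ_2). After some algebraic manipulations, we obtain
$$C_X(Y)=-\log(\delta_2\alpha_2)-E\{\log h(X;\xi_2)\}+\alpha_2E\{H(X;\xi_2)\}+2E\{\log[1-\bar\delta_2\exp(-\alpha_2H(X;\xi_2))]\}, \tag{17}$$
where $\bar\delta_i=1-\delta_i$ and all expectations are taken with respect to X.
An important measure in information theory is the Kullback-Leibler divergence given by
$$D(X\|Y)=C_X(Y)-H_S(X)=\int f_X(z;\theta_1)\log\frac{f_X(z;\theta_1)}{f_Y(z;\theta_2)}\,dz, \tag{18}$$
where $H_S(X)=E\{-\log f_X(X;\theta_1)\}$ is the Shannon entropy of Section 4.2.
Applying the Shannon entropy of Section 4.2 and (17) in Equation (18) gives
$$D(X\|Y)=\log\frac{\delta_1\alpha_1}{\delta_2\alpha_2}+E\left\{\log\frac{h(X;\xi_1)}{h(X;\xi_2)}\right\}+E\{\alpha_2H(X;\xi_2)-\alpha_1H(X;\xi_1)\}+2E\left\{\log\frac{1-\bar\delta_2\exp[-\alpha_2H(X;\xi_2)]}{1-\bar\delta_1\exp[-\alpha_1H(X;\xi_1)]}\right\}. \tag{19}$$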
According to Cover and Thomas (1991), the Kullback-Leibler measure D(X||Y) quantifies the error incurred by assuming that the Y model is true when the data follow the X distribution. For example, this measure has been proposed as an essential part of test statistics, which have been widely applied in synthetic aperture radar image processing from both univariate (Nascimento et al. 2010) and polarimetric (or multivariate) (Nascimento et al. 2014) perspectives.
In order to work with measures that satisfy the non-negativity, symmetry and definiteness properties, Nascimento et al. (2010) considered the symmetrization of (19),
which is given by
$$d_{KL}(X,Y)=\frac{1}{2}\,[D(X\|Y)+D(Y\|X)]=\frac{1}{2}\int[f_X(z;\theta_1)-f_Y(z;\theta_2)]\log\frac{f_X(z;\theta_1)}{f_Y(z;\theta_2)}\,dz. \tag{20}$$
Although this measure does not satisfy the triangle inequality, it is usually called the Kullback-Leibler distance (Jeffreys divergence). The new measure can be used to answer questions like “how could one quantify the difference in selecting the Phani model with three parameters as the baseline distribution instead of the Weibull Kies distribution which has four parameters?”.
As an illustration for (20), we initially consider two distinct elements of the special model generated from the specifications H(x;β)=β^{−1}[exp(βx)−1] and h(x;β)=exp(βx) in (8). This model will be presented in more detail in later sections and its parametric space is represented by the vector (δ,α,β). Suppose that we are interested in quantifying the influence of a contamination ε in the parameter α on the distance between two distinct elements, (2,1,3) and (2,1+ε,3), of this parametric space. Figure 1(a) displays the integrand of (20) for ε=0.1, 1, 2 and 4, for which the distances (or areas) associated with d_KL(X,Y) are 6.50×10^{−3}, 3.56×10^{−1}, 9.46×10^{−1} and 2.25, respectively. It is notable that d_KL(X,Y) takes smaller values for closer points (or, equivalently, for closer fits) and, therefore, (20) provides new goodness-of-fit measures. In Figures 1(b) and 1(c), we show the influence of η=α/β on d_KL([δ,α,β],[δ,α,β+ε]) (for β=δ=3 and α∈{1,3,9}) and of δ on d_KL([δ,α,β],[δ+ε,α,β]) (for β=α=3 and δ∈{3,4,5}). For all cases, the contamination ε takes values in the interval (−2.9,2.9).
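The distance used in this illustration can be approximated by numerical integration. The sketch below assumes the symmetrized form d_KL(X,Y)=½∫(f_X−f_Y)log(f_X/f_Y)dx and the density obtained from (8) with H(x;β)=β^{−1}[exp(βx)−1]; the midpoint rule, the truncation of the integration range at x=5 and all function names are illustrative choices, so the values it returns are approximations rather than the figures reported above.

```python
import math

def mog_logpdf(x, delta, alpha, beta):
    """Log-density of the model with H(x) = (exp(beta*x) - 1)/beta and h(x) = exp(beta*x)."""
    H = (math.exp(beta * x) - 1.0) / beta
    e = math.exp(-alpha * H)
    return (math.log(delta * alpha) + beta * x - alpha * H
            - 2.0 * math.log(1.0 - (1.0 - delta) * e))

def kl_distance(theta1, theta2, upper=5.0, steps=20000):
    """Midpoint-rule approximation of 0.5 * int (f1 - f2) * log(f1/f2) dx on (0, upper)."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * width
        l1 = mog_logpdf(x, *theta1)
        l2 = mog_logpdf(x, *theta2)
        total += (math.exp(l1) - math.exp(l2)) * (l1 - l2)
    return 0.5 * total * width

base = (2.0, 1.0, 3.0)  # (delta, alpha, beta)
distances = [kl_distance(base, (2.0, 1.0 + eps, 3.0)) for eps in (0.1, 1.0, 2.0)]
```

Working with log-densities avoids dividing two densities that both underflow to zero in the far right tail.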
5 Estimation
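Here, we present a general procedure for estimating the MOEW parameters from one observed sample and from multi-censored data. Additionally, we discuss how one can test the significance of the additional parameter of the proposed class. Let x_1,…,x_n be a sample of size n from X∼MOEW(δ,α,ξ). The log-likelihood function for the vector of parameters $\theta=(\delta,\alpha,\xi^\top)^\top$ can be expressed as
$$\ell(\theta)=n\log(\delta\alpha)+\sum_{i=1}^{n}\log h(x_i;\xi)-\alpha\sum_{i=1}^{n}H(x_i;\xi)-2\sum_{i=1}^{n}\log(1-\bar\delta\,e_i),$$
where $\bar\delta=1-\delta$ and $e_i=\exp[-\alpha H(x_i;\xi)]$.
From the above log-likelihood, the components of the score vector, $U(\theta)=(U_\delta,U_\alpha,U_\xi^\top)^\top$, are given by
$$U_\delta=\frac{n}{\delta}-2\sum_{i=1}^{n}\frac{e_i}{1-\bar\delta\,e_i},\qquad U_\alpha=\frac{n}{\alpha}-\sum_{i=1}^{n}H(x_i;\xi)-2\bar\delta\sum_{i=1}^{n}\frac{H(x_i;\xi)\,e_i}{1-\bar\delta\,e_i},$$
$$U_{\xi_k}=\sum_{i=1}^{n}\frac{\partial h(x_i;\xi)/\partial\xi_k}{h(x_i;\xi)}-\alpha\sum_{i=1}^{n}\frac{\partial H(x_i;\xi)}{\partial\xi_k}-2\alpha\bar\delta\sum_{i=1}^{n}\frac{[\partial H(x_i;\xi)/\partial\xi_k]\,e_i}{1-\bar\delta\,e_i},$$
where ξ_k denotes the kth component of ξ.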
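The log-likelihood and the score components with respect to δ and α can be sketched as follows. The expressions in the code are obtained by differentiating the log of the MOEW density; the function names, the toy data and the fixed choice H(x)=x^1.5 (so that ξ is treated as known) are assumptions made only for this example.

```python
import math

def moew_loglik(data, delta, alpha, H, h):
    """Sum of log f(x_i) for the MOEW density with H and h supplied as callables."""
    ll = len(data) * math.log(delta * alpha)
    for x in data:
        e = math.exp(-alpha * H(x))
        ll += math.log(h(x)) - alpha * H(x) - 2.0 * math.log(1.0 - (1.0 - delta) * e)
    return ll

def moew_score(data, delta, alpha, H):
    """Partial derivatives of the log-likelihood with respect to delta and alpha."""
    n = len(data)
    u_delta = n / delta
    u_alpha = n / alpha
    for x in data:
        e = math.exp(-alpha * H(x))
        denom = 1.0 - (1.0 - delta) * e
        u_delta -= 2.0 * e / denom
        u_alpha -= H(x) + 2.0 * (1.0 - delta) * H(x) * e / denom
    return u_delta, u_alpha

def H(x):
    return x ** 1.5

def h(x):
    return 1.5 * x ** 0.5

sample = [0.4, 0.9, 1.3, 2.1, 0.7]  # toy data, not from the paper
u_delta, u_alpha = moew_score(sample, 2.0, 1.2, H)
```

Comparing these analytic derivatives with central finite differences of `moew_loglik` is a cheap way to catch algebra mistakes before passing them to an optimizer.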
Finally, the partitioned observed information matrix for the family is
whose elements are
When some standard regularity conditions are satisfied (Cox and Hinkley 1974), one can verify that $\sqrt{n}\,(\hat\theta-\theta)$ converges in distribution to the multivariate $N_{p+2}(0,K(\theta)^{-1})$ distribution, where $\hat\theta$ is the MLE of θ, p denotes the dimension of ξ and K(θ) is the expected information matrix for which the limit identity $\lim_{n\to\infty}n^{-1}J_n(\theta)=K(\theta)$ is satisfied, with J_n(θ) denoting the observed information matrix. Based on this result, one can compute confidence regions for the MOEW parameters. Such regions can be used as decision criteria in several practical situations.
For checking if δ is statistically different from one, i.e. for testing the null hypothesis H_0: δ=1 against H_1: δ≠1, we use the LR statistic given by $LR=2\{\ell(\hat\theta)-\ell(\tilde\theta)\}$, where $\hat\theta$ is the vector of unrestricted MLEs under H_1 and $\tilde\theta$ is the vector of restricted MLEs under H_0. Under the null hypothesis, the limiting distribution of LR is a $\chi^2_1$ distribution. If the test statistic exceeds the upper 100(1−a)% quantile of the $\chi^2_1$ distribution, where a denotes the significance level, then we reject the null hypothesis.
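Given the two maximized log-likelihoods, the LR test can be computed with the standard library alone. The sketch below uses the identity P(χ²₁>x)=erfc(√(x/2)) for the chi-square distribution with one degree of freedom; the function name and the numerical inputs are illustrative and do not come from the paper's tables.

```python
import math

def lr_test(loglik_unrestricted, loglik_restricted):
    """Return the LR statistic and its chi-square(1) p-value for H0: delta = 1."""
    lr = 2.0 * (loglik_unrestricted - loglik_restricted)
    lr = max(lr, 0.0)  # guard against tiny negative values from numerical optimization
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

# Illustrative values only: a difference of about 1.92 in log-likelihood
# corresponds to the usual 5% critical value 3.84.
statistic, p_value = lr_test(-100.0, -101.9207)
```

The null hypothesis δ=1 is rejected at level a whenever the returned p-value is below a.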
Censored data occur very frequently in lifetime data analysis. Some mechanisms of censoring are identified in the literature as, for example, types I and II censoring (Lawless 2003). Here, we consider the general case of multi-censored data: there are n=n_0+n_1+n_2 subjects of which n_0 are known to have failed at the times t_1,…,t_{n_0}, n_1 are known to have failed in the intervals [s_{i−1},s_i], i=1,…,n_1, and n_2 survived to a time r_i, i=1,…,n_2, but were not observed any longer. Note that type I censoring and type II censoring are contained as particular cases of multi-censoring. The log-likelihood function of $\theta=(\delta,\alpha,\xi^\top)^\top$ for these multi-censored data reduces to
$$\ell(\theta)=\sum_{i=1}^{n_0}\log f(t_i;\theta)+\sum_{i=1}^{n_1}\log[F(s_i;\theta)-F(s_{i-1};\theta)]+\sum_{i=1}^{n_2}\log[1-F(r_i;\theta)], \tag{21}$$
where f and F are the MOEW pdf (8) and cdf (5), respectively.
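The three contributions to the multi-censored log-likelihood (exact failures, interval-censored failures and right-censored survivors) can be sketched generically. The function below accepts any pdf and cdf as callables; its name, the argument layout and the exponential example are assumptions for illustration, and a MOEW pdf and cdf could be passed in the same way.

```python
import math

def censored_loglik(exact, intervals, right, pdf, cdf):
    """Log-likelihood with exact times, (lower, upper) intervals and right-censoring times."""
    ll = sum(math.log(pdf(t)) for t in exact)
    ll += sum(math.log(cdf(upper) - cdf(lower)) for lower, upper in intervals)
    ll += sum(math.log(1.0 - cdf(r)) for r in right)
    return ll

# Unit-rate exponential used only to keep the example easy to verify by hand.
def exp_pdf(x):
    return math.exp(-x)

def exp_cdf(x):
    return 1.0 - math.exp(-x)

value = censored_loglik([1.0], [(0.5, 1.5)], [2.0], exp_pdf, exp_cdf)
```

With empty `intervals` and `right` lists the function reduces to the usual uncensored log-likelihood.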
The score functions and the observed information matrix corresponding to (21) are too complicated to be presented here.
6 Two special models
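In this section, we study two special models, namely the Marshall-Olkin modified Weibull (MOMW) and Marshall-Olkin Gompertz (MOG) distributions. We provide plots of the density and hazard rate functions for some parameters to illustrate the flexibility of these distributions.
6.1 The MOMW model
For H(x;λ,γ)=x^γ exp(λx) and h(x;λ,γ)=x^{γ−1}exp(λx)(γ+λx), we obtain the MOMW distribution. Its density function is given by
$$f(x;\delta,\alpha,\lambda,\gamma)=\frac{\delta\,\alpha\,x^{\gamma-1}(\gamma+\lambda x)\exp[\lambda x-\alpha x^{\gamma}e^{\lambda x}]}{\{1-\bar\delta\exp[-\alpha x^{\gamma}e^{\lambda x}]\}^{2}},\qquad x>0,$$
where λ,γ≥0 and $\bar\delta=1-\delta$. If δ=1, it leads to the special case of the modified Weibull (MW) distribution (Lai et al. 2003). In addition, when λ=0, it gives the Marshall-Olkin Weibull distribution. Its cdf and hrf are given by
$$F(x;\delta,\alpha,\lambda,\gamma)=\frac{1-\exp[-\alpha x^{\gamma}e^{\lambda x}]}{1-\bar\delta\exp[-\alpha x^{\gamma}e^{\lambda x}]}$$
and
$$\tau(x;\delta,\alpha,\lambda,\gamma)=\frac{\alpha\,x^{\gamma-1}(\gamma+\lambda x)\,e^{\lambda x}}{1-\bar\delta\exp[-\alpha x^{\gamma}e^{\lambda x}]},$$
respectively. In Figures 2(a), 2(b), 2(c) and 2(d), we note some different shapes of the MOMW pdf. Further, Figures 3(a), 3(b), 3(c) and 3(d) display plots of the MOMW hrf, which can have increasing, decreasing, non-monotone and bathtub forms.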
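The rth raw moment of the MOMW distribution comes from (13) as
$$\mu'_r=\sum_{j=0}^{\infty}w_j\,\mu_r^{(j)}, \tag{22}$$
where the w_j are the coefficients of the linear combination (13) and $\mu_r^{(j)}$ denotes the rth raw moment of the MW distribution with parameters (j+1)α, γ and λ. Carrasco et al. (2008) determined an infinite representation for $\mu_r^{(j)}$ given by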
where
and
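Hence, the MOMW moments can be obtained directly from (22) and (23).
Let x_1,…,x_n be a sample of size n from the MOMW distribution. The log-likelihood function for the vector of parameters $\theta=(\alpha,\delta,\lambda,\gamma)^\top$ can be expressed as
$$\ell(\theta)=n\log(\delta\alpha)+(\gamma-1)\sum_{i=1}^{n}\log x_i+\sum_{i=1}^{n}\log(\gamma+\lambda x_i)+\lambda\sum_{i=1}^{n}x_i-\alpha\sum_{i=1}^{n}x_i^{\gamma}e^{\lambda x_i}-2\sum_{i=1}^{n}\log\{1-\bar\delta\exp[-\alpha x_i^{\gamma}e^{\lambda x_i}]\},$$
where $\bar\delta=1-\delta$.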
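6.2 The MOG model
For H(x;β)=β^{−1}[exp(βx)−1] and h(x;β)=exp(βx), we obtain the MOG distribution. Its pdf is given by
$$f(x;\delta,\alpha,\beta)=\frac{\delta\,\alpha\exp\{\beta x-(\alpha/\beta)[\exp(\beta x)-1]\}}{\{1-\bar\delta\exp[-(\alpha/\beta)(\exp(\beta x)-1)]\}^{2}},\qquad x>0,$$
where −∞<β<∞ and $\bar\delta=1-\delta$. For δ=1, we obtain the Gompertz distribution as a special case. The MOG model is a special case of the Marshall-Olkin Makeham distribution (EL-Bassiouny and Abdo 2009). The cdf and hrf of the MOG distribution are given by
$$F(x;\delta,\alpha,\beta)=\frac{1-\exp[-(\alpha/\beta)(\exp(\beta x)-1)]}{1-\bar\delta\exp[-(\alpha/\beta)(\exp(\beta x)-1)]}$$
and
$$\tau(x;\delta,\alpha,\beta)=\frac{\alpha\exp(\beta x)}{1-\bar\delta\exp[-(\alpha/\beta)(\exp(\beta x)-1)]}.$$
Figures 4(a), 4(b) and 4(c) display some plots of the density functions for some values of α, δ and β. The hrf of the Gompertz distribution is increasing (β>0) or decreasing (β<0). Besides these two forms, Figures 5(a), 5(b) and 5(c) indicate that the MOG hrf can be bathtub shaped.
From Equation (15), the MOG qf becomes
$$Q(u)=\frac{1}{\beta}\log\left\{1+\frac{\beta}{\alpha}\log\left[\frac{1-\bar\delta\,u}{1-u}\right]\right\},\qquad 0<u<1.$$
Let x_1,…,x_n be a sample of size n from the MOG model. The log-likelihood function for the vector of parameters $\theta=(\delta,\alpha,\beta)^\top$ can be expressed as
$$\ell(\theta)=n\log(\delta\alpha)+\beta\sum_{i=1}^{n}x_i-\frac{\alpha}{\beta}\sum_{i=1}^{n}[\exp(\beta x_i)-1]-2\sum_{i=1}^{n}\log\{1-\bar\delta\exp[-(\alpha/\beta)(\exp(\beta x_i)-1)]\}.$$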
7 Simulation and applications
This section is divided into two parts. First, we perform a simulation study in order to assess the performance of the MLEs at some points of the parametric space of one of the special models. Second, applications to real data provide evidence in favor of distributions in the MOEW class.
7.1 Simulation study
We present a simulation study by means of Monte Carlo experiments in order to assess the performance of the MLEs described in Section 5. To that end, we work with the MOG distribution. One of the advantages of this model is that its cdf has a tractable analytical form. This fact yields a simple random number generation (RNG) scheme determined by the MOG qf given in Section 6.2. The MOG generator is illustrated in Figure 6.
The simulation study is conducted in order to quantify the influence of η=α/β on the estimation of the extra parameter δ. It is known that η>1 gives a Gompertz distribution whose mode is at zero, whereas, for η<1, the mode is at x^∗=−β^{−1}log(η), the point at which the derivative β−α exp(βx) of the Gompertz log-density vanishes. An initial discussion using the Kullback-Leibler distance derived in Section 4.3 points out that increasing the contamination (or the bias of the estimates) can affect the quality of fit.
In this study, the following scenarios are taken into account. For the sample size n=50,100,150,200, we adopt as the true parameters the following cases:
-
Scenario η<1: (α,β)=(1,2) and δ∈{0.3,1,4};
-
Scenario η=1: (α,β)=(2,2) and δ∈{0.3,1,4};
-
Scenario η>1: (α,β)=(4,2) and δ∈{0.3,1,4}.
Also, we use 10,000 Monte Carlo replications and, for each scenario, we quantify (i) the average of the MLEs and (ii) the mean square errors (MSEs).
Table 3 gives the results of the simulation study. In general, the MLEs present smaller values of the biases and MSEs when the sample size increases. It is important to highlight the following atypical case: for the MLEs of α at the scenarios (α,δ,β)∈{(1,4,2),(2,1,2),(4,0.3,2),(4,1,2)} and of δ at (4,0.3,2), the associated biases do not decrease monotonically with the sample size, as would be expected. However, based on the fact that their MSEs tend to zero, we can expect that there exists a sample size n_0 such that the biases of the MLEs decrease when the sample size increases beyond n_0.
The results provide evidence that the scenarios under the condition η>1 yield a hard estimation (having larger variation ranges of the MSEs than those obtained for the cases when η<1) for the α and β parameters, and that the MLEs present smaller values of the MSEs under such conditions. Figure 7 illustrates the above behavior for the cases δ∈{0.3,0.8,1,2,4} and n=200. In summary, the scenario with fewer numerical problems is (η,δ)=(2,0.1), whereas the one which requires more attention for estimating the parameters is (η,δ)=(0.5,4).
7.2 Applications
Here, the usefulness of the MOEW distribution is illustrated by means of two real data sets.
7.2.1 Uncensored data
Here, we compare the fits of some special models of the family using a real data set. The estimation of the model parameters is performed by the maximum likelihood method discussed in Section 5. We use the maxLik function of the maxLik package in R language. In this function, if the argument “method” is not specified, a suitable method is selected automatically. For this application, we use the Newton-Raphson method. The data represent the percentage of body fat determined by underwater weighing for 250 men. For more details about the data see http://lib.stat.cmu.edu/datasets/bodyfat.
Table 4 provides some descriptive measures. They suggest an empirical distribution which is slightly asymmetric and platykurtic.
We compare the classical models and generalized models within the family. The null hypothesis H0:δ=1 is tested against H1:δ≠1 using the LR statistic. The comparisons are presented in Table 5. For the and models, one cannot say that the parameter δ is statistically different from one at the 10% significance level. Based on this result, we fit the , exponential power (), and Marshall-Olkin flexible Weibull extension () models to the current data (see Table 1). These models are compared with two other three-parameter models, namely: the modified Weibull () and generalized Birnbaum-Saunders () (Owen 2006) distributions. The density is given by
In Table 6, we present the MLEs (standard errors in parentheses) of the parameters of the fitted , , , , and distributions. Also, we provide the goodness-of-fit measures (p-values in parentheses). Thus, these values indicate that the null models are strongly rejected for the and distributions, since the associated p-values are much lower than 0.001.
Table 7 gives the values of the Akaike information criterion (AIC), Bayesian information criterion (BIC), consistent Akaike information criterion (CAIC) and Hannan-Quinn information criterion (HQIC). The values of the AIC, CAIC and HQIC are smaller for the distribution than those of the other fitted models. Thus, this new distribution seems to be a very competitive model to explain the current data.
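For reference, the four criteria can be computed from a maximized log-likelihood, the number of parameters k and the sample size n as sketched below. The AIC, BIC and HQIC follow their usual definitions; for the CAIC this sketch assumes Bozdogan's consistent form −2ℓ+k(log n+1), which may differ from the variant used to produce Table 7. The function name and the input values are illustrative only.

```python
import math

def information_criteria(loglik, k, n):
    """AIC, BIC, CAIC (Bozdogan's form, an assumption here) and HQIC."""
    deviance = -2.0 * loglik
    return {
        "AIC": deviance + 2.0 * k,
        "BIC": deviance + k * math.log(n),
        "CAIC": deviance + k * (math.log(n) + 1.0),
        "HQIC": deviance + 2.0 * k * math.log(math.log(n)),
    }

# Illustrative inputs: log-likelihood -100 for a three-parameter model and n = 250.
criteria = information_criteria(-100.0, k=3, n=250)
```

Smaller values indicate a better trade-off between fit and number of parameters under each criterion.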
Figures 8(a) and 8(b) display the estimated density and survival functions of the distribution. The plots confirm the excellent fit of this distribution to the data. Figure 8(c) shows that the estimated hrf is an increasing curve.
7.2.2 Censored data
Now, we consider a set of remission times from 137 cancer patients [Lee and Wang (2003), pag. 231]. Lee and Wang (2003) showed that the log-logistic () model provides a good fit to the data. Ghitany et al. (2005) compared the fits of the and models to these data. Now, we present a more detailed study by comparing the fitted , , , , Marshall-Olkin log-logistic (), and models to these data. The functions H(x;γ,c)= log(1+γ xc) and h(x;γ,c)=γ c xc−1/(1+γ xc) are associated with the model.
The hypothesis that the underlying distribution is (or ) versus the alternative hypothesis that the distribution is the (or ) is rejected with p-value = 0.0055 (or p-value = <0.0001). Further, the hypothesis test that the underlying distribution is versus the distribution yields the p-value =1.0000. Thus, we compare the , , and models to determine which model gives the best fit to the current data.
Table 8 lists the MLEs (and corresponding standard errors in parentheses) of the parameters and the values of the AD and CM statistics (their p-values in parentheses). The figures in this table, specially the p-values, suggest that the distribution yields a better fit to these data than the other three distributions.
Table 9 lists the values of the AIC, BIC, CAIC and HQIC statistics. The figures in this table indicate that there is a competitiveness among the , and models. However, if we observe the Figures 9(a), 9(b) and 9(c), we note that the and models present better fits to the current data.
Figure 9(d) really shows that the and distributions present good fits to the current data. We can conclude that the and distributions are excellent alternatives to explain this data set.
8 Conclusion
In this paper, the Marshall-Olkin extended Weibull family of distributions is proposed and some of its mathematical properties are studied. The maximum likelihood procedure is used for estimating the model parameters. Two special models in the family are described in some detail. In order to assess the performance of the maximum likelihood estimates, a simulation study is performed by means of Monte Carlo experiments. Special models of the proposed family are compared (through goodness-of-fit measures) with other well-known lifetime models by means of two real data sets. The proposed model outperforms classical lifetime models for these data.
Appendix: An expansion for f(x;δ,α,ξ)F(x;δ,α,ξ)c
Here, we obtain an expansion for the quantity f(x;δ,α,ξ)F(x;δ,α,ξ)c. First, we consider an expansion for F(x;δ,α,ξ)c. Based on (5), the power of the cdf can be expressed as
Applying expansion (9), we have
Now, we expand the quantity B. Equation (9) under the restriction δ<1 (implying that ) yields
Moreover, it is clear that δ=1 implies B=1. Finally, for δ>1 (i.e., ), the quantity B can be rewritten as
Using the binomial expansion, we have
Thus,
Hence, based on Equation (13), the following expansion holds
References
Bain LJ: Analysis for the linear failure-rate life-testing distribution. Technometrics 16: 551–559. (1974)
Barreto-Souza W, Lemonte AJ, Cordeiro GM: General results for the Marshall and Olkin’s family of distributions. An. Acad. Bras. Cienc 85: 3–21. (2013)
Bebbington M, Lai CD, Zitikis R: A flexible Weibull extension. Reliability Eng. Syst. Saf 92: 719–726. (2007)
Carrasco JMF, Ortega EMM, Cordeiro GM: A generalized modified Weibull distribution for lifetime modeling. Comput. Stat. Data Anal 53: 450–462. (2008)
Chen Z: A new two-parameter lifetime distribution with bathtub shape or increasing failure rate function. Stat. Probability Lett 49: 155–161. (2000)
Chen G, Balakrishnan N: A general purpose approximate goodness-of-fit test. J. Qual. Technol 27: 154–161. (1995)
Cordeiro GM, Lemonte AJ: On the Marshall-Olkin extended Weibull distribution. Stat. Paper 54: 333–353. (2013)
Cover TM, Thomas JA: Elements of Information Theory. John Wiley & Sons, New York; (1991)
Cox DR, Hinkley DV: Theoretical Statistics. Chapman and Hall, London; (1974)
Doornik J: Ox 5: object-oriented matrix programming language. Timberlake Consultants, London; (2007)
EL-Bassiouny AH, Abdo NF: Reliability properties of extended makeham distributions. Comput. Methods Sci. Technol 15: 143–149. (2009)
Fisk PR: The graduation of income distributions. Econometrica 29: 171–185. (1961)
Fréchet M: Sur la loi de probabilité de l’écart maximum. Ann. Soc. Polon. Math 6: 93–93. (1927)
Garvan F: The Maple Book. Chapman and Hall/CRC, London; (2002)
Gompertz B: On the nature of the function expressive of the law of human mortality and on the new model of determining the value of life contingencies. Philos. Trans. R. Soc. Lond 115: 513–585. (1825)
Guess F, Proschan F: Mean residual life: Theory and applications. In Handbook of Statistics, vol. 7. Edited by: Krishnaiah PR, Rao CR. Elsevier; http://dx.doi.org/10.1016/S0169-7161(88)07014-2 (1988)
Ghitany ME: Marshall-Olkin extended Pareto distribution and its application. Int. J. Appl. Math 18: 17–31. (2005)
Ghitany ME, Kotz S: Reliability properties of extended linear failure-rate distributions. Probability Eng. Informational Sci 21: 441–450. (2007)
Ghitany ME, AL-Hussaini EK, AL-Jarallah: Marshall-Olkin extended Weibull distribution and its application to censored data. J. Appl. Stat 32: 1025–1034. (2005)
Ghitany ME, AL-Awadhi FA, Alkhalfan LA: Marshall-Olkin extended Lomax distribution and its applications to censored data. Comm. Stat. Theor. Meth 36: 1855–1866. (2007)
Gurvich M, DiBenedetto A, Ranade S: A new statistical distribution for characterizing the random strength of brittle materials. J. Mater. Sci 32: 2559–2564. (1997)
Hartley RVL: Transmission of information. Bell Syst. Techn. J 7: 535–563. (1928)
Jayakumar K, Mathew T: On a generalization to Marshall-Olkin scheme and its application to Burr type XII distribution. Stat. Paper 49: 421–439. (2008)
Johnson NL, Kotz S, Balakrishnan N: Continuous Univariate Distributions. Wiley, New York; (1994)
Kies JA: The Strength of Glass, NRL Report 5093. Naval Research Lab., Washington, DC (1958)
Krishna E, Jose KK, Ristić M: Applications of Marshall-Olkin Fréchet distribution. Comm. Stat. Simulat. Comput 42: 76–89. (2013)
Kullback S, Leibler RA: On information and sufficiency. Ann. Math. Stat 22: 79–86. (1951)
Lai CD, Xie M, Murthy DNP: A modified Weibull distribution. IEEE Trans. Reliab 52: 33–37. (2003)
Lawless JF: Statistical Models and Methods for Lifetime Data. Wiley, New York; (2003)
Lee ET, Wang JW: Statistical Methods for Survival Data Analysis. Wiley, New York; (2003)
Lomax KS: Business failures; another example of the analysis of failure data. J. Am. Stat. Assoc 49: 847–852. (1954)
Marshall A, Olkin I: A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 84: 641–652. (1997)
McCool JI: Using the Weibull Distribution: Reliability, Modeling and Inference. John Wiley & Sons, New Jersey; (2012)
Nadarajah S, Kotz S: On some recent modifications of Weibull distribution. IEEE Trans. Reliab 54: 561–562. (2005)
Nascimento ADC, Cintra RJ, Frery AC: Hypothesis testing in speckled data with stochastic distances. IEEE Trans. Geosci. Remote Sensing 48: 373–385. (2010)
Nascimento ADC, Horta MM, Frery AC, Cintra RJ: Comparing edge detection methods based on stochastic entropies and distances for PolSAR imagery. IEEE J. Selected Topics Appl. Earth Observations Remote Sensing 7: 648–663. (2014)
Nikulin M, Haghighi F: A chi-squared test for the generalized power Weibull family for the head-and-neck cancer censored data. J. Math. Sci 133: 1333–1341. (2006)
Owen WJ: A new three-parameter extension to the Birnbaum-Saunders distribution. IEEE Trans. Reliab 55: 475–479. (2006)
Pham H: A vtub-shaped hazard rate function with applications to system safety. Int. J. Reliab. Appl 3: 1–16. (2002)
Phani KK: A new modified Weibull distribution function. Commun. Am. Ceramic Soc 70: 182–184. (1987)
R Development Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna; (2009)
Rayleigh JWS: On the resultant of a large number of vibrations of the same pitch and of arbitrary phase. Phil. Mag 10: 73–78. (1880)
Ristić MM, Jose KK, Ancy J: A Marshall-Olkin gamma distribution and minification process. STARS: Stress Anxiety Res. Soc 11: 107–117. (2007)
Rodriguez N: A guide to the Burr type XII distributions. Biometrika 64: 129–134. (1977)
Sankaran PG, Jayakumar K: On proportional odds model. Stat. Paper 49: 779–789. (2008)
Seghouane A-K, Amari S-I: The AIC criterion and symmetrizing the Kullback–Leibler divergence. IEEE Trans. Neural Netw 18: 97–106. (2007)
Shannon CE: A mathematical theory of communication. Bell Syst. Techn. J 27: 379–423. (1948)
Sigmon K, Davis TA: MATLAB Primer. Chapman and Hall/CRC, London; (2002)
Smith RM, Bain LJ: An exponential power life testing distribution. Comm. Stat. Theor. Meth 4: 469–481. (1975)
Wolfram S: The Mathematica Book. Wolfram Media, Cambridge; (2003)
Xie M, Lai D: Reliability analysis using additive Weibull model with bathtub-shaped failure rate function. Reliab. Eng. Syst. Saf 52: 87–93. (1995)
Xie M, Tang Y, Goh TN: A modified Weibull extension with bathtub-shaped failure rate function. Reliab. Eng. Syst. Saf 76: 279–285. (2002)
Zhang T, Xie M: Failure data analysis with extended Weibull distribution. Comm. Stat. Simulat. Comput 36: 579–592. (2007)
Acknowledgements
The authors gratefully acknowledge financial support from CAPES and CNPq. The authors are also grateful to three referees and an associate editor for helpful comments and suggestions.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
The authors MS-N, MB, LMZ, ADCN and GMC proposed a new class of models named the Marshall-Olkin extended Weibull distributions and investigated some of its structural properties including ordinary and incomplete moments, generating and quantile functions, mean deviations, information theory measures and some types of entropies. Two special models were discussed and the estimation of the family model parameters was performed by maximum likelihood. They provided a simulation study and two applications to real data. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Santos-Neto, M., Bourguignon, M., Zea, L.M. et al. The Marshall-Olkin extended Weibull family of distributions. J Stat Distrib App 1, 9 (2014). https://doi.org/10.1186/2195-5832-1-9
DOI: https://doi.org/10.1186/2195-5832-1-9