A useful extension of the Burr III distribution
 Gauss M. Cordeiro^{1},
 Antonio E. Gomes^{2},
 Cibele Q. da-Silva^{2} and
 Edwin M. M. Ortega^{3}
https://doi.org/10.1186/s404880170079y
© The Author(s) 2017
Received: 18 November 2016
Accepted: 13 October 2017
Published: 1 November 2017
Abstract
For any continuous baseline G distribution, Zografos and Balakrishnan (Statistical Methodology 6:344–362, 2009) introduced the gamma-generated family of distributions with an extra shape parameter. Based on this family, we define a new four-parameter extension of the Burr III distribution. Its hazard rate function can be decreasing, unimodal or decreasing-increasing-decreasing. We provide a comprehensive account of some of its structural properties. We propose a new log-gamma Burr III regression model, which is a feasible alternative for modeling the four existing types of failure rates. Two applications to real data sets and a simulation study illustrate the performance of the new models.
Introduction
Adding new shape parameters to expand a parent distribution plays a fundamental role in generating a larger family with a wide range of skewness and light or heavy tails. Several mathematical properties of the extended family may be easily explored using linear combinations of exponentiated-G ("exp-G" for short) distributions. Further, this induction of parameters has proved useful in investigating tail properties and in improving the goodness-of-fit of the generator family. Well-known generators include beta-G by Eugene et al. (2002), Kumaraswamy-G (Kw-G) by Cordeiro and de Castro (2011), McDonald-G (Mc-G) by Alexander et al. (2012) and gamma-G by Zografos and Balakrishnan (2009), among others. Recently, several distribution generators have been proposed: for example, Alzaatreh et al. (2013) proposed the T-X family of distributions, Cordeiro et al. (2014) introduced the Lomax generator, Cordeiro et al. (2015) defined a new generalized Weibull family, Tahir et al. (2015) studied the odd generalized exponential family, Nofal et al. (2016) proposed the generalized transmuted-G family and Cordeiro et al. (2017) investigated the generalized odd log-logistic family.
For any baseline cumulative distribution function (cdf) G(x), the gamma-G family of Zografos and Balakrishnan (2009) has probability density function (pdf) and cdf given by
\[f(x)=\frac{g(x)}{\Gamma(c)}\,\{-\log[1-G(x)]\}^{c-1} \qquad (1)\]
and
\[F(x)=\frac{\gamma(c,-\log[1-G(x)])}{\Gamma(c)}, \qquad (2)\]
respectively, where g(x)=d G(x)/d x, \(\Gamma (c)=\int _{0}^{\infty } t^{c-1}\,{\mathrm {e}}^{-t}dt\) denotes the gamma function, and \(\gamma (c,z)=\int _{0}^{z} t^{c-1}\,{\mathrm {e}}^{-t}dt\) denotes the incomplete gamma function. The gamma-G family has the same parameters as the G distribution plus an extra shape parameter c. The additional parameter is the price to pay for adding more flexibility to the generated model compared to G. For c=1, Eq. (1) reduces to the baseline density function g(x), which is a positive point. The parameter c can provide greater flexibility in the form of the generated distribution and, consequently, it can be a very useful model for fitting positive data.
For a random variable X with pdf (1), we have \(X\overset {d}{=} G^{-1}(1-{\mathrm {e}}^{-Z})\), where Z∼Gamma(c,1). If c=1, then Z∼exp(1) and \(X\overset {d}{=} G^{-1}(U)\), where U∼U(0,1).
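The stochastic representation above suggests a direct way to simulate from any gamma-G model. The following is a minimal sketch using only the Python standard library; the function names `sample_gamma_g` and `exp_quantile` are hypothetical, and the unit exponential baseline is chosen purely as an illustration because it gives X = Z, so the draws should behave like Gamma(c,1) variates.

```python
import math
import random


def sample_gamma_g(baseline_quantile, c, n, rng=None):
    """Draw n variates X = G^{-1}(1 - exp(-Z)) with Z ~ Gamma(c, 1).

    baseline_quantile is the quantile function G^{-1} of the baseline G.
    """
    rng = rng or random.Random()
    draws = []
    for _ in range(n):
        z = rng.gammavariate(c, 1.0)  # Z ~ Gamma(shape=c, scale=1)
        p = -math.expm1(-z)           # 1 - exp(-Z), computed stably
        draws.append(baseline_quantile(p))
    return draws


def exp_quantile(p):
    """Quantile function of the unit exponential baseline G(x) = 1 - exp(-x)."""
    return -math.log1p(-p)


if __name__ == "__main__":
    xs = sample_gamma_g(exp_quantile, c=2.5, n=20000, rng=random.Random(1))
    print(sum(xs) / len(xs))  # sample mean, expected to be close to c
```

Any other baseline can be plugged in by passing its quantile function; only the Gamma(c,1) draw is specific to the gamma-G construction.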
The Burr III (BIII) distribution has been used in various fields of science, and its features have been extensively analyzed. It appeared under the name of the Dagum (1977) distribution in studies of income, wage and wealth distributions. For an excellent survey on its genesis and empirical applications, see Kleiber and Kotz (2003) and Kleiber (2008). It is known as the inverse Burr distribution in the actuarial literature (see, e.g., Klugman et al. 1998) and as the kappa distribution in meteorology (Mielke, 1973). This distribution has also been employed in finance, environmental studies, survival analysis and reliability theory (see Lindsay et al. 1996; Gove et al. 2008).
for 0<u<1, where Q^{−1}(c,u) is the inverse function of Q(c,x)=1−γ(c,x)/Γ(c); for details, see http://functions.wolfram.com/GammaBetaErf/InverseGammaRegularized/. One can also use (8) to simulate GBIII variates: if U is a uniform random variable on the unit interval (0,1), then X=Q(U) is a GBIII random variable.
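As an illustration of quantile-based simulation, the sketch below evaluates a GBIII quantile numerically. It rests on two assumptions that are not spelled out in the text above: the BIII baseline cdf is taken to be G(x) = [1 + (x/s)^{-α}]^{-β}, and the regularized incomplete gamma function is inverted by plain bisection instead of a library routine. All function names are hypothetical.

```python
import math


def reg_lower_gamma(c, x):
    """Regularized lower incomplete gamma P(c, x) = gamma(c, x) / Gamma(c).

    Uses the series gamma(c, x) = x^c e^{-x} sum_n x^n / (c (c+1) ... (c+n)).
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / c
    total = term
    n = 1
    while term > 1e-17 * total and n < 10000:
        term *= x / (c + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(c * math.log(x) - x - math.lgamma(c)))


def inv_reg_lower_gamma(c, p, tol=1e-12):
    """Solve P(c, z) = p for z by bisection, for 0 < p < 1."""
    lo, hi = 0.0, 1.0
    while reg_lower_gamma(c, hi) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if reg_lower_gamma(c, mid) < p:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def biii_quantile(p, alpha, beta, s):
    """Quantile of the assumed BIII cdf G(x) = [1 + (x/s)^(-alpha)]^(-beta)."""
    return s * (p ** (-1.0 / beta) - 1.0) ** (-1.0 / alpha)


def gbiii_cdf(x, c, alpha, beta, s):
    """Gamma-G cdf gamma(c, -log[1 - G(x)]) / Gamma(c) with the BIII baseline."""
    g = (1.0 + (x / s) ** (-alpha)) ** (-beta)
    return reg_lower_gamma(c, -math.log1p(-g))


def gbiii_quantile(u, c, alpha, beta, s):
    """Solve gbiii_cdf(x) = u for x, with 0 < u < 1."""
    z = inv_reg_lower_gamma(c, u)  # equals Q^{-1}(c, 1 - u) in the text's notation
    p = -math.expm1(-z)            # baseline probability 1 - exp(-z)
    return biii_quantile(p, alpha, beta, s)
```

Feeding uniform draws on (0,1) into `gbiii_quantile` then gives simulated variates. For very small β the power `p ** (-1.0 / beta)` can overflow when p is close to zero, so a production implementation would work on the log scale.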
The rest of the paper is outlined as follows. In Section "Structural properties of the GBIII distribution", we obtain some structural properties of the GBIII distribution and estimate the model parameters by maximum likelihood. We propose a new regression model based on the logarithm of this distribution in Section "The log-gamma Burr III regression model". Two applications to real data and a simulation study are addressed in Section "Applications and simulation" to demonstrate empirically the flexibility of the new models. Finally, some conclusions are offered in Section "Conclusions".
Structural properties of the GBIII distribution
In the following subsections, we obtain a linear representation for the density function and a power series for the qf of the new distribution and estimate its parameters. These expressions can be computed numerically in platforms such as MAPLE, MATHEMATICA, Ox and R using a large number in the upper limit instead of infinity.
Linear representation
for k=1,2,… and p_{j,0}=1.
where g_{α,(c+k)β,s}(x) denotes the BIII pdf in (4) with parameters α, (c+k)β and s. So, several mathematical properties of the GBIII distribution can be obtained from those of the BIII distribution using (10) in platforms such as MAPLE and MATHEMATICA.
Equation (10) holds for any real parameter c>0 and then some mathematical properties of the new model are valid in the same parameter space, where those properties of the BIII model hold. Evidently, the integrals for the ordinary and incomplete moments and generating function of X can also be computed numerically in Ox and R.
Quantile expansion
where Δ(i)=0 if i<2 and Δ(i)=1 if i≥2. The first few coefficients are m_{2}=1/(c+1), m_{3}=(3c+5)/[2(c+1)^{2}(c+2)],… Let m_{0}=0 and define q_{i}=m_{i+1} Γ(c+1)^{(i+1)/2} (for i=0,1,2,…).
where (for k≥0) \(d_{k,0}=q_{0}^{k}\) and, for i=1,2,…, \(d_{k,i}=(i\,q_{0})^{-1}\,\sum _{j=1}^{i}[j(k+1)-i]\,q_{j}\,d_{k,i-j}\).
where (for m≥0) \(s_{m}=\sum _{n=1}^{\infty }\frac {\tau _{n}}{n!}\,\delta _{n,m}\), and (for n≥1) \(\delta _{n,0}=\nu _{0}^{n}\) and, for m≥1, \(\delta _{n,m}=(m\,\nu _{0})^{-1}\,\sum _{p=1}^{m}\,[p\,(n+1)-m]\,\nu _{p}\,\delta _{n,m-p}\).
Equations (13) and (14) are the main results of this section since we can obtain from them various GBIII mathematical quantities (moments, generating function, etc). In fact, some of them follow by using the right integral for special W(·) functions, which are sometimes simpler than if they are based on the left integral.
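The coefficients d_{k,i} and δ_{n,m} in this subsection both come from the classical recurrence for raising a power series to a power. A minimal sketch of that recurrence follows; the function name `power_series_power` is hypothetical, and the input series is truncated, with missing coefficients treated as zero.

```python
def power_series_power(q, k, n_terms):
    """Coefficients d_{k,0}, ..., d_{k,n_terms-1} of (sum_i q[i] * x**i) ** k.

    Requires q[0] != 0; coefficients of q beyond len(q) are treated as zero.
    """
    if q[0] == 0:
        raise ValueError("q[0] must be nonzero for this recurrence")
    d = [q[0] ** k]                       # d_{k,0} = q_0^k
    for i in range(1, n_terms):
        acc = 0.0
        for j in range(1, i + 1):
            qj = q[j] if j < len(q) else 0.0
            acc += (j * (k + 1) - i) * qj * d[i - j]
        d.append(acc / (i * q[0]))        # d_{k,i} = (i q_0)^{-1} * sum
    return d


if __name__ == "__main__":
    print(power_series_power([1.0, 1.0], 2, 4))    # coefficients of (1 + x)^2
    print(power_series_power([1.0, 1.0], 0.5, 3))  # leading terms of sqrt(1 + x)
```

The exponent k need not be an integer, which is what makes the recurrence convenient for the quantile expansion.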
Maximum likelihood estimation
Maximization of (15) can be performed by using well established routines like nlminb or optimize in the R statistical package. The routines are able to locate the maximum in all cases if we take different starting values for the parameters. However, it is desirable to have reasonable starting values, which can be chosen using the estimates from the fitted BIII distribution.
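To make the objective concrete, here is a hedged sketch of a negative log-likelihood that a general-purpose optimizer could minimize. It assumes the gamma-G density g(x){−log[1−G(x)]}^{c−1}/Γ(c) with a BIII baseline of the form G(x) = [1 + (x/s)^{-α}]^{-β}; this parametrization and the function names are assumptions made for illustration, not a transcription of (15).

```python
import math


def gbiii_logpdf(x, c, alpha, beta, s):
    """Log-density of the gamma-G family with an assumed BIII baseline."""
    t = (x / s) ** (-alpha)                        # (x/s)^(-alpha)
    log_G = -beta * math.log1p(t)                  # log G(x)
    log_g = (math.log(alpha) + math.log(beta) - math.log(s)
             - (alpha + 1.0) * math.log(x / s)
             - (beta + 1.0) * math.log1p(t))       # log g(x)
    w = -math.log1p(-math.exp(log_G))              # -log[1 - G(x)]
    return log_g - math.lgamma(c) + (c - 1.0) * math.log(w)


def gbiii_negloglik(theta, data):
    """Negative log-likelihood for theta = (c, alpha, beta, s)."""
    c, alpha, beta, s = theta
    if min(c, alpha, beta, s) <= 0.0:
        return math.inf                            # outside the parameter space
    return -sum(gbiii_logpdf(x, c, alpha, beta, s) for x in data)
```

Setting c=1 recovers the BIII log-likelihood, which is why estimates from a fitted BIII model are natural starting values. The sketch does not guard against G(x) rounding to exactly 1 for very large x.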
The MLEs in θ, say \(\widehat {\boldsymbol {\theta }}\), can also be determined numerically as simultaneous solutions of the equations ∂ ℓ(θ)/∂ θ=0. For interval estimation of the components in θ, we require the observed information matrix for θ, say \(\ddot {\mathbf {L}}(\boldsymbol {\theta })=\{L_{rv}\}\) (r,v=c,α,β,s), whose elements can be obtained from the authors upon request.
The log-gamma Burr III regression model
Different forms of regression models have been studied in survival analysis. Among them, the location-scale regression model (Lawless 2003) stands out because it is frequently used in clinical trials. We propose a new location-scale regression model based on the logarithm of the GBIII random variable, named the log-gamma Burr III (LGBIII) regression model, as a feasible alternative for modeling the four existing types of hazard rates.
where \(y \in \mathbb {R}\), \(\mu \in \mathbb {R}\) is the location parameter, σ>0 is the scale parameter, and c>0 and β>0 are shape parameters. We refer to Eq. (16) as the LGBIII distribution, say Y∼LGBIII(c,β,σ,μ). The density of the random variable Z=(Y−μ)/σ follows from (16).
where the random error z_{i} has density function (16) with μ=0 and σ=1, and τ=(τ_{1},…,τ_{p})^{T}, σ>0, c>0 and β>0 are unknown parameters. The parameter \(\mu _{i}=\mathbf {v}_{i}^{T} {\boldsymbol {\tau }}\) is the location of y_{i}. The location parameter vector μ=(μ_{1},…,μ_{n})^{T} is represented by a linear model μ=Vτ, where V=(v_{1},…,v_{n})^{T} is a known model matrix. The LGBIII regression model (18) opens new possibilities for fitting many different types of data. For example, it contains as a submodel the log-Burr III (LBIII) regression model when c=1.
where \(z_{i}=(y_{i}-\mathbf {v}_{i}^{T}{\boldsymbol {\tau }})/\sigma \) and r is the number of uncensored observations (failures). The MLE \(\widehat {{\boldsymbol {\theta }}}\) can be evaluated by maximizing (19). We use the NLMixed procedure in SAS to obtain \(\widehat {{\boldsymbol {\theta }}}\). Initial values for τ and σ are taken from the fitted LBIII regression model (with c=1). We can fit the LBIII model to the uncensored observations only and then take the parameter estimates as initial values to fit the LGBIII regression model.
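For readers who want to see how censoring enters such a fit, the sketch below evaluates the usual right-censored log-likelihood of a location-scale regression model: failures contribute the log-density of z_{i} minus log σ, and censored times contribute the log-survival of z_{i}. This is the general textbook form and is only assumed to match (19). The standard logistic error is a stand-in chosen so the example is self-contained; it is not the LGBIII density (16), and all names are hypothetical.

```python
import math


def censored_loglik(tau, sigma, y, V, delta, log_density, log_survival):
    """Log-likelihood of y_i = v_i^T tau + sigma * z_i under right censoring.

    delta[i] is 1 for an observed failure and 0 for a censored time;
    log_density and log_survival describe the standardized error z.
    """
    total = 0.0
    for yi, vi, di in zip(y, V, delta):
        mu_i = sum(v * t for v, t in zip(vi, tau))  # location v_i^T tau
        z = (yi - mu_i) / sigma
        if di == 1:
            total += log_density(z) - math.log(sigma)
        else:
            total += log_survival(z)
    return total


def logistic_log_density(z):
    """Stand-in error log-density (standard logistic), not the LGBIII one."""
    return -z - 2.0 * math.log1p(math.exp(-z))


def logistic_log_survival(z):
    """Log-survival function of the standard logistic stand-in."""
    return -math.log1p(math.exp(z))
```

Replacing the two stand-in functions with the log-density and log-survival of the standardized LGBIII error would give the objective for the model in this section.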
Under first-order asymptotic theory, the (p+3)×(p+3) asymptotic covariance matrix K(θ)^{−1} of \(\widehat {\boldsymbol {\theta }}\), where K(θ) is the expected information matrix for θ, can be approximated by the inverse of the observed information matrix \(\ddot {\mathbf {L}}({\boldsymbol {\theta }})\). The elements of this matrix can be computed numerically to construct approximate confidence intervals for the parameters in θ. We can use likelihood ratio (LR) statistics for comparing some submodels with the LGBIII model in the classical way.
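The LR comparison can be sketched in a few lines. The statistic is twice the difference between the maximized log-likelihoods of the full and restricted models, and it is referred to a chi-square distribution whose degrees of freedom equal the number of restrictions. The helper names below are hypothetical, and only one or two restrictions are handled because those cases have closed-form tail probabilities in the standard library.

```python
import math


def lr_statistic(loglik_full, loglik_restricted):
    """LR statistic w = 2 * (l_full - l_restricted)."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf(w, df):
    """Upper-tail probability of a chi-square variable with df = 1 or 2."""
    if w < 0.0:
        raise ValueError("the LR statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(w / 2.0))  # P(|N(0,1)| > sqrt(w))
    if df == 2:
        return math.exp(-w / 2.0)             # chi-square(2) is exponential
    raise ValueError("only df = 1 or 2 are handled in this sketch")
```

For instance, `chi2_sf(10.79832, 1)` is about 0.001, in line with the p-value reported for the GBIII versus BIII comparison on the cigarettes data.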
Applications and simulation
Application of GBIII to cigarettes data
In order to illustrate the estimation results in Section 3, we work with carbon monoxide measurements made in several brands of cigarettes in 1994. The data have been collected by the Federal Trade Commission (FTC), which is an independent agency of the United States Government, whose main mission is the promotion of consumer protection.
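MLEs of the model parameters for the cigarettes data, the corresponding SEs (given in parentheses) and the AIC measure
| Model | α | β | s | c | b | γ | AIC |
|---|---|---|---|---|---|---|---|
| GBIII | 17.0973 | 0.0282 | 1.4992 | 2.6338 | - | - | 336.827 |
| | (0.1153) | (0.0006) | (0.0028) | (0.0287) | | | |
| BIII | 18.4311 | 0.1069 | 1.6511 | 1 | - | - | 938.113 |
| | (0.1482) | (0.0010) | (0.0015) | (-) | | | |
| LL | 3.7222 | 1 | 1.1094 | 1 | - | - | 4554.549 |
| | (0.0092) | (-) | (0.0014) | (-) | | | |
| ELL | 1 | 3.6292 | 0.2635 | 1 | - | - | 1022.258 |
| | (-) | (0.0269) | (0.0023) | (-) | | | |
| BBIII | 28.7468 | 0.5810 | 1.5547 | 0.1143 | 0.4787 | - | 348.247 |
| | (0.5524) | (0.0190) | (0.0042) | (0.0041) | (0.0131) | | |
| BW | 5.0892 | - | - | 0.4410 | 3.8626 | 2.0235 | 368.887 |
| | (0.0321) | | | (0.0037) | (0.1545) | (0.0166) | |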
LR tests for the cigarettes data
| Model | Hypotheses | LR statistic | p-value |
|---|---|---|---|
| GBIII vs BIII | H_{0}: c=1 vs H_{1}: H_{0} is false | 10.79832 | 0.0010159 |
| GBIII vs LL | H_{0}: c=β=1 vs H_{1}: H_{0} is false | 189.9402 | 0.0000000 |
| GBIII vs ELL | H_{0}: c=α=1 vs H_{1}: H_{0} is false | 587.5724 | 0.0000000 |
The measures of skewness and kurtosis for the GBIII distribution are, respectively, 0.3001116 and 0.05107192.
Goodness-of-fit statistics
| Distribution | W^{∗} | A^{∗} |
|---|---|---|
| GBIII | 0.23988 | 1.46001 |
| BIII | 0.30316 | 1.83086 |
| LL | 2.47901 | 13.86572 |
| ELL | 5.53160 | 30.68406 |
| BBIII | 0.29278 | 1.75430 |
| BW | 0.66758 | 3.89611 |
Some computing issues and a simulation study
As mentioned before, the parameters can be estimated by minimizing the negative log-likelihood, and for that we use the nlminb function of the R language. Optimization can also be tackled through simulated annealing (Kirkpatrick et al. 1983) using the optim function of R. Reasonable starting values are chosen such that the estimated pdf of a submodel fits the histogram of the data well. We now discuss some estimation issues related to the GBIII distribution. Mäkeläinen et al. (1981), in their Theorem 2.1, have established conditions for existence and uniqueness of the MLEs. However, proving that the likelihood function satisfies those conditions is a very hard task that could be addressed in a separate paper.
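To illustrate the idea behind simulated annealing, the sketch below minimizes a generic objective with Gaussian proposals, Metropolis acceptance and a logarithmic cooling schedule. It is a toy version written for this illustration and is not claimed to reproduce what R's optim does internally; the function name, step size and schedule are assumptions. In practice the objective f would be the negative log-likelihood.

```python
import math
import random


def simulated_annealing(f, x0, n_iter=20000, t0=1.0, step=0.2, seed=0):
    """Minimize f by simulated annealing with Gaussian proposals.

    Returns the best point visited and its objective value.
    """
    rng = random.Random(seed)
    x = list(x0)
    fx = f(x)
    best_x, best_f = list(x), fx
    for k in range(1, n_iter + 1):
        temp = t0 / math.log(k + math.e)  # slowly decreasing temperature
        cand = [xi + rng.gauss(0.0, step) for xi in x]
        fc = f(cand)
        # Always accept downhill moves; accept uphill moves with
        # the Metropolis probability exp(-(fc - fx) / temp).
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / temp):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = list(x), fx
    return best_x, best_f


if __name__ == "__main__":
    def quadratic(p):
        return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

    print(simulated_annealing(quadratic, [0.0, 0.0]))
```

Because uphill moves are sometimes accepted, the search can leave a poor local optimum, which is the reason for trying it when gradient-based routines are sensitive to starting values.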
We conduct a Monte Carlo simulation study to assess the finite sample behaviour of the MLEs of the GBIII parameters. Random samples from the GBIII model are obtained using the qf given by (8). We take as the true parameter values the average of two parameter vector estimates obtained for the cigarettes data when two different starting points were chosen. Even though those starting points differ substantially, the estimates do not.
Monte Carlo results: means and SRMSEs (in parentheses) of \(\hat {\alpha }\), \(\hat {\beta }\), \(\hat {s}\) and \(\hat {c}\)
| Parameter | α | β | s | c |
|---|---|---|---|---|
| True values | 17.11192 | 0.02785 | 1.49817 | 2.64952 |
| n=50 | 20.60594 | 0.06942 | 1.51430 | 2.56537 |
| | (10.48504) | (0.09636) | (0.12745) | (1.55388) |
| n=100 | 19.09881 | 0.04589 | 1.50674 | 2.61276 |
| | (8.89715) | (0.05139) | (0.09234) | (1.04975) |
| n=500 | 17.38320 | 0.03260 | 1.50329 | 2.59845 |
| | (1.77489) | (0.01378) | (0.04358) | (0.48186) |
| n=1,000 | 17.22701 | 0.03018 | 1.49953 | 2.63054 |
| | (1.27852) | (0.00780) | (0.03029) | (0.31852) |
| n=5,000 | 17.16906 | 0.02872 | 1.50008 | 2.63229 |
| | (0.49828) | (0.00329) | (0.01300) | (0.14757) |
An application of the LGBIII regression model

Group 1: Control 1 (deionized water); Control 2 (acetone 5%); aqueous extract of seeds (AES) (39 ppm); AES (225 ppm); AES (888 ppm); methanol extract of leaves (MEL) (225 ppm); MEL (888 ppm); and dichloromethane extract of branches (DMB) (39 ppm).

Group 2: MEL (39 ppm); DMB (225 ppm) and DMB (888 ppm).
MLEs of the parameters from the LGBIII and LBIII regression models fitted to the entomology data, the corresponding SEs (given in parentheses), p-values in [.] and the AIC, CAIC and BIC statistics
| Model | c | β | σ | τ_{0} | τ_{1} | τ_{2} | AIC | CAIC | BIC |
|---|---|---|---|---|---|---|---|---|---|
| LGBIII | 0.1722 | 1.7304 | 0.1685 | 3.4935 | −0.0567 | −0.2830 | 273.7 | 274.2 | 292.6 |
| | (0.0161) | (0.2708) | (0.0244) | (0.0624) | (0.0534) | (0.0574) | | | |
| | | | | [<0.001] | [0.2890] | [<0.001] | | | |
| LBIII | 1 | 0.4016 | 0.2199 | 3.4048 | 0.0402 | −0.3410 | 338.8 | 339.2 | 354.6 |
| | - | (0.0807) | (0.0286) | (0.0835) | (0.0843) | (0.0888) | | | |
| | | | | [<0.001] | [0.6339] | [0.0002] | | | |
LR statistic w for the entomology data
| Model | Hypotheses | w | p-value |
|---|---|---|---|
| LGBIII vs LBIII | H_{0}: c=1 vs H_{1}: H_{0} is false | 67.1 | <0.00001 |
MLEs of the parameters from the LGBIII and LBIII regression models fitted to the entomology data, considering only the significant variables, the corresponding SEs (given in parentheses), p-values in [.] and the AIC, CAIC and BIC statistics
| Model | c | β | σ | τ_{0} | τ_{2} | AIC | CAIC | BIC |
|---|---|---|---|---|---|---|---|---|
| LGBIII | 6.3762 | 0.005080 | 0.1157 | 2.9722 | −0.1864 | 256.3 | 256.6 | 272.0 |
| | (1.6458) | (0.0062) | (0.0136) | (0.0843) | (0.0664) | | | |
| | | | | [<0.001] | [0.0056] | | | |
| LBIII | 1 | 0.4084 | 0.2218 | 3.4173 | −0.3402 | 337.1 | 337.3 | 349.6 |
| | - | (0.0791) | (0.0279) | (0.0791) | (0.0886) | | | |
| | | | | [<0.001] | [0.0002] | | | |
Conclusions
New classes of distributions are always valuable to statisticians. There has been an increased interest in developing generalized classes of distributions by adding a single shape parameter to a baseline distribution, and some of these classes have attracted several applied researchers. Following this idea, Zografos and Balakrishnan (2009) introduced a gamma-generated family of distributions by adding an extra positive shape parameter to a baseline model. In this paper, we study some mathematical properties of the new four-parameter gamma Burr III distribution based on the gamma-generated family. We show empirically that the proposed distribution can provide a better fit than important generated models such as the beta Burr III (Gomes et al. 2013) and beta Weibull (Lee et al. 2007) distributions. Finally, we propose a new log-gamma Burr III regression model and illustrate its importance by means of one application to a real data set.
Declarations
Authors’ contributions
GMC proposed the gamma Burr III model, wrote some parts of Sections 1 to 2, and also drafted the manuscript. AEG wrote some parts of Section 2. CQdS wrote some parts of Section 2, subsections 4.1 and 4.2 and prepared Figs. 1, 2, 3 and 4. EMMO proposed the log-gamma Burr III model described in Section 3, performed the application in subsection 4.3, and also prepared Fig. 5. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 Alexander, C, Cordeiro, GM, Ortega, EMM, Sarabia, JM: Generalized beta-generated distributions. Computational Statistics and Data Analysis. 56, 1880–1897 (2012).
 Alzaatreh, A, Lee, C, Famoye, F: A new method for generating families of continuous distributions. Metron. 71, 63–79 (2013).
 Chen, G, Balakrishnan, N: A general purpose approximate goodness-of-fit test. J. Qual. Technol. 27, 154–161 (1995).
 Cordeiro, GM, de Castro, M: A new family of generalized distributions. J. Stat. Comput. Simul. 81, 883–898 (2011).
 Cordeiro, GM, Ortega, EMM, Popovic, BV, Pescim, RR: The Lomax generator of distributions: Properties, minification process and regression model. Appl. Math. Comput. 247, 465–486 (2014).
 Cordeiro, GM, Ortega, EMM, Ramires, TG: A new generalized Weibull family of distributions: mathematical properties and applications. J. Stat. Distrib. Appl. 2, 131–145 (2015).
 Cordeiro, GM, Alizadeh, M, Ozel, G, Hossein, B, Ortega, EMM, Altun, E: The generalized odd log-logistic family of distributions: properties, regression models and applications. J. Stat. Comput. Simul. 87, 908–932 (2017).
 Dagum, C: A new model of personal income distribution: specification and estimation. Econ. Appl. 30, 413–437 (1977).
 Eugene, N, Lee, C, Famoye, F: Beta-Normal distribution and its applications. Commun. Stat. Theory Methods. 31, 497–512 (2002).
 Gomes, AE, da-Silva, CQ, Cordeiro, GM, Ortega, EMM: The beta Burr III model for lifetime data. Braz. J. Probab. Stat. 27, 502–543 (2013).
 Gove, JH, Ducey, MJ, Leak, WB, Zhang, L: Rotated sigmoid structures in managed uneven-aged northern hardwood stands: a look at the Burr type III distribution. Forestry (2008). doi:10.1093/forestry/cpm025.
 Gradshteyn, IS, Ryzhik, IM: Table of Integrals, Series, and Products, seventh edition. Academic Press, San Diego (2000).
 Kirkpatrick, S, Gelatt Jr, CD, Vecchi, MP: Optimization by Simulated Annealing. Science. 220, 671–680 (1983).
 Kleiber, C, Kotz, S: Statistical Size Distributions in Economics and Actuarial Sciences. John Wiley, New York (2003).
 Kleiber, C: A Guide to the Dagum Distributions. Springer, New York (2008).
 Klugman, SA, Panjer, HH, Willmot, GE: Loss Models. Wiley, New York (1998).
 Lawless, JF: Statistical Models and Methods for Lifetime Data. John Wiley, New York (2003).
 Lee, C, Famoye, F, Olumolade, O: Beta-Weibull Distribution: Some Properties and Applications to Censored Data. J. Mod. Appl. Stat. Methods. 6, 173–186 (2007).
 Lindsay, SR, Wood, GR, Woollons, RC: Modelling the diameter distribution of forest stands using the Burr distribution. J. Appl. Stat. 23, 609–619 (1996).
 Mäkeläinen, T, Schmidt, K, Styan, GPH: On the Existence and Uniqueness of the Maximum Likelihood Estimate of a Vector-Valued Parameter in Fixed-Size. Ann. Stat. 9, 758–767 (1981).
 Mielke, PW: Another family of distributions for describing and analyzing precipitation data. J. Appl. Meteorology. 12, 275–280 (1973).
 Nofal, ZM, Afify, AZ, Yousof, HM, Cordeiro, M: The generalized transmuted-G family of distributions. Commun. Stat. Theory Methods. 46, 4119–4136 (2016).
 Silva, MA, Bezerra-Silva, GCD, Vendramim, JD, Mastrangelo, T: Sublethal effect of neem extract on Mediterranean fruit fly adults. Rev. Bras. Frutic. 35, 93–101 (2013).
 Tahir, MH, Cordeiro, GM, Alizadeh, M, Mansoor, M, Zubair, M, Hamedani, GG: The odd generalized exponential family of distributions with applications. J. Stat. Distrib. Appl. 2, 1–28 (2015).
 Zografos, K, Balakrishnan, N: On families of beta and generalized gamma-generated distributions and associated inference. Stat. Methodol. 6, 344–362 (2009).