A new extended normal regression model: simulations and applications
Journal of Statistical Distributions and Applications volume 6, Article number: 7 (2019)
Abstract
Various applications in natural science require models more accurate than well-known distributions. In this context, several generators of distributions have been recently proposed. We introduce a new four-parameter extended normal (EN) distribution, which can provide better fits than the skew-normal and beta normal distributions, as shown empirically in two applications to real data. We present Monte Carlo simulations to investigate the effectiveness of the EN distribution using the Kullback-Leibler divergence criterion. The classical regression model is not recommended for most practical applications because it oversimplifies real-world problems. We propose an EN regression model and show its usefulness in practice by comparing it with other regression models. We adopt the maximum likelihood method for estimating the model parameters of both the proposed distribution and the regression model.
Introduction
In recent years, several methods for generating new models from classic distributions have been proposed. A detailed study about “the evolution of methods for generalizing classic distributions” was made by Lee et al. (2013). A generalization of the standard normal distribution is sought because it can provide more accurate statistical models and inferential procedures. For instance, the beta normal distribution was pioneered by Eugene et al. (2002), who discussed some of its structural properties.
Additionally, the beta generalized normal (BGN) distribution was proposed by Cintra et al. (2013) to extend the beta normal distribution. They applied the BGN model to synthetic aperture radar image processing. This paper presents a new extended normal (EN) distribution based on the family introduced by Cordeiro et al. (2013).
For any continuous cumulative distribution function (cdf) G(x), Cordeiro et al. (2013) defined the cdf of the exponentiated generalized (EG) family by
\[ F(x) = \left\{1 - [1 - G(x)]^{a}\right\}^{b}, \tag{1} \]
where a>0 and b>0 are two additional shape parameters whose role is to generate distributions with heavier/lighter tails and provide wider ranges for skewness and kurtosis. These parameters are sought as a means of furnishing a more flexible distribution.
Because of its tractable cdf (1), the EG family can be used quite effectively even if the data are censored. This family can yield univariate models for any type of support. Further, it allows for greater flexibility of its tails and can be widely applied in many areas such as engineering and biology.
Its probability density function (pdf) has a very simple form
\[ f(x) = a\,b\,g(x)\,[1 - G(x)]^{a-1}\left\{1 - [1 - G(x)]^{a}\right\}^{b-1}. \tag{2} \]
An important advantage of the density (2) is its ability to fit skewed data that often cannot be fitted well by existing distributions. Based on the cdf G(x) and pdf g(x) of any baseline G distribution, we can associate the EGG pdf (2) with two extra parameters. The EG family can be used for discriminating between the G and EGG distributions.
The baseline distribution G(x) is a special case of (2) when a=b=1. For a=1, it gives the exponentiated-G (“ExpG”) class. If b=1, we obtain the Lehmann type II-G (LTIIG) class. Eq. (2) generalizes both Lehmann types I and II alternative classes (Lehmann 1953). In fact, this equation can be defined as the exponentiated generator applied to the LTIIG class.
Note that even if g(x) is a symmetric density, the density f(x) will not, in general, be symmetric. The cdf (1) has tractable properties especially for simulations, since its quantile function (qf) has a simple form
\[ Q(u) = Q_{G}\!\left(1 - \left[1 - u^{1/b}\right]^{1/a}\right), \quad u \in (0,1), \]
where Q_{G}(u) is the baseline qf.
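The closed forms of the EG family translate directly into code. The following is a minimal Python sketch, assuming the EG cdf F(x) = {1 − [1 − G(x)]^a}^b, its derivative as the pdf, and the qf obtained by inverting F. The function names and the standard normal baseline are illustrative choices and are not taken from the original work.

```python
from statistics import NormalDist


def eg_cdf(G, a, b):
    """EG cdf evaluated at a baseline cdf value G = G(x)."""
    return (1.0 - (1.0 - G) ** a) ** b


def eg_pdf(G, g, a, b):
    """EG pdf from the baseline cdf value G and the baseline pdf value g."""
    return a * b * g * (1.0 - G) ** (a - 1.0) * (1.0 - (1.0 - G) ** a) ** (b - 1.0)


def eg_qf(u, a, b, baseline_qf):
    """EG quantile: transform u, then apply the baseline quantile function."""
    return baseline_qf(1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a))


# Illustration with a standard normal baseline (an assumption of this sketch).
base = NormalDist()
a, b = 2.0, 3.0
x = eg_qf(0.3, a, b, base.inv_cdf)
print(x, eg_cdf(base.cdf(x), a, b))  # the second value should be close to 0.3
```

Passing baseline values G and g rather than x keeps the three functions independent of the particular baseline, which mirrors the generator idea of the family.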
This paper is outlined as follows. In Section 2, we define the EN distribution and provide plots of its density function. A linear representation for the EN density function is derived in Section 3. We obtain an explicit expression for its moments in Section 4. In Section 5, we provide the maximum likelihood estimates (MLEs) of the parameters. In Section 6, we define the EN regression model and discuss the estimation of the model parameters. In Section 7, we perform some simulations and present three applications to real data sets. Finally, some concluding remarks are addressed in Section 8.
The EN distribution
Due to the analytical tractability of its pdf and its importance in asymptotic theory (such as the central limit theorem and delta method), the normal distribution is the most popular model distribution in applications to real data with support in \( \mathbb {R}\).
When the number of observations is large, it can serve as an approximate distribution for several other models. The normal N(μ,σ) pdf (for \(x \in \mathbb {R}\)) is
\[ g(x) = \frac{1}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right), \tag{3} \]
where \(\mu \in \mathbb {R}\) is a mean parameter, σ>0 is a scale parameter and \(\phi (x)\,=\,(2\pi)^{-1/2}\,\mathrm {e}^{-x^{2}/2}\) is the standard normal pdf.
Its cdf has the form
\[ G(x) = \Phi\!\left(\frac{x-\mu}{\sigma}\right), \tag{4} \]
where \(\Phi (x)\,=\,\int _{-\infty }^{x}\,\phi (t)\,\mathrm {d}t\) is the standard normal cdf.
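By inserting (3) and (4) in Eqs. (1) and (2), the cdf and pdf of the EN distribution (for \(x \in \mathbb {R}\)) can be expressed, respectively, as
\[ F(x) = \left\{1 - \left[1 - \Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]^{a}\right\}^{b} \tag{5} \]
and
\[ f(x) = \frac{a\,b}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\left[1 - \Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]^{a-1}\left\{1 - \left[1 - \Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]^{a}\right\}^{b-1}. \tag{6} \]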
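Hereafter, a random variable X having density (6) is denoted by X∼EN(a,b,μ,σ). This density does not involve any complicated function, and the normal distribution arises as the basic exemplar when a=b=1, which is a positive point of the current generalization. Moreover, the qf of X is
\[ Q_{EN}(u) = \mu + \sigma\,\Phi^{-1}\!\left(1 - \left[1 - u^{1/b}\right]^{1/a}\right), \quad u \in (0,1), \]
where \(\Phi^{-1}(\cdot)\) is the standard normal qf.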
In the next sections, other moment results are proved. Moreover, from the previous qf of the EN distribution, the associated median, say M, is
\[ M = \mu + \sigma\,z_{a,b}, \]
where z_{a,b}=Φ^{−1}(1 − [1 − 2^{−1/b}]^{1/a}) is the standard normal quantile at 1 − [1 − 2^{−1/b}]^{1/a}. Thus, the next function suggests a symmetric discussion:
We motivate the paper by comparing the performances of the EN, normal, skew-normal (SN) and beta normal (BN) models fitted to two real data sets. Figure 1 displays possible shapes of the density function (6) for some parameter values. We can note the flexibility of the EN distribution with respect to the normal distribution.
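Because the qf is available in closed form, EN variates can be drawn by inverse-transform sampling and the median can be checked against a simulated sample. The sketch below assumes the qf Q(u) = μ + σ Φ^{-1}(1 − [1 − u^{1/b}]^{1/a}); the helper names, the seed and the parameter values are hypothetical and chosen only for illustration.

```python
import random
from statistics import NormalDist, median

STD = NormalDist()  # standard normal, used for Phi and its inverse


def en_cdf(x, a, b, mu=0.0, sigma=1.0):
    """EN cdf: {1 - [1 - Phi(z)]^a}^b with z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return (1.0 - (1.0 - STD.cdf(z)) ** a) ** b


def en_qf(u, a, b, mu=0.0, sigma=1.0):
    """EN quantile function for u strictly between 0 and 1."""
    return mu + sigma * STD.inv_cdf(1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a))


def en_median(a, b, mu=0.0, sigma=1.0):
    """Median M = mu + sigma * z_{a,b}, i.e. the qf at u = 1/2."""
    return en_qf(0.5, a, b, mu, sigma)


def en_sample(n, a, b, mu=0.0, sigma=1.0, seed=0):
    """Inverse-transform sampling; u is clamped away from 0 and 1."""
    rng = random.Random(seed)
    eps = 1e-12
    return [en_qf(min(max(rng.random(), eps), 1.0 - eps), a, b, mu, sigma)
            for _ in range(n)]


draws = en_sample(20000, a=2.0, b=3.0, seed=1)
print(median(draws), en_median(2.0, 3.0))  # the two values should be close
```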
Linear representation
A useful linear representation for (2) can be derived using the concept of exponentiated distributions. For an arbitrary baseline cdf G(x), a random variable T is said to have the exponentiated-G (ExpG) distribution with power parameter a>0, say T∼ExpG(a), if its pdf and cdf are
\[ h_{a}(x) = a\,g(x)\,G(x)^{a-1} \quad \text{and} \quad H_{a}(x) = G(x)^{a}, \]
respectively. Several properties of exponentiated distributions have been studied recently, such as those for the exponentiated Weibull (Mudholkar and Srivastava 1993) and exponentiated generalized gamma (Cordeiro et al. 2013) distributions.
Theorem 1
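Let X∼EN(a,b,μ,σ). The pdf of X can be written as
\[ f(x) = \sum_{j=0}^{\infty} w_{j+1}\,h_{j+1}(x), \tag{7} \]
where \(w_{j+1}= \sum _{m=1}^{\infty } (-1)^{j+m+1}\,{b \choose m}\, {m\,a \choose j+1}\) and h_{j+1}(x) is the exponentiated-normal (ExpN) density with power parameter j+1, say ExpN(μ,σ,j+1), namely
\[ h_{j+1}(x) = \frac{j+1}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\Phi\!\left(\frac{x-\mu}{\sigma}\right)^{j}. \]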
The proof of this theorem is given in Appendix A.
It is possible to verify using symbolic software (such as Maple) that \(\sum _{j=0}^{\infty } \,w_{j+1}=1\) as expected.
Equation (7) is the main result of this section. It reveals that the EN density is a linear combination of ExpN densities. So, several mathematical properties of the proposed distribution can then be obtained from those of the ExpN distribution using previous results given by Rêgo et al. (2012).
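The linear representation can be checked numerically in the special case where a and b are positive integers, because the series for the weights then terminates. The sketch below assumes weights of the form w_{j+1} = Σ_m (−1)^{j+m+1} C(b,m) C(ma, j+1) and ExpN densities h_{j+1}(x) = ((j+1)/σ) φ(z) Φ(z)^j; the function names are hypothetical, and the integer restriction is a simplification of this example only.

```python
from math import comb
from statistics import NormalDist

STD = NormalDist()


def weights(a, b):
    """Return {k: w_k} for k = j + 1 = 1, ..., a*b (positive integers a, b only)."""
    return {
        k: sum((-1) ** (k + m) * comb(b, m) * comb(m * a, k) for m in range(1, b + 1))
        for k in range(1, a * b + 1)
    }


def expn_pdf(x, power, mu=0.0, sigma=1.0):
    """Exponentiated-normal density with the given power parameter."""
    z = (x - mu) / sigma
    return power / sigma * STD.pdf(z) * STD.cdf(z) ** (power - 1)


def en_pdf(x, a, b, mu=0.0, sigma=1.0):
    """EN density evaluated directly from its closed form."""
    z = (x - mu) / sigma
    s = 1.0 - STD.cdf(z)
    return a * b / sigma * STD.pdf(z) * s ** (a - 1) * (1.0 - s ** a) ** (b - 1)


def en_pdf_mixture(x, a, b, mu=0.0, sigma=1.0):
    """EN density evaluated as a linear combination of ExpN densities."""
    return sum(w * expn_pdf(x, k, mu, sigma) for k, w in weights(a, b).items())


w = weights(2, 3)
print(w, sum(w.values()))  # the weights should add up to 1
print(en_pdf(0.2, 2, 3), en_pdf_mixture(0.2, 2, 3))
```

For a=2 and b=3 the cdf expands to 8G³ − 12G⁴ + 6G⁵ − G⁶, so the weights can also be verified by hand.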
Moments
First, we determine the probability weighted moments (PWMs) of the standard normal distribution since they are required for the ordinary moments of the EN distribution. The standard normal PWMs are defined by
\[ \tau_{n,j} = \int_{-\infty}^{\infty} x^{n}\,\Phi(x)^{j}\,\phi(x)\,\mathrm{d}x, \tag{8} \]
for n≥0 and j≥0 integers.
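The result holds
\[ \Phi(x) = \frac{1}{2}\left\{1 + \mathrm{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right\}. \]
Applying the binomial expansion and interchanging terms gives
\[ \tau_{n,j} = \frac{1}{2^{j}\sqrt{2\pi}} \sum_{r=0}^{j} {j \choose r} \int_{-\infty}^{\infty} x^{n}\,\mathrm{erf}\!\left(\frac{x}{\sqrt{2}}\right)^{j-r} \mathrm{e}^{-x^{2}/2}\,\mathrm{d}x. \]
Based on the power series for the error function
\[ \mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \sum_{m=0}^{\infty} \frac{(-1)^{m}\,x^{2m+1}}{(2m+1)\,m!}, \]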
we can obtain τ_{n,j} from Eqs. (9)–(11) given by Nadarajah (2008).
For n+j−r even, we have
where \(F_{A}^{(j-r)}(\cdot)\) is the Lauricella function of type A (see, for example, Exton 1978, listed in Note 1). If n+j−r is odd, the corresponding terms in τ_{n,j} vanish, since the integrand is then an odd function.
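As a numerical cross-check on any closed-form expression, the standard normal PWMs can be approximated by quadrature. The sketch below assumes the definition τ_{n,j} = ∫ x^n Φ(x)^j φ(x) dx and uses a composite Simpson rule on a truncated range; the truncation limits, the grid size and the function names are choices of this example.

```python
from math import pi, sqrt
from statistics import NormalDist

STD = NormalDist()


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def pwm(n, j, lo=-10.0, hi=10.0):
    """Approximate tau_{n,j}, the (n, j)th standard normal PWM."""
    return simpson(lambda x: x ** n * STD.cdf(x) ** j * STD.pdf(x), lo, hi)


# Known values: tau_{0,j} = 1/(j+1) and tau_{1,1} = 1/(2*sqrt(pi)).
print(pwm(0, 3), 1.0 / 4.0)
print(pwm(1, 1), 1.0 / (2.0 * sqrt(pi)))
```

The two reference values follow from Φ(X) being uniform on (0,1) and from the identity E[XΦ(X)] = E[φ(X)] for a standard normal X.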
Corollary 1
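Suppose that \(\mu _{n}^{\prime }= E(X^{n})\) exists. Then,
\[ \mu_{n}^{\prime} = \sum_{j=0}^{\infty} (j+1)\,w_{j+1} \sum_{r=0}^{n} {n \choose r}\,\mu^{n-r}\,\sigma^{r}\,\tau_{r,j}, \tag{9} \]
where τ_{r,j} is given by (8).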
The skewness and kurtosis of X can be computed from Q_{EN}(p) using the well-known quantile-based measures of Bowley and Moors. Figure 2 displays plots of the skewness and kurtosis measures of X for selected values of a and b. We note that the skewness and kurtosis values for the normal distribution are obtained when the values of (a,b) tend to (1,1).
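The quantile-based measures are straightforward to evaluate once the qf is coded. The sketch below uses the usual definitions, Bowley's skewness B = [Q(3/4) + Q(1/4) − 2Q(1/2)] / [Q(3/4) − Q(1/4)] and Moors' kurtosis M = [Q(3/8) − Q(1/8) + Q(7/8) − Q(5/8)] / [Q(6/8) − Q(2/8)], and assumes the EN qf Q(p) = μ + σ Φ^{-1}(1 − [1 − p^{1/b}]^{1/a}); the function names are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()


def en_qf(p, a, b, mu=0.0, sigma=1.0):
    """EN quantile function for p strictly between 0 and 1."""
    return mu + sigma * STD.inv_cdf(1.0 - (1.0 - p ** (1.0 / b)) ** (1.0 / a))


def bowley(q):
    """Bowley skewness from a quantile function q(p)."""
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))


def moors(q):
    """Moors kurtosis from a quantile function q(p)."""
    return (q(0.375) - q(0.125) + q(0.875) - q(0.625)) / (q(0.75) - q(0.25))


def en_shape(a, b):
    """Both measures are location-scale free, so mu = 0 and sigma = 1 suffice."""
    q = lambda p: en_qf(p, a, b)
    return bowley(q), moors(q)


print(en_shape(1.0, 1.0))  # normal case: skewness 0, kurtosis about 1.233
print(en_shape(2.0, 1.0))
```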
Estimation
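Consider a random variable X∼EN(a, b, μ, σ) and let θ=(a, b, μ, σ)^{⊤} be the model parameters, where (·)^{⊤} is the transposition operator. Thus, the associated log-likelihood function for one observation x is
\[ \ell(\boldsymbol{\theta};x) = \log(a\,b) - \log \sigma + \log \phi(z) + (a-1)\log[1-\Phi(z)] + (b-1)\log\left\{1 - [1-\Phi(z)]^{a}\right\}, \tag{10} \]
where z=(x−μ)/σ.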
Given a data set x_{1},…,x_{n}, the MLE of θ is determined by maximizing \( \ell _{n}(\boldsymbol {\theta })\,=\,\sum _{i=1}^{n}\,\ell (\boldsymbol {\theta };x_{i}). \)
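Based on Eq. (10), the score vector is
\[ U(\boldsymbol{\theta}) = (U_{a}, U_{b}, U_{\mu}, U_{\sigma})^{\top}, \]
whose components are, with z=(x−μ)/σ,
\[ U_{a} = \frac{1}{a} + \log[1-\Phi(z)] - (b-1)\,\frac{[1-\Phi(z)]^{a}\log[1-\Phi(z)]}{1-[1-\Phi(z)]^{a}}, \]
\[ U_{b} = \frac{1}{b} + \log\left\{1-[1-\Phi(z)]^{a}\right\}, \]
\[ U_{\mu} = \frac{z}{\sigma} + \frac{(a-1)\,\phi(z)}{\sigma\,[1-\Phi(z)]} - \frac{a\,(b-1)\,\phi(z)\,[1-\Phi(z)]^{a-1}}{\sigma\left\{1-[1-\Phi(z)]^{a}\right\}} \]
and
\[ U_{\sigma} = -\frac{1}{\sigma} + \frac{z^{2}}{\sigma} + \frac{(a-1)\,z\,\phi(z)}{\sigma\,[1-\Phi(z)]} - \frac{a\,(b-1)\,z\,\phi(z)\,[1-\Phi(z)]^{a-1}}{\sigma\left\{1-[1-\Phi(z)]^{a}\right\}}. \]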
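An advantage of the EN distribution is that the MLE \(\widehat {b}\) has a partially closed-form expression. Suppose that the observed information matrix is nonnegative definite. The MLE of b can be expressed in terms of the MLEs \(\widehat {a},\widehat {\mu }\) and \(\widehat {\sigma }\) as
\[ \widehat{b} = -\,n \left[\sum_{i=1}^{n} \log\left\{1 - \left[1 - \Phi\!\left(\frac{x_{i}-\widehat{\mu}}{\widehat{\sigma}}\right)\right]^{\widehat{a}}\right\}\right]^{-1}. \tag{11} \]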
This fact is important for at least two reasons. First, the estimates become the solutions of a system with three equations and three variables (say “(3,3) system”) instead of a (4,4) system. Second, Eq. (11) clarifies the relationship of \(\widehat {b}\) with \(\widehat {a}\), \(\widehat {\mu }\) and \(\widehat {\sigma }\). More details are described in the simulation section.
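The partially closed form of the estimator of b is easy to illustrate. The sketch below assumes that setting the b-component of the total score to zero gives b̂ = −n / Σ log{1 − [1 − Φ(z_i)]^a}, and it evaluates this expression at fixed (a, μ, σ) for simulated data. The sampler, the seed and the function names are hypothetical; a full fit would still need a numerical search over the remaining three parameters.

```python
import random
from math import log
from statistics import NormalDist

STD = NormalDist()


def en_sample(n, a, b, mu, sigma, seed=0):
    """Inverse-transform sampling from EN(a, b, mu, sigma)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        out.append(mu + sigma * STD.inv_cdf(1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a)))
    return out


def en_loglik(xs, a, b, mu, sigma):
    """Total EN log-likelihood for the sample xs."""
    total = 0.0
    for x in xs:
        z = (x - mu) / sigma
        s = 1.0 - STD.cdf(z)
        total += (log(a * b / sigma) + log(STD.pdf(z))
                  + (a - 1.0) * log(s) + (b - 1.0) * log(1.0 - s ** a))
    return total


def b_hat(xs, a, mu, sigma):
    """Closed-form maximizer of the log-likelihood in b for fixed (a, mu, sigma)."""
    s = sum(log(1.0 - (1.0 - STD.cdf((x - mu) / sigma)) ** a) for x in xs)
    return -len(xs) / s


xs = en_sample(500, a=2.0, b=3.0, mu=0.0, sigma=1.0, seed=7)
bh = b_hat(xs, 2.0, 0.0, 1.0)
print(bh, en_loglik(xs, 2.0, bh, 0.0, 1.0))
```

Profiling b out in this way leaves a three-parameter search, which is the dimension reduction that the closed form provides.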
Additionally, in order to make inference on the model parameters, the total observed information matrix is J(θ)={−U_{rs}}, where U_{rs}=∂^{2} ℓ(θ)/∂θ_{r} ∂θ_{s}, for r,s∈{a,b,μ,σ}. By differentiating the score function, we obtain the Hessian matrix elements U_{rs} given in Appendix B.
The EN regression model
The classical normal linear regression model is usually applied in science and engineering to describe symmetrical data for which linear functions of unknown parameters are used to explain the phenomena under study. However, it is well-known that several phenomena are not always in agreement with the classical regression model due to lack of symmetry and/or the presence of heavy or light tails in the empirical distribution. As an alternative to overcome this shortcoming, we propose a new regression model based on the EN distribution, thus extending the normal linear regression.
Let v_{i}=(v_{i1},…,v_{ip})^{⊤} be the p×1 explanatory variable vector associated with the ith response variable x_{i} (for i=1,…,n). Let X_{i} be a response variable having the EN distribution given by (6) reparameterized as
\[ X_{i} = \mathbf{v}_{i}^{\top}\boldsymbol{\beta} + \sigma\,Z_{i}, \quad i=1,\ldots,n, \tag{12} \]
where the random error Z_{i}∼EN(a,b,0,1) has the standardized EN distribution, β=(β_{1},…,β_{p})^{⊤} is the unknown vector of coefficients, σ>0 is an unknown dispersion parameter and v_{i} is the explanatory vector modeling the location parameter \(\mu _{i}=\mathbf {v}_{i}^{\top } \boldsymbol {\beta }\).
Hence, the location parameter vector μ=(μ_{1},…,μ_{n})^{⊤} of the EN regression model has the linear structure μ=Vβ, where V=[v_{1}…v_{n}]^{⊤} is a known model matrix.
The EN regression model (12) opens new possibilities for fitting many different types of data, since the EN distribution is much more flexible than the normal distribution. The most important special regressions are:

For a=1, it gives the exponentiated-normal (ExpN) regression model, which has not been explored, but it can be understood as a regression under the power normal distribution pioneered by Kundu and Gupta (2013).

For b=1, it reduces to the LTIInormal (LTIIN) regression model defined as a linear model under the LTIIN distribution.

If a=b=1, it reduces to the normal linear regression.
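For statistical inference on the EN regression model, we consider a sample (X_{1},v_{1}),…,(X_{n},v_{n}) of n independent observations. The log-likelihood function for the vector of parameters η=(a,b,σ,β^{⊤})^{⊤} of model (12) is
\[ \ell(\boldsymbol{\eta}) = n\log\!\left(\frac{a\,b}{\sigma}\right) + \sum_{i=1}^{n}\log\phi(z_{i}) + (a-1)\sum_{i=1}^{n}\log[1-\Phi(z_{i})] + (b-1)\sum_{i=1}^{n}\log\left\{1-[1-\Phi(z_{i})]^{a}\right\}, \tag{13} \]
where \(z_{i}=({x_{i}-\mathbf {v}_{i}^{\top }\boldsymbol {\beta }})/\sigma \) and x_{i} is a possible outcome of X_{i}.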
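The components of the score vector U(η) are
\[ U_{a}(\boldsymbol{\eta}) = \frac{n}{a} + \sum_{i=1}^{n}\log[1-\Phi(z_{i})] - (b-1)\sum_{i=1}^{n}\frac{[1-\Phi(z_{i})]^{a}\log[1-\Phi(z_{i})]}{1-[1-\Phi(z_{i})]^{a}}, \]
\[ U_{b}(\boldsymbol{\eta}) = \frac{n}{b} + \sum_{i=1}^{n}\log\left\{1-[1-\Phi(z_{i})]^{a}\right\}, \]
\[ U_{\sigma}(\boldsymbol{\eta}) = -\frac{n}{\sigma} + \frac{1}{\sigma}\sum_{i=1}^{n} z_{i}^{2} + \frac{a-1}{\sigma}\sum_{i=1}^{n}\frac{z_{i}\,\phi(z_{i})}{1-\Phi(z_{i})} - \frac{a\,(b-1)}{\sigma}\sum_{i=1}^{n}\frac{z_{i}\,\phi(z_{i})\,[1-\Phi(z_{i})]^{a-1}}{1-[1-\Phi(z_{i})]^{a}} \]
and
\[ U_{\beta_{j}}(\boldsymbol{\eta}) = \frac{1}{\sigma}\sum_{i=1}^{n} v_{ij}\,z_{i} + \frac{a-1}{\sigma}\sum_{i=1}^{n}\frac{v_{ij}\,\phi(z_{i})}{1-\Phi(z_{i})} - \frac{a\,(b-1)}{\sigma}\sum_{i=1}^{n}\frac{v_{ij}\,\phi(z_{i})\,[1-\Phi(z_{i})]^{a-1}}{1-[1-\Phi(z_{i})]^{a}}, \]
where j=1,…,p.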
Note that a closed-form expression for the MLE \(\widehat {\boldsymbol {\eta }}\) is analytically intractable and, therefore, its computation has to be performed numerically by means of a nonlinear optimization algorithm.
We can maximize the log-likelihood function (13) with a Newton-type method. In particular, we use the MaxBFGS function (a BFGS quasi-Newton routine) of the matrix programming language Ox (see Doornik 2007) to calculate \(\widehat {{\boldsymbol {\eta }}}\). Initial values for β and σ can be taken from the fit of the classical regression model (a=b=1).
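The starting-value strategy can be sketched in a few lines. The code below is not the Ox implementation used in the paper; it is a Python illustration that assumes the log-likelihood ℓ(η) = Σ {log(ab/σ) + log φ(z_i) + (a−1) log[1−Φ(z_i)] + (b−1) log(1 − [1−Φ(z_i)]^a)} and an intercept plus a single covariate. The data are toy values invented for this example, and all function names are hypothetical.

```python
from math import erfc, expm1, log, log1p, pi, sqrt


def log_sf(z):
    """log[1 - Phi(z)] for the standard normal, computed stably in both tails."""
    if z < 0.0:
        return log1p(-0.5 * erfc(-z / sqrt(2.0)))
    return log(0.5 * erfc(z / sqrt(2.0)))


def en_reg_loglik(a, b, sigma, beta, xs, V):
    """EN regression log-likelihood; V holds one covariate row per response."""
    total = 0.0
    for x, v in zip(xs, V):
        z = (x - sum(vj * bj for vj, bj in zip(v, beta))) / sigma
        ls = log_sf(z)
        total += (log(a * b / sigma) - 0.5 * log(2.0 * pi) - 0.5 * z * z
                  + (a - 1.0) * ls + (b - 1.0) * log(-expm1(a * ls)))
    return total


def ols_start(xs, V):
    """Closed-form least squares for an intercept plus one covariate."""
    n = len(xs)
    v1 = [row[1] for row in V]
    vbar, xbar = sum(v1) / n, sum(xs) / n
    slope = (sum((v - vbar) * (x - xbar) for v, x in zip(v1, xs))
             / sum((v - vbar) ** 2 for v in v1))
    beta = [xbar - slope * vbar, slope]
    rss = sum((x - beta[0] - beta[1] * v) ** 2 for x, v in zip(xs, v1))
    return beta, sqrt(rss / n)


# Toy responses and a binary covariate, invented purely for illustration.
xs = [12.1, 9.8, 11.4, 10.3, 15.2, 16.9, 14.1, 17.5]
V = [[1.0, 0.0]] * 4 + [[1.0, 1.0]] * 4
beta0, sigma0 = ols_start(xs, V)
start = en_reg_loglik(1.0, 1.0, sigma0, beta0, xs, V)
print(beta0, sigma0, start)  # starting point for a numerical optimizer
```

At a=b=1 the function reduces to the Gaussian log-likelihood, so the least-squares fit is a natural point from which a quasi-Newton routine can start.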
Under general regularity conditions, the asymptotic distribution of \((\widehat {\boldsymbol {\eta }}-{\boldsymbol {\eta }})\) is multivariate normal N_{p+3}(0,K(η)^{−1}), where K(η) is the expected information matrix. These conditions can be found in Cox and Hinkley’s Theoretical Statistics book (1974). The asymptotic covariance matrix K(η)^{−1} of \(\widehat {{\boldsymbol {\eta }}}\) can be approximated by the inverse of the (p+3)×(p+3) observed information matrix J(η), and inference on the parameter vector η can then be based on the normal approximation N_{p+3}(0,J(η)^{−1}) for \((\widehat {\boldsymbol {\eta }}-{\boldsymbol {\eta }})\).
Besides estimation of the model parameters, hypothesis tests can be considered using likelihood ratio (LR) statistics.
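For the nested comparisons mentioned above, the LR statistic is w = 2(ℓ_full − ℓ_restricted), referred to a chi-square distribution whose degrees of freedom equal the number of restrictions: two for the normal model (a=b=1) and one for the ExpN (a=1) or LTIIN (b=1) models. The sketch below uses the closed-form chi-square tail probabilities available for one and two degrees of freedom; the function names are hypothetical.

```python
from math import erfc, exp, sqrt


def lr_statistic(loglik_full, loglik_restricted):
    """Likelihood ratio statistic for nested models."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_sf(w, df):
    """Upper-tail chi-square probability for df = 1 or df = 2."""
    if w < 0.0:
        raise ValueError("the LR statistic must be nonnegative")
    if df == 1:
        return erfc(sqrt(w / 2.0))
    if df == 2:
        return exp(-w / 2.0)
    raise ValueError("only df = 1 and df = 2 are covered by this sketch")


# Sanity check against the usual 5% critical values 3.8415 and 5.9915.
print(chi2_sf(3.841458820694124, 1), chi2_sf(5.991464547107979, 2))
```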
Numerical results
Three studies are presented in this section. First, we perform a Monte Carlo simulation study. Subsequently, two applications to real data show the potential uses of the new distribution. Third, the usefulness of the proposed regression model in Section 6 is proved empirically based on quality of life data.
7.1 Simulation study
Here, we provide a Monte Carlo simulation study in order to quantify the effectiveness of the EN distribution based on the symmetrized Kullback-Leibler divergence as a goodness-of-fit comparison criterion.
Initially, we provide a brief discussion on the Kullback-Leibler divergence. According to Cover and Thomas (1991), this measure quantifies the error incurred by assuming that the Y model is true when the data follow the X distribution. For example, it has been proposed as an essential part of test statistics and widely applied in synthetic aperture radar image processing from both univariate (Nascimento et al. 2010) and polarimetric (or multivariate) (Frery et al. 2014) perspectives.
In order to work with measures which satisfy nonnegativity, symmetry and definiteness properties, Nascimento et al. (2010) considered the measure d_{KL}, namely
\[ d_{KL}(X,Y) = \frac{1}{2}\left[D_{KL}(X\|Y) + D_{KL}(Y\|X)\right] = \frac{1}{2}\int_{-\infty}^{\infty} \left[f_{X}(x) - f_{Y}(x)\right] \log\frac{f_{X}(x)}{f_{Y}(x)}\,\mathrm{d}x, \]
where f_{X} and f_{Y} are the densities of X and Y, respectively.
Figure 3 displays both functions IntegrandKL(x,y) and d_{KL}(X,Y) at the parametric point [a,b,μ,σ]=[a,b,0,1] when a,b=4,5,6. This measure can be understood as a distance between two points, θ_{1}=(a_{1},b_{1},μ_{1},σ_{1}) and θ_{2}=(a_{2},b_{2},μ_{2},σ_{2}), in the parametric space, say d_{KL}(θ_{1},θ_{2}).
For increasing values of ε, the IntegrandKL (X,Y) has different forms. Further, IntegrandKL (X,Y)→0 when ε→0.
Figure 3b and c reveal the influence of a and b, respectively, when we employ a perturbation in each parameter under (μ,σ)=(0,1). As expected, when the value of ε increases, the distance d _{KL} also increases in both cases. However, this distance is most evident when we take smaller negative values of ε.
Table 1 gives the asymptotic performance of the maximum likelihood procedure discussed in the previous section with respect to the Kullback-Leibler distance, where we identify critical scenarios in the parametric space that can make maximum likelihood estimation harder. The results support the following fact: “when we wish to estimate one additional parameter (a or b) given that the MLE for the other parameter is known and higher than one, then the biases of the estimates tend to increase for high values of the parameter of interest.” In particular, for the MLE of b given \(\widehat {a}\), this finding is strongly justified by Eq. (11). Based on this equation, when \(\widehat {a}\) takes high values, the MLE of b collapses to an algebraic indeterminate form.
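The symmetrized divergence used in this subsection has no closed form for the EN model, but it can be approximated by quadrature. The sketch below assumes d_KL = ½ ∫ (f_1 − f_2) log(f_1/f_2) dx, works with log-densities to avoid underflow in the tails, and applies a composite Simpson rule on a truncated range. The integration limits, the grid size and the function names are assumptions of this example.

```python
from math import erfc, exp, expm1, log, log1p, pi, sqrt


def en_logpdf(x, a, b, mu, sigma):
    """Log of the EN density, stable for moderately large |z|."""
    z = (x - mu) / sigma
    # log[1 - Phi(z)], evaluated with the accurate tail on each side
    if z < 0.0:
        log_sf = log1p(-0.5 * erfc(-z / sqrt(2.0)))
    else:
        log_sf = log(0.5 * erfc(z / sqrt(2.0)))
    return (log(a * b / sigma) - 0.5 * log(2.0 * pi) - 0.5 * z * z
            + (a - 1.0) * log_sf + (b - 1.0) * log(-expm1(a * log_sf)))


def d_kl(theta1, theta2, lo=-15.0, hi=15.0, n=6000):
    """Symmetrized Kullback-Leibler distance between two EN laws.

    theta = (a, b, mu, sigma). The limits lo and hi should keep |z| below
    roughly 35, otherwise erfc underflows and the logarithm fails.
    """
    def integrand(x):
        l1, l2 = en_logpdf(x, *theta1), en_logpdf(x, *theta2)
        return 0.5 * (exp(l1) - exp(l2)) * (l1 - l2)

    h = (hi - lo) / n  # composite Simpson rule, n must be even
    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return total * h / 3.0


# Two unit-variance normals one unit apart: the exact value is 1/2.
print(d_kl((1.0, 1.0, 0.0, 1.0), (1.0, 1.0, 1.0, 1.0)))
print(d_kl((4.0, 5.0, 0.0, 1.0), (4.5, 5.0, 0.0, 1.0)))
```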
7.2 Two applications to real data
Here, we perform two applications to real data sets. First, we consider the data on the strengths of glass fibres analyzed by Jones and Faddy (2004). These data were obtained at the National Physical Laboratory (UK) to explain the breaking strength of sixty-three glass fibres of length 1.5 cm.
As a second application, we consider the fatigue life data (Meeker and Escobar 1998) for sixty-seven specimens of Alloy T7987 that failed before having accumulated three hundred thousand cycles of testing. The data set was rounded to the nearest thousand cycles.
We show empirically the efficiency of the EN distribution versus the normal, skew-normal (SN) (Azzalini 1984) and beta normal (BN) (Eugene et al. 2002) distributions.
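The SN density [T∼SN(a,μ,σ)] has the form (for \(x,\,a,\,\mu \in \mathbb {R}\) and σ>0)
\[ f(x) = \frac{2}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\Phi\!\left(a\,\frac{x-\mu}{\sigma}\right) \]
and the BN density [T∼BN(α,β,μ,σ)] is (for \(x,\,\mu \in \mathbb {R}\) and α,β,σ>0)
\[ f(x) = J\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\Phi\!\left(\frac{x-\mu}{\sigma}\right)^{\alpha-1}\left[1-\Phi\!\left(\frac{x-\mu}{\sigma}\right)\right]^{\beta-1}, \]
where J=Γ(α+β)/[Γ(α) Γ(β) σ].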
We compare the distributions using three goodness-of-fit (GoF) measures: Anderson-Darling (A^{∗}), Cramér-von Mises (W^{∗}) and Kolmogorov-Smirnov (KS) statistics. We adopt the goodness.fit function in R with the BFGS method. According to the detailed discussion in Quang (1989), these measures are more appropriate than the Akaike information criterion (AIC) and Bayesian information criterion (BIC) or some of their variations, which are more useful for nested models. Table 2 gives the GoF measures for each fitted distribution with respect to both data sets.
The GoF measures for the EN distribution correspond to the lowest values among the discrimination criteria (highlighted in Table 2). These results provide evidence that the EN distribution is the most suitable model (among those considered) to describe both data sets.
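To make the comparison criteria concrete, the sketch below computes the plain Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling statistics from fitted cdf values u_i = F(x_i; θ̂). These are the textbook versions; the modified statistics A^{∗} and W^{∗} reported by the R routine may apply additional corrections that are not reproduced here. The function names are hypothetical.

```python
from math import log


def ks_statistic(u):
    """Kolmogorov-Smirnov distance between fitted and empirical cdfs."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - ui, ui - i / n) for i, ui in enumerate(u))


def cramer_von_mises(u):
    """Cramer-von Mises statistic W^2."""
    u = sorted(u)
    n = len(u)
    return 1.0 / (12.0 * n) + sum(
        (ui - (2 * i + 1) / (2.0 * n)) ** 2 for i, ui in enumerate(u))


def anderson_darling(u):
    """Anderson-Darling statistic A^2 (all u must lie strictly in (0, 1))."""
    u = sorted(u)
    n = len(u)
    s = sum((2 * i + 1) * (log(u[i]) + log(1.0 - u[n - 1 - i])) for i in range(n))
    return -n - s / n


# Fitted cdf values placed on an evenly spaced grid, purely as an illustration.
u = [(2 * i - 1) / 20.0 for i in range(1, 11)]
print(ks_statistic(u), cramer_von_mises(u), anderson_darling(u))
```

Smaller values of each statistic indicate closer agreement between the fitted and the empirical cdfs.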
7.3 Application for regression models
We assess changes in the oral health-related quality of life (OHRQL) of schoolchildren. To that end, a three-year follow-up exam was conducted to evaluate the impact of caries incidence on the OHRQL of adolescents. The data were obtained from a study (for more details, see Paula et al. 2012) developed by the Department of Community Dentistry, Division of Health Education and Health Promotion, Piracicaba Dental School, University of Campinas-UNICAMP.
The variables employed are (for i=1,…,291):

x_{i}: overall score of the OHRQL at time of follow up;

v_{i1}: number of teeth decayed, missing and filled (TDMF)
(0=without TDMF increment; 1=with TDMF increment).
We analyze these data based on the EN regression model
\[ x_{i} = \beta_{0} + \beta_{1}\,v_{i1} + \sigma\,z_{i}, \quad i=1,\ldots,291, \]
where the errors Z_{i}’s are independent random variables having the EN (a,b,0,1) distribution.
The gamma-normal (GN) (Lima et al. 2015) distribution extends the normal distribution and can be used to fit data that come from a distribution with heavy tails, reducing the influence of aberrant observations. The GN density with location parameter \(\mu \in \mathbb {R}\), dispersion parameter σ>0 and shape parameter a>0 takes the form
Further, the EN regression model is compared with the ExpN, LTIIN, normal and GN regression models. Table 3 provides the MLEs of the parameters for the EN regression and these models.
Iterative maximization of the log-likelihood function (13) starts with initial values for β and σ taken from the fit of the classical regression model (a=b=1). In general, all fitted regression models reveal that v_{1} is significant at the 1% level of significance and that there is a significant difference between the levels of the numbers of teeth decayed, missing and filled. As expected, we find reciprocal relations between \(\mu _{i}=\mathbb {E}(X_{i})\) and v_{i1} in the EN, LTIIN, GN and normal regression models, except for the ExpN regression (which, although well adjusted, does not seem to be a coherent model). On the other hand, based on the estimates of σ, the EN regression model reveals advantages in relation to the other models.
The values of the AIC, Consistent Akaike Information Criterion (CAIC) and BIC to compare the fitted models are given in Table 4.
It is clear that the EN regression model outperforms the other regressions irrespective of the criteria and then we can conclude that the new regression model can be used effectively in the analysis of the current data set. A comparison of the proposed regression model with some of its submodels using LR statistics is addressed in Table 5.
The figures in this table, especially the p-values, indicate that the EN regression model yields a better fit to these data than the other submodels.
A graphical comparison among the fitted regression models is reported in Figure 4. The plots of these curves are the empirical cdf and the estimated cdf. Based on these plots, it is evident that the EN regression model provides a superior fit to the current data.
Conclusions
Flexible statistical distributions have been sought for describing data from practical situations in which the use of classical ones is not recommended. In this paper, we propose an extension of the normal distribution based on the exponentiated generalized family defined by Cordeiro et al. (2013), which adds two extra shape parameters to a baseline distribution. We provide some structural properties of the new extended normal (EN) distribution. The model parameters are estimated by maximum likelihood. The efficiency of this distribution is illustrated by means of two applications to real data sets. There is clear evidence that the EN distribution outperforms the skew-normal distribution and can be a competitive alternative to the beta normal distribution. The classical regression model does not produce good results in many real problems, and for this reason several extensions have arisen in recent years. We propose a new regression model based on the EN distribution and demonstrate its importance in real applications. This new regression model opens a wide range of research topics following the basic inference concepts of the normal linear regression model.
Appendix A: Proof of Theorem 1
We consider the power series
\[ (1-z)^{b} = \sum_{k=0}^{\infty} (-1)^{k}\,{b \choose k}\,z^{k}, \]
which holds for any real non-integer b and |z|<1. Using this generalized binomial expansion twice in Eq. (1), we can write the EGG cumulative distribution as
\[ F(x) = \sum_{j=0}^{\infty} w_{j+1}\,H_{j+1}(x), \]
where \(w_{j+1}= \sum _{m=1}^{\infty } (-1)^{j+m+1}\,{b \choose m}\, {m\,a \choose j+1}\) and H_{j+1}(x) is the ExpG cdf with power parameter j+1. By differentiating the last equation, we obtain (7).
Appendix B: The Hessian matrix
The elements of the Hessian matrix are:
Availability of data and materials
Interested readers may contact the authors.
Notes
 1.
Exton H. Handbook of hypergeometric integrals: theory, applications, tables, computer programs, 1978
References
Azzalini, A: A class of distributions which includes the normal ones. Scand. J. Stat. 12, 171–178 (1984).
Cintra, RJ, Cordeiro, GM, Nascimento, ADC: Beta generalized normal distribution with an application for SAR. Image Process. 48, 1–16 (2013).
Cordeiro, GM, Cunha, DCC, Ortega, EMM: The exponentiated generalized class of distributions. J. Data Sci. 11, 777–803 (2013).
Cordeiro, GM, Ortega, EMM, Silva, GO: The exponentiated generalized gamma distribution with application to lifetime data. J. Stat. Comput. Simul. 81, 827–842 (2011).
Cover, TM, Thomas, JA: Elements of Information Theory. Wiley-Interscience, New York (1991).
Doornik, JA: An Object-Oriented Matrix Language Ox 5. Timberlake Consultants Press, London (2007).
Eugene, N, Lee, C, Famoye, F: Beta-normal distribution and its applications. Commun. Stat. Theory Methods. 31, 497–512 (2002).
Frery, AC, Nascimento, ADC, Cintra, RJ: Analytic Expressions for Stochastic Distances Between Relaxed Complex Wishart Distributions. IEEE Trans. Geosci. Remote Sens. 52, 1213–1226 (2014).
Jones, M, Faddy, MJ: A skew extension of the t-distribution, with applications. Biom. J. 65, 159–174 (2004).
Lee, C, Famoye, F, Alzaatreh, AY: Methods for generating families of univariate continuous distributions in the recent decades. Wiley Interdiscip. Rev. Comput. Stat. 5, 219–238 (2013).
Lehmann, EL: The power of rank tests. Ann. Math. Statist. 24, 23–43 (1953).
Lima, MCS, Cordeiro, GM, Ortega, EMM: A new extension of the normal distribution. J. Data Sci. 3, 385–408 (2015).
Meeker, WQ, Escobar, L: Statistical Methods for Reliability Data. Wiley, New York (1998).
Mudholkar, GS, Srivastava, DK: Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Trans. Reliab. 42, 299–302 (1993).
Nadarajah, S: Explicit expressions for moments of order statistics. Statistics and Probability Letters. 78, 196–205 (2008).
Nascimento, ADC, Cintra, RJ, Frery, AC: Hypothesis Testing in Speckled Data with Stochastic Distances. IEEE Trans. Geosci. Remote Sens. 48, 373–385 (2010).
Paula, JS, Oliveira, M, Soares, MSP, Chaves, MGAM, Mialhe, FL: Perfil Epidemiológico dos Pacientes Atendidos no Pronto Atendimento da Faculdade de Odontologia da Universidade Federal de Juiz de Fora. Arquivos em Odontologia (UFMG). 48, 257–262 (2012).
Quang, HV: Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses. Econometrica. 57, 307–333 (1989).
Rêgo, LC, Cintra, RJ, Cordeiro, GM: On some properties of the beta normal distribution. Commun. Stat.  Theory Methods. 41, 3722–3738 (2012).
Acknowledgements
The authors would like to thank the financial support of CNPq and FACEPE, Brazil.
Funding
Not applicable.
Contributions
The authors, viz MCSL, GMC, EMMO and ADCN with the consultation of each other carried out this work and drafted the manuscript together. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Maria C.S. Lima.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 KullbackLeibler divergence criterion
 Maximum likelihood procedures
 Monte Carlo simulation
 Normal distribution
 Regression
AMS Subject Classification
 Primary 60E05
 secondary 62N05
 62F10