A new extended normal regression model: simulations and applications

Lima, Maria C.S.; Cordeiro, Gauss M.; Ortega, Edwin M.M.; Nascimento, Abraão D.C.

doi:10.1186/s40488-019-0098-y

Review
Open access
Published: 08 June 2019

Maria C.S. Lima ORCID: orcid.org/0000-0002-5480-3103¹^na1,
Gauss M. Cordeiro¹^na1,
Edwin M.M. Ortega²^na1 &
…
Abraão D.C. Nascimento³^na1

Journal of Statistical Distributions and Applications volume 6, Article number: 7 (2019)

Abstract

Various applications in natural science require models more accurate than well-known distributions. In this context, several generators of distributions have been recently proposed. We introduce a new four-parameter extended normal (EN) distribution, which can provide better fits than the skew-normal and beta normal distributions, as shown empirically in two applications to real data. We present Monte Carlo simulations to investigate the effectiveness of the EN distribution using the Kullback-Leibler divergence criterion. The classical regression model is not recommended for most practical applications because it oversimplifies real-world problems. We propose an EN regression model and show its usefulness in practice by comparing it with other regression models. We adopt the maximum likelihood method for estimating the parameters of both the proposed distribution and the regression model.

Introduction

In recent years, several methods for generating new models from classic distributions have been proposed. A detailed study about “the evolution of methods for generalizing classic distributions” was made by Lee et al. (2013). A generalization of the standard normal distribution is sought because it can provide more accurate statistical models and inferential procedures. For instance, the beta normal distribution was pioneered by Eugene et al. (2002), who discussed some of its structural properties.

Additionally, the beta generalized normal (BGN) distribution was proposed by Cintra et al. (2013) to extend the beta normal distribution. They applied the BGN model to synthetic aperture radar image processing. This paper presents a new extended normal (EN) distribution based on the family introduced by Cordeiro et al. (2013).

For any continuous cumulative distribution function (cdf) G(x), Cordeiro et al. (2013) defined the cdf of the exponentiated generalized (EG) family by

$$\begin{array}{@{}rcl@{}} F(x) = \left[1- \left\{1 - G(x)\right\}^{a}\right]^{b}, \end{array} $$

(1)

where a>0 and b>0 are two additional shape parameters whose role is to generate distributions with heavier/lighter tails and provide wider ranges for skewness and kurtosis. These parameters make the resulting distribution more flexible than the baseline.

Because of its tractable cdf (1), the EG family can be used quite effectively even if the data are censored. This family can yield univariate models on any type of support. Further, it allows for greater flexibility of the tails and can be widely applied in areas such as engineering and biology.

Its probability density function (pdf) has a very simple form

$$ f(x) =a\,b\,\left\{1 - G(x)\right\}^{a-1}\,\left[1-\left\{1- G(x)\right\}^{a}\right]^{b -1}\,g(x). $$

(2)

An important advantage of the density (2) is its ability to fit skewed data that often cannot be fitted well by existing distributions. Based on the cdf G(x) and pdf g(x) of any baseline G distribution, we can associate the EG-G pdf (2) with two extra parameters. Since the baseline is a special case of the EG-G model, the EG family can be used for discriminating between the G and EG-G distributions.

The baseline distribution G(x) is a special case of (2) when a=b=1. For a=1, it gives the exponentiated-G (“Exp-G”) class. If b=1, we obtain the Lehmann type II-G (LTII-G) class. Eq. (2) generalizes both Lehmann types I and II alternative classes (Lehmann 1953). In fact, this equation can be defined as the exponentiated generator applied to the LTII-G class.

Note that even if g(x) is a symmetric density, the density f(x) will in general not be symmetric. The cdf (1) is convenient for simulations, since its quantile function (qf) has a simple form (for 0<u<1)

$$x\,=\,Q_{G}\left\{\,\left[\,1\,-\,\left(1\,-\,u^{\frac 1b}\right)^{\frac 1a}\,\right]\,\right\}, $$

where Q_G(u) is the baseline qf.
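To illustrate how Eqs. (1) and (2) and the qf above can be used for simulation, the following minimal Python sketch implements them for an arbitrary baseline. The function names, the seed and the unit exponential baseline are assumptions made only for this illustration; they are not part of the paper.

```python
import math
import random

def eg_cdf(x, a, b, G):
    """EG cdf of Eq. (1): F(x) = [1 - {1 - G(x)}^a]^b."""
    return (1.0 - (1.0 - G(x)) ** a) ** b

def eg_pdf(x, a, b, G, g):
    """EG pdf of Eq. (2)."""
    s = 1.0 - G(x)
    return a * b * s ** (a - 1.0) * (1.0 - s ** a) ** (b - 1.0) * g(x)

def eg_quantile(u, a, b, Q):
    """EG qf: baseline qf Q evaluated at 1 - (1 - u^(1/b))^(1/a)."""
    return Q(1.0 - (1.0 - u ** (1.0 / b)) ** (1.0 / a))

def eg_sample(n, a, b, Q, seed=0):
    """Inverse-transform sampling: plug uniform draws into the EG qf."""
    rng = random.Random(seed)
    return [eg_quantile(rng.random(), a, b, Q) for _ in range(n)]

# Hypothetical baseline used only for this example: unit exponential.
def G(x):
    return 1.0 - math.exp(-x)

def g(x):
    return math.exp(-x)

def Q(p):
    return -math.log(1.0 - p)

draws = eg_sample(5, a=2.0, b=3.0, Q=Q)
```

Because the qf is available in closed form, no root finding is needed: each draw costs one evaluation of the baseline qf.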

This paper is outlined as follows. In Section 2, we define the EN distribution and provide plots of its density function. A linear representation for the EN density function is derived in Section 3. We obtain an explicit expression for its moments in Section 4. In Section 5, we provide the maximum likelihood estimates (MLEs) of the parameters. In Section 6, we define the EN regression model and discuss the estimation of the model parameters. In Section 7, we perform some simulations and present three applications to real data sets. Finally, some concluding remarks are addressed in Section 8.

The EN distribution

Due to the analytical tractability of its pdf and its importance in asymptotic theory (such as the central limit theorem and the delta method), the normal distribution is the most popular model in applications to real data with support in $ \mathbb {R}$.

When the number of observations is large, it can serve as an approximate distribution for several other models. The normal N (μ,σ) pdf (for $x \in \mathbb {R}$) is

$$\begin{array}{@{}rcl@{}} g(x;\mu,\sigma)\,=\,\frac{1}{\sqrt{2\,\pi}\,\sigma}\,\mathrm{e}^{-\frac 12\left(\frac{x-\mu}{\sigma}\right)^{2}} \,=\,\frac{1}{\sigma}\phi\left(\frac{x-\mu}{\sigma}\right), \end{array} $$

(3)

where $\mu \in \mathbb {R}$ is a mean parameter, σ>0 is a scale parameter and $\phi (x)\,=\,(2\pi)^{-1/2}\,\mathrm {e}^{-x^{2}/2}$ is the standard normal pdf.

Its cdf has the form

$$\begin{array}{@{}rcl@{}} G(x;\mu,\sigma)\,=\,\int_{-\infty}^{x}\,g(t;\mu,\sigma)\,\mathrm{d}t\,=\,\Phi\left(\frac{x-\mu}{\sigma}\right), \end{array} $$

(4)

where $\Phi (x)\,=\,\int _{-\infty }^{x}\,\phi (t)\,\mathrm {d}t$ is the standard normal cdf.

By inserting (3) and (4) in Eqs. (1) and (2), the cdf and pdf of the EN distribution (for $x \in \mathbb {R}$) can be expressed, respectively, as

$$\begin{array}{@{}rcl@{}} F(x)\,=\,\left[\,1\,-\, \left\{1 - \Phi\left(\frac{x-\mu}{\sigma}\right)\right\}^{a}\,\right]^{b} \end{array} $$

(5)

and

$$\begin{array}{@{}rcl@{}} f(x)&=&\frac{a\,b}{\sigma}\,\left\{1 - \Phi\left(\frac{x-\mu}{\sigma}\right)\right\}^{a-1}\,\left[1-\left\{1- \Phi\left(\frac{x-\mu}{\sigma}\right)\right\}^{a}\right]^{b -1}\\ &&\times\phi\left(\frac{x-\mu}{\sigma}\right). \end{array} $$

(6)
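A minimal Python sketch of Eqs. (5) and (6) may help to check them numerically. It uses only the standard library; the helper names, the Simpson rule and the parameter values are assumptions for illustration. The survival function 1 − Φ(z) is computed through `math.erfc` so that the upper tail keeps its precision.

```python
import math

SQRT2 = math.sqrt(2.0)

def std_norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_norm_sf(z):
    """1 - Phi(z), via erfc so the upper tail keeps its precision."""
    return 0.5 * math.erfc(z / SQRT2)

def en_cdf(x, a, b, mu=0.0, sigma=1.0):
    """EN cdf of Eq. (5)."""
    surv = std_norm_sf((x - mu) / sigma)
    return (1.0 - surv ** a) ** b

def en_pdf(x, a, b, mu=0.0, sigma=1.0):
    """EN pdf of Eq. (6); naive form, adequate away from extreme tails."""
    z = (x - mu) / sigma
    surv = std_norm_sf(z)
    return ((a * b / sigma) * surv ** (a - 1.0)
            * (1.0 - surv ** a) ** (b - 1.0) * std_norm_pdf(z))

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

# Sanity check with arbitrary (hypothetical) values a=2, b=3, mu=1, sigma=0.5:
# the density should integrate to one over mu +/- 10 sigma.
area = simpson(lambda x: en_pdf(x, 2.0, 3.0, 1.0, 0.5), -4.0, 6.0)
```

With a=b=1 both functions reduce to the normal cdf and pdf, which gives a quick consistency check.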

Hereafter, a random variable X having density (6) is denoted by X∼EN(a,b,μ,σ). Evidently, this density does not involve any complicated function, and the normal distribution arises as the special case a=b=1, which is an attractive feature of the current generalization. Moreover, the qf of X is

$$\begin{array}{@{}rcl@{}} Q_{\text{EN}}(p) \,=\, \mu\,+\,\sigma\,\Phi^{-1}\left(1\,-\,\left[1\,-\,p^{\frac{1}{b}}\right]^{\frac{1}{a}}\right). \end{array} $$

The mean of X follows from (6) by substituting z=(x−μ)/σ in $\int x\,f(x)\,\mathrm{d}x$:

$$m=\,E(X)\,=\,\sigma\,a\,b\,\mathcal{I}_{a,b}\,+\,\mu, $$

where

$$\mathcal{I}_{a,b}\,=\, \int_{-\infty}^{\infty} z\,\phi(z)\,[1\,-\,\Phi(z)]^{a-1}\,\left\{1\,-\,[1\,-\,\Phi(z)]^{a}\right\}^{b-1}\,\mathrm{d}z. $$

In the next sections, other moment results are derived. Moreover, from the qf of the EN distribution given above, the associated median, say M, is

$$M=\,Q_{\text{EN}}(1/2)\,=\,\sigma\,z_{a,b}\,+\,\mu, $$

where $z_{a,b}=\Phi^{-1}\left(1-[1-2^{-1/b}]^{1/a}\right)$ is the standard normal quantile at $1-[1-2^{-1/b}]^{1/a}$. Thus, comparing the median M with the mean m suggests the following rule of thumb for the asymmetry of the EN distribution:

$$\left\{ \begin{array}{l} \text{right asymmetry}, \quad \text{if }z_{a,b} < a\,b\,\mathcal{I}_{a,b}\ \text{(median below the mean)}\\ \text{symmetry}, \quad \text{if }z_{a,b} = a\,b\,\mathcal{I}_{a,b}\\ \text{left asymmetry}, \quad \text{if }z_{a,b} > a\,b\,\mathcal{I}_{a,b}\ \text{(median above the mean)}. \end{array} \right. $$
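The two quantities entering this comparison can be evaluated numerically. The sketch below, which assumes a simple Simpson rule on [−12, 12] and uses hypothetical function names, computes the integral $\mathcal{I}_{a,b}$ and the quantile $z_{a,b}$, and returns the mean and median of EN(a,b,μ,σ).

```python
import math
from statistics import NormalDist

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def sf(z):
    """1 - Phi(z), computed with erfc for tail accuracy."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def simpson(f, lo, hi, n=4000):
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def I_ab(a, b):
    """Numerical value of the integral I_{a,b}."""
    def integrand(z):
        s = sf(z)
        inner = 1.0 - s ** a
        if inner <= 0.0:  # far left tail, where the integrand is negligible
            return 0.0
        return z * phi(z) * s ** (a - 1.0) * inner ** (b - 1.0)
    return simpson(integrand, -12.0, 12.0)

def z_ab(a, b):
    """Standard normal quantile at 1 - [1 - 2^(-1/b)]^(1/a)."""
    p = 1.0 - (1.0 - 2.0 ** (-1.0 / b)) ** (1.0 / a)
    return NormalDist().inv_cdf(p)

def en_mean_median(a, b, mu=0.0, sigma=1.0):
    mean = mu + sigma * a * b * I_ab(a, b)
    median = mu + sigma * z_ab(a, b)
    return mean, median

mean, median = en_mean_median(1.0, 2.0)
```

For a=1 and b=2 the EN density is that of the maximum of two independent standard normal variables, whose mean is known to be 1/√π, so this case can serve as a check of the integration.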

We motivate the paper by comparing the performances of the EN, normal, skew-normal (SN) and beta-normal (BN) models fitted to two real data sets. Figure 1 displays possible shapes of the density function (6) for some parameter values. We can note the flexibility of the EN distribution with respect to the normal distribution.

Linear representation

A useful linear representation for (2) can be derived using the concept of exponentiated distributions. For an arbitrary baseline cdf G(x), a random variable T is said to have the exponentiated-G (Exp-G) distribution with power parameter a>0, say T∼Exp-G (a), if its cdf and pdf are

$$\begin{array}{@{}rcl@{}} H_{a}(x)\,=\,G^{a}(x)\,\,\,\,\text{and}\,\,\,\, h_{a}(x)\,=\,a\,g(x)\,G^{a-1}(x), \end{array} $$

respectively. Properties of exponentiated distributions have been studied by several authors, for example for the exponentiated Weibull (Mudholkar and Srivastava 1993) and exponentiated generalized gamma (Cordeiro et al. 2013) distributions.

Theorem 1

Let X∼ EN (a,b,μ,σ). The pdf of X can be written as

$$\begin{array}{@{}rcl@{}} f(x) = \sum_{j=0}^{\infty} \,w_{j+1}\,h_{j+1}(x), \end{array} $$

(7)

where the coefficients $w_{j+1}$ are given in Appendix A and $h_{j+1}(x)$ is the exponentiated-normal (Exp-N) density with power parameter j+1, say Exp-N(μ,σ,j+1), namely

$$\begin{array}{@{}rcl@{}} h_{j+1}(x)=\frac{(j+1)}{\sigma}\,\phi\left(\frac{x-\mu}{\sigma}\right)\,\Phi\left(\frac{x-\mu}{\sigma}\right)^{j}. \end{array} $$

The proof of this theorem is given in Appendix A.

It is possible to verify using symbolic software (such as Maple) that $\sum _{j=0}^{\infty } \,w_{j+1}=1$ as expected.

Equation (7) is the main result of this section. It reveals that the EN density is a linear combination of Exp-N densities. So, several mathematical properties of the proposed distribution can then be obtained from those of the Exp-N distribution using previous results given by Rêgo et al. (2012).
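The representation (7) can be checked numerically in a simple case. The sketch below assumes that a and b are positive integers, in which case the binomial series used in Appendix A terminate and the sum in (7) has only ab terms; this restriction and all function names are choices made for illustration. The coefficients follow the expression for $w_{j+1}$ given in Appendix A.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def weights(a, b):
    """Coefficients w_{j+1}, j = 0, ..., a*b - 1, for positive integers a, b."""
    return [
        sum((-1) ** (j + m + 1) * math.comb(b, m) * math.comb(m * a, j + 1)
            for m in range(1, b + 1))
        for j in range(a * b)
    ]

def en_pdf_closed(z, a, b):
    """Standardized EN density from Eq. (6) with mu = 0 and sigma = 1."""
    s = Phi(-z)  # 1 - Phi(z)
    return a * b * s ** (a - 1) * (1.0 - s ** a) ** (b - 1) * phi(z)

def en_pdf_series(z, a, b):
    """Right-hand side of Eq. (7) as a finite mixture of Exp-N densities."""
    return sum(w * (j + 1) * phi(z) * Phi(z) ** j
               for j, w in enumerate(weights(a, b)))

w = weights(2, 3)  # coefficients of [1 - (1 - G)^2]^3 expanded in powers of G
```

For a=2 and b=3 the weights are the coefficients of 8G³ − 12G⁴ + 6G⁵ − G⁶, which sum to one.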

Moments

First, we determine the probability weighted moments (PWMs) of the standard normal distribution since they are required for the ordinary moments of the EN distribution. The standard normal PWMs are defined by

$$\tau_{n,j}\,=\,\int_{-\infty}^{\infty}\,z^{n}\,\Phi(z)^{j}\,\phi(z)\,\mathrm{d}z, $$

for n≥0 and j≥0 integers.

The following identity in terms of the error function holds:

$$\Phi(z)=\frac{1}{2}\left\{1+ \text{erf}\left(\frac{z}{\sqrt{2}}\right)\right\},\quad z \in \mathbb{R}. $$

Applying the binomial expansion and interchanging terms gives

$$\tau_{n,j}=\frac{1}{2^{j}\sqrt{2\pi}}\,\sum_{m=0}^{j} {j \choose m}\,\int_{-\infty}^{\infty} z^{n}\, \text{erf}\left(\frac{z}{\sqrt 2}\right)^{j-m}\,\exp\left(-\frac{z^{2}}{2}\right)dz. $$

Based on the power series for the error function

$$\text{erf}(z)=\frac{2}{\sqrt{\pi}}\sum_{r=0}^{\infty} \frac{(-1)^{r} z^{2r+1}}{(2r+1)\,r!}, $$

we can obtain τ_n,j from Eqs. (9)–(11) given by Nadarajah (2008).

For n+j−r even, we have

$$\begin{array}{@{}rcl@{}} \tau_{n,j}&=& 2^{n/2}\,\pi^{-(j+1/2)} \sum\limits_{\overset{r=0}{(n+j-r)\,\text{even}}}^{j}\, {j \choose r}\, \left(\frac{\pi}{2}\right)^{r}\,\Gamma\left(\frac{n+j-r+1}{2}\right)\times \\ && F_{A}^{(j-r)} \left(\frac{n+j-r+1}{2};\frac{1}{2},\ldots,\frac{1}{2};\frac{3}{2},\ldots,\frac{3}{2};-1,\ldots,-1 \right), \end{array} $$

(8)

where $F_{A}^{(j-r)}(\cdot)$ is the Lauricella function of type A; see, for example, Exton (1978). If n+j−r is odd, the corresponding terms in τ_n,j vanish.

Corollary 1

Suppose that $\mu _{n}^{\prime }= E(X^{n})$ exists. Then,

$$\begin{array}{@{}rcl@{}} \mu_{n}^{\prime}=\mathrm{E}(X^{n})=\sum_{j=0}^{\infty} (j+1)\,w_{j+1}\,\tau_{n,j}, \end{array} $$

(9)

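where τ_n,j is given by (8). Note that (9) is written in terms of the standard normal PWMs and thus corresponds to μ=0 and σ=1; for general μ and σ, the factor $z^{n}$ in the integrand is replaced by $(\mu+\sigma z)^{n}$, which can be expanded binomially.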
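As a numerical cross-check that avoids the Lauricella function, the PWMs can be approximated by direct quadrature. The sketch below is an assumption-laden illustration: it uses a Simpson rule on [−12, 12], takes the weights as a user-supplied list, and treats the standardized case μ=0, σ=1.

```python
import math

def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def simpson(f, lo, hi, n=4000):
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def tau(n, j):
    """Standard normal PWM: integral of z^n Phi(z)^j phi(z) over the real line."""
    return simpson(lambda z: z ** n * Phi(z) ** j * phi(z), -12.0, 12.0)

def en_moment(n, w):
    """n-th moment of the standardized EN law from weights w = [w_1, w_2, ...]."""
    return sum((j + 1) * wj * tau(n, j) for j, wj in enumerate(w))

# Example: a = 1, b = 2 gives F = Phi^2, i.e. w_1 = 0 and w_2 = 1.
mean_max_of_two = en_moment(1, [0, 1])
```

Two known values are convenient for validation: τ_{0,j}=1/(j+1) and τ_{1,1}=1/(2√π).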

The skewness and kurtosis of X can be computed from Q_EN(p) using the well-known Bowley skewness and Moors kurtosis measures, which are based on quartiles and octiles, respectively. Figure 2 displays plots of these measures for selected values of a and b. We note that the values for the normal distribution are recovered when (a,b) tends to (1,1).
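The quantile-based measures are straightforward to evaluate from the EN qf. The following sketch uses the usual definitions of the Bowley skewness, [Q(3/4)+Q(1/4)−2Q(1/2)]/[Q(3/4)−Q(1/4)], and the Moors kurtosis, [Q(7/8)−Q(5/8)+Q(3/8)−Q(1/8)]/[Q(6/8)−Q(2/8)]; the function names and parameter values are hypothetical.

```python
from statistics import NormalDist

STD = NormalDist()

def en_quantile(p, a, b, mu=0.0, sigma=1.0):
    """EN qf: mu + sigma * Phi^{-1}(1 - [1 - p^(1/b)]^(1/a))."""
    return mu + sigma * STD.inv_cdf(1.0 - (1.0 - p ** (1.0 / b)) ** (1.0 / a))

def bowley(Q):
    """Bowley skewness based on quartiles of the qf Q."""
    return (Q(0.75) + Q(0.25) - 2.0 * Q(0.5)) / (Q(0.75) - Q(0.25))

def moors(Q):
    """Moors kurtosis based on octiles of the qf Q."""
    return (Q(7 / 8) - Q(5 / 8) + Q(3 / 8) - Q(1 / 8)) / (Q(6 / 8) - Q(2 / 8))

def en_shape(a, b):
    """Both measures are location-scale invariant, so mu=0, sigma=1 suffices."""
    Q = lambda p: en_quantile(p, a, b)
    return bowley(Q), moors(Q)

normal_case = en_shape(1.0, 1.0)   # reference values for the normal law
example = en_shape(2.0, 3.0)       # arbitrary illustrative choice of (a, b)
```

For a=b=1 the Bowley skewness is zero and the Moors kurtosis is about 1.233, the normal reference values.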

Estimation

Consider a random variable X∼EN(a, b, μ, σ) and let θ=(a, b, μ, σ)^⊤ be the model parameters, where (·)^⊤ is the transposition operator. Thus, the associated log-likelihood function for one observation x is

$$\begin{array}{@{}rcl@{}} \ell(\boldsymbol{\theta};x)&\,=\,\log(a)\,+\,\log(b)\,-\,\log(\sigma) \,+\,(a\,-\,1)\,\log\left[1\,-\,\Phi\left(\frac{x\,-\,\mu}{\sigma}\right)\right] \\ &\,+\,(b\,-\,1)\log\left\{\,1\,-\,\left[\,1\,-\,\Phi\left(\frac{x\,-\,\mu}{\sigma}\right)\,\right]^{a}\right\} \,+\,\log\left[\phi\left(\frac{x\,-\,\mu}{\sigma}\right)\right]. \end{array} $$

(10)

Given a data set x₁,…,x_n, the MLE of θ is determined by maximizing $ \ell _{n}(\boldsymbol {\theta })\,=\,\sum _{i=1}^{n}\,\ell (\boldsymbol {\theta };x_{i}). $

Based on Eq. (10), the score vector is

$$\begin{array}{@{}rcl@{}} \boldmath{U}_{\theta}&=&(\,U_{a},\,U_{b},\,U_{\mu},\,U_{\sigma}\,)^{\top}\\ &=&\left(\,\frac{\partial\,\ell_{n}(\boldsymbol{\theta})}{\partial a}, \,\,\,\frac{\partial\,\ell_{n}(\boldsymbol{\theta})}{\partial b}, \,\,\,\frac{\partial\,\ell_{n}(\boldsymbol{\theta})}{\partial \mu}, \,\,\,\frac{\partial\,\ell_{n}(\boldsymbol{\theta})}{\partial \sigma} \,\right)^{\top}, \end{array} $$

whose components are

$$\begin{array}{@{}rcl@{}} U_{a} &=& \frac na\,+\,\sum_{i=1}^{n}\,\log\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]\\ &&-\,(b\,-\,1)\,\sum_{i=1}^{n}\, \left[ \frac{\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\log\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) \right]}{1\,-\,\left[\,1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\right]^{a}}\right], \end{array} $$

$$U_{b}\,=\,\frac nb \,+\,\sum_{i=1}^{n}\,\log\left\{1\,-\,\left[\,1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\right]^{a}\right\}, $$

$$\begin{array}{@{}rcl@{}} U_{\mu}&=&\left(\frac{a\,-\,1}{\sigma}\right)\,\sum_{i=1}^{n}\,\left[\frac{\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) }{1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)}\right] \,-\,\frac{1}{\sigma}\,\sum_{i=1}^{n}\,\left[\frac{\phi^{\prime}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)}{\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)}\right]\\ &&-\frac{a\,(b-1)}{\sigma}\, \sum_{i=1}^{n}\,\left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-1} }{ 1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} }\right\}\\ \end{array} $$

and

$$\begin{array}{@{}rcl@{}} U_{\sigma}&=&-\frac n\sigma\,+\,\frac{(a-1)}{\sigma^{2}}\, \sum_{i=1}^{n}\,\left[\frac{(x_{i}\,-\,\mu)\,\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) }{1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)}\right] \,-\,\frac{1}{\sigma^{2}}\,\sum_{i=1}^{n}\,\left[\frac{(x_{i}\,-\,\mu)\,\phi^{\prime}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)} {\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)}\right]\\ &&-\frac{a\,(b-1)}{\sigma^{2}}\, \sum_{i=1}^{n}\,\left\{ \frac{ (x_{i}\,-\,\mu)\,\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-1} }{ 1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} }\right\}. \end{array} $$

An advantage of the EN distribution is that the MLE $\widehat {b}$ has a semi-closed-form expression. Suppose that the observed information matrix is non-negative definite. Setting U_b=0, the MLE of b can be expressed in terms of the MLEs $\widehat {a},\widehat {\mu }$ and $\widehat {\sigma }$ as

$$\begin{array}{@{}rcl@{}} \widehat{b}&=&\varphi(\widehat{a},\widehat{\mu},\widehat{\sigma},\left\{x_{1},\,\ldots,x_{n}\right\})\\ &=&-\left(\, n^{-1}\,\sum_{i=1}^{n}\,\log\left\{1\,-\,\left[\,1\,-\,\Phi\left(\frac{x_{i}\,-\,\widehat{\mu}}{\widehat{\sigma}}\right)\,\right]^{\widehat{a}}\right\} \, \right)^{-1}. \end{array} $$

(11)

This fact is important at least for two reasons. The estimates become the solutions of a system with three equations and three variables (say “(3,3) system”) instead of a (4,4) system. Further, Eq. (11) clarifies the relationship of $\widehat {b}$ with $\widehat {a}$, $\widehat {\mu }$ and $\widehat {\sigma }$. More details are described in the simulation section.
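The reduction from a (4,4) to a (3,3) system can be sketched in code. Solving U_b=0 for b gives b = −n / Σ log{1 − [1 − Φ(z_i)]^a}, which is positive because every logarithm in the sum is negative. The sketch below codes the log-likelihood (10) summed over the sample, this expression for b, and the resulting profile log-likelihood in (a, μ, σ). The data and parameter values are hypothetical, and no optimizer is included.

```python
import math

def norm_sf(z):
    """1 - Phi(z) via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def norm_logpdf(z):
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def log_inner(a, mu, sigma, x):
    """log{1 - [1 - Phi(z)]^a} for z = (x - mu) / sigma."""
    return math.log1p(-(norm_sf((x - mu) / sigma) ** a))

def en_loglik(a, b, mu, sigma, xs):
    """Sum over the sample of the log-likelihood contributions in Eq. (10)."""
    total = 0.0
    for x in xs:
        z = (x - mu) / sigma
        total += (math.log(a) + math.log(b) - math.log(sigma)
                  + (a - 1.0) * math.log(norm_sf(z))
                  + (b - 1.0) * log_inner(a, mu, sigma, x)
                  + norm_logpdf(z))
    return total

def b_hat(a, mu, sigma, xs):
    """Root in b of U_b = n/b + sum_i log{1 - [1 - Phi(z_i)]^a}."""
    return -len(xs) / sum(log_inner(a, mu, sigma, x) for x in xs)

def profile_loglik(a, mu, sigma, xs):
    """Log-likelihood with b replaced by its conditional maximizer."""
    return en_loglik(a, b_hat(a, mu, sigma, xs), mu, sigma, xs)

xs = [0.2, -0.5, 1.3, 0.9, -1.1, 0.4, 2.0, -0.3]  # hypothetical data
b_star = b_hat(1.5, 0.3, 1.1, xs)
```

A numerical optimizer then only needs to search over (a, μ, σ), calling `profile_loglik` as the objective.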

Additionally, in order to make inference on the model parameters, the total observed information matrix is J(θ)={−U_rs}, where U_rs=∂² ℓ(θ)/∂θ_r ∂θ_s, for r,s∈{a,b,μ,σ}. By differentiating the score function, we obtain the Hessian matrix elements U_rs given in Appendix B.

The EN regression model

The classical normal linear regression model is usually applied in science and engineering to describe symmetrical data for which linear functions of unknown parameters are used to explain the phenomena under study. However, it is well known that many phenomena do not agree with the classical regression model due to lack of symmetry and/or the presence of heavy or light tails in the empirical distribution. To overcome this shortcoming, we propose a new regression model based on the EN distribution, thus extending the normal linear regression.

Let v_i=(v_i1,…,v_ip)^⊤ be the p×1 explanatory variable vector associated with the ith response variable x_i (for i=1,…,n). Let X_i be a response variable having the EN distribution given by (6) re-parameterized as

$$ X_{i}=\mathbf{v}_{i}^{\top} \,\boldsymbol{\beta}\,+\,\sigma\, Z_{i}, $$

(12)

where the random error Z_i∼EN(a,b,0,1) has the standardized EN distribution, β=(β₁,…,β_p)^⊤ is the unknown vector of coefficients, σ>0 is an unknown dispersion parameter and v_i is the explanatory vector modeling the location parameter $\mu _{i}=\mathbf {v}_{i}^{\top } \boldsymbol {\beta }$.

Hence, the location parameter vector μ=(μ₁,…,μ_n)^⊤ of the EN regression model has the linear structure μ=Vβ, where V=[v₁|…|v_n]^⊤ is a known model matrix.

The EN regression model (12) opens new possibilities for fitting many different types of data, since the EN distribution is much more flexible than the normal distribution. The most important special regressions are:

For a=1, it gives the exponentiated-normal (Exp-N) regression model, which has not been explored, but it can be understood as a regression under the power normal distribution pioneered by Kundu and Gupta (2013).
For b=1, it reduces to the LTII-normal (LTII-N) regression model defined as a linear model under the LTII-N distribution.
If a=b=1, it reduces to the normal linear regression.

For statistical inference on the EN regression model, we consider a sample (X₁,v₁),…,(X_n,v_n) of n independent observations. The log-likelihood function for the vector of parameters η=(a,b,σ,β^⊤)^⊤ of model (12) is

$$\begin{array}{@{}rcl@{}} \ell({\boldsymbol{\eta}}) &\,=\,& n\log\left(\frac{a\,b}{\sigma}\right)\,+\, \sum_{i=1}^{n}\log[\phi(z_{i})] \,+\, (a-1)\sum_{i=1}^{n}\log[1-\Phi(z_{i})]\\ &&\,+\, (b-1)\sum_{i=1}^{n}\log\{1-[1-\Phi(z_{i})]^{a}\}, \end{array} $$

(13)

where $z_{i}=({x_{i}-\mathbf {v}_{i}^{\top }\boldsymbol {\beta }})/\sigma $ and x_i is a possible outcome of X_i.
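A direct transcription of (13) is useful for checking any analytic derivative numerically. The sketch below evaluates the log-likelihood for η=(a,b,σ,β) and approximates the score by central differences; the step size, the function names and the small data set are assumptions made for illustration only.

```python
import math

def norm_sf(z):
    """1 - Phi(z) via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def en_reg_loglik(a, b, sigma, beta, X, V):
    """Log-likelihood (13); X holds the responses, V the rows v_i."""
    total = len(X) * math.log(a * b / sigma)
    for x, v in zip(X, V):
        z = (x - sum(vj * bj for vj, bj in zip(v, beta))) / sigma
        log_surv = math.log(norm_sf(z))                      # log[1 - Phi(z_i)]
        total += (-0.5 * z * z - 0.5 * math.log(2.0 * math.pi)  # log phi(z_i)
                  + (a - 1.0) * log_surv
                  + (b - 1.0) * math.log1p(-math.exp(a * log_surv)))
    return total

def numeric_score(eta, X, V, h=1e-6):
    """Central-difference gradient with eta = [a, b, sigma, beta_1, ..., beta_p]."""
    def f(e):
        return en_reg_loglik(e[0], e[1], e[2], e[3:], X, V)
    grad = []
    for k in range(len(eta)):
        up, dn = list(eta), list(eta)
        up[k] += h
        dn[k] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad

# Hypothetical data: intercept plus one binary covariate.
V = [[1, 0], [1, 1], [1, 0], [1, 1], [1, 0], [1, 1]]
X = [2.1, 3.4, 1.8, 3.9, 2.5, 3.1]
score = numeric_score([1.0, 1.0, 0.8, 2.0, 1.5], X, V)
```

At a=b=1 the function reduces to the Gaussian regression log-likelihood, so the numerical score can be compared with the familiar normal-theory expressions before moving to general (a,b).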

The components of the score vector U(η) are

$$\begin{array}{@{}rcl@{}} \frac{\partial l({\boldsymbol{\eta}})}{\partial a}&=&\frac{n}{a}+\sum_{i=1}^{n}\log[1-\Phi(z_{i})]-(b-1)\sum_{i=1}^{n}\frac{[1-\Phi(z_{i})]^{a} \log[1-\Phi(z_{i})]}{\{1-[1-\Phi(z_{i})]^{a}\}}, \end{array} $$

$$\begin{array}{@{}rcl@{}} \frac{\partial l({\boldsymbol{\eta}})}{\partial b}&=&\frac{n}{b}+\sum_{i=1}^{n}\log\{1-[1-\Phi(z_{i})]^{a}\}, \end{array} $$

$$\begin{array}{@{}rcl@{}} \frac{\partial l({\boldsymbol{\eta}})}{\partial\sigma}&=&-\frac{n}{\sigma}+\frac{1}{\sigma}\sum_{i=1}^{n}z_{i}^{2}+\frac{(a-1)}{\sigma} \sum_{i=1}^{n}\frac{z_{i}\phi(z_{i})}{[1-\Phi(z_{i})]}\\ &-&\frac{a\,(b-1)}{\sigma}\sum_{i=1}^{n}\frac {z_{i}\phi(z_{i})[1-\Phi(z_{i})]^{a-1}}{\{1-[1-\Phi(z_{i})]^{a}\}}, \end{array} $$

$$\begin{array}{@{}rcl@{}} \frac{\partial l({\boldsymbol{\eta}})}{\partial\beta_{j}}&=&\frac{1}{\sigma} \sum_{i=1}^{n}v_{ij}z_{i}+\frac{(a-1)}{\sigma}\sum_{i=1}^{n} \frac{v_{ij}\phi(z_{i})}{[1-\Phi(z_{i})]}\\ &-&\frac{a\,(b-1)}{\sigma}\sum_{i=1}^{n}\frac{v_{ij}\phi(z_{i})[1-\Phi(z_{i})]^{a-1}} {\{1-[1-\Phi(z_{i})]^{a}\}}, \end{array} $$

where j=1,…,p.

Note that a closed-form expression for the MLE $\widehat {\boldsymbol {\eta }}$ is analytically intractable and, therefore, its computation has to be performed numerically by means of a nonlinear optimization algorithm.

We can maximize the log-likelihood function (13) by a Newton-type method. In particular, we use the quasi-Newton BFGS algorithm available in the matrix programming language Ox (MaxBFGS function) (see Doornik 2007) to calculate $\widehat {{\boldsymbol {\eta }}}$. Initial values for β and σ can be taken from the fit of the classical regression model (a=b=1).

Under general regularity conditions, the asymptotic distribution of $(\widehat {\boldsymbol {\eta }}-{\boldsymbol {\eta }})$ is multivariate normal N_p+3(0,K(η)⁻¹), where K(η) is the expected information matrix. These conditions can be found in Cox and Hinkley (1974). The asymptotic covariance matrix K(η)⁻¹ of $\widehat {{\boldsymbol {\eta }}}$ can be approximated by the inverse of the (p+3)×(p+3) observed information matrix J(η), and inference on the parameter vector η can then be based on the normal approximation N_p+3(0,J(η)⁻¹) for $(\widehat {\boldsymbol {\eta }}-{\boldsymbol {\eta }})$.

Besides estimation of the model parameters, hypotheses tests can be considered using likelihood ratio (LR) statistics.
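As a small illustration of such tests, the LR statistic for a sub-model is w = 2[ℓ(full) − ℓ(restricted)], compared with a chi-square distribution whose degrees of freedom equal the number of restrictions: one for the Exp-N (a=1) or LTII-N (b=1) regression against the EN regression, and two for the normal regression (a=b=1). The sketch below uses closed-form chi-square tail probabilities for these two cases; the log-likelihood values are hypothetical.

```python
import math

def chi2_sf(w, df):
    """Upper-tail probability of a chi-square variable (df = 1 or 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(w / 2.0))
    if df == 2:
        return math.exp(-w / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")

def lr_test(loglik_full, loglik_restricted, df):
    """Return the LR statistic and its asymptotic p-value."""
    w = max(2.0 * (loglik_full - loglik_restricted), 0.0)
    return w, chi2_sf(w, df)

# Hypothetical maximized log-likelihoods: EN regression vs normal regression.
w, p = lr_test(-512.3, -516.9, df=2)
```

Small p-values favor the EN regression over the restricted model.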

Numerical results

Three studies are presented in this section. First, we perform a Monte Carlo simulation study. Second, two applications to real data show the potential uses of the new distribution. Third, the usefulness of the regression model proposed in Section 6 is illustrated empirically based on quality of life data.

7.1 Simulation study

Here, we provide a Monte Carlo simulation study in order to quantify the effectiveness of the EN distribution based on the symmetrized Kullback-Leibler divergence as a goodness-of-fit comparison criterion.

Initially, we provide a brief discussion on the Kullback-Leibler divergence. According to Cover and Thomas (1991), this measure quantifies the error incurred by assuming that the Y model is true when the data follow the X distribution. For example, it has been used as an essential part of test statistics and widely applied in synthetic aperture radar image processing from both univariate (Nascimento et al. 2010) and polarimetric (or multivariate) (Nascimento et al. 2014) perspectives.

In order to work with measures which satisfy non-negativity, symmetry and definiteness properties, Nascimento et al. (2010) considered the measure d_KL, namely

$$\begin{array}{@{}rcl@{}} & d_{\text{KL}}(X,Y)\,=\,\frac 12\,[\,D(X||Y)\,+\,D(Y||X)\,] \\ \,&\,=\,\frac 12\,\int_{\mathcal D}\, \underbrace{ (\,f_{X}(x;[a_{x},b_{x},\mu_{x},\sigma_{x}])\,-\,f_{Y}(x;[a_{y},b_{y},\mu_{y},\sigma_{y}])\,) \,\log\left(\frac{ f_{X}(x;[a_{x},b_{x},\mu_{x},\sigma_{x}]) }{ f_{Y}(x;[a_{y},b_{y},\mu_{y},\sigma_{y}]) } \right) }_{ \equiv \,\text{IntegrandKL(x,y)} } \mathrm{d}x. \end{array} $$
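The distance d_KL between two EN laws has no closed form, but it can be approximated by quadrature. The sketch below is an illustration under stated assumptions: it evaluates ½[D(X||Y)+D(Y||X)] with a Simpson rule over a range of ±12 scale units, works with log-densities to avoid taking the logarithm of an underflowed value, and uses hypothetical function names.

```python
import math

def norm_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def en_logpdf(x, a, b, mu, sigma):
    """Log of the EN density (6), arranged to stay finite in both tails."""
    z = (x - mu) / sigma
    if z > 0.0:
        log_surv = math.log(norm_cdf(-z))                 # log[1 - Phi(z)]
        log_inner = math.log1p(-math.exp(a * log_surv))   # log{1 - [1 - Phi(z)]^a}
    else:
        log_surv = math.log1p(-norm_cdf(z))
        log_inner = math.log(-math.expm1(a * log_surv))
    return (math.log(a * b / sigma) + (a - 1.0) * log_surv
            + (b - 1.0) * log_inner
            - 0.5 * z * z - 0.5 * math.log(2.0 * math.pi))

def simpson(f, lo, hi, n=4000):
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def d_kl(theta_x, theta_y, half_width=12.0):
    """Symmetrized KL distance; each theta is a tuple (a, b, mu, sigma)."""
    lo = min(t[2] - half_width * t[3] for t in (theta_x, theta_y))
    hi = max(t[2] + half_width * t[3] for t in (theta_x, theta_y))
    def integrand(x):
        lx, ly = en_logpdf(x, *theta_x), en_logpdf(x, *theta_y)
        return (math.exp(lx) - math.exp(ly)) * (lx - ly)
    return 0.5 * simpson(integrand, lo, hi)

# With a = b = 1 both laws are normal; N(0,1) against N(1,1) gives 0.5.
check = d_kl((1.0, 1.0, 0.0, 1.0), (1.0, 1.0, 1.0, 1.0))
```

The normal special case provides a check, since for two normal laws with common variance σ² the distance equals (μ_x − μ_y)²/(2σ²).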

Figure 3 displays both functions IntegrandKL(x,y) and d_KL(X,Y) at the parametric point [a,b,μ,σ]=[a,b,0,1] when a,b=4,5,6. This measure can be understood as a distance between two points, θ₁=(a₁,b₁,μ₁,σ₁) and θ₂=(a₂,b₂,μ₂,σ₂), in the parametric space, say d_KL(θ₁,θ₂).

For increasing values of the perturbation ε, the function IntegrandKL(x,y) takes different forms. Further, IntegrandKL(x,y)→0 when ε→0.

Figure 3b and c reveal the influence of a and b, respectively, when a perturbation ε is applied to each parameter under (μ,σ)=(0,1). As expected, when the value of ε increases, the distance d_KL also increases in both cases. However, this distance is most evident when ε takes smaller negative values.

Table 1 gives the asymptotic performance of the maximum likelihood procedure discussed in the previous section with respect to the Kullback-Leibler distance, which lets us identify critical regions of the parametric space where maximum likelihood estimation is harder. The results support the following fact: “when we wish to estimate one additional parameter (a or b) given that the MLE for the other parameter is known and higher than one, then the biases of the estimates tend to increase for high values of the parameter of interest.” In particular, for the MLE of b given $\widehat {a}$, this behavior is explained by Eq. (11): when $\widehat {a}$ takes high values, $[1-\Phi (\cdot)]^{\widehat {a}}$ tends to zero, the logarithms in (11) tend to zero, and the expression for $\widehat {b}$ becomes indeterminate.

Table 1 The KL distance between fitted and theoretical densities for n=100 and different values for a and b

7.2 Two applications to real data

Here, we perform two applications to real data sets. First, we consider the data on the strengths of glass fibres analyzed by Jones and Faddy (2004). These data were obtained at the National Physical Laboratory (UK) to explain the breaking strength of sixty three glass fibres having length 1.5 cm.

As a second application, we consider the fatigue life data (Meeker and Escobar 1998) for sixty seven specimens of Alloy T7987 that failed before having accumulated three hundred thousand cycles of testing. The data set was rounded to the nearest thousand cycles.

We compare empirically the fit of the EN distribution with those of the normal, skew-normal (SN) (Azzalini 1984) and beta normal (BN) (Eugene et al. 2002) distributions.

The SN density [ T∼SN(a,μ,σ)] has the form (for $x,\,a,\,\mu \in \mathbb {R}$ and σ>0)

$$f(x;a,\mu,\sigma)\,=\,\frac 2\sigma\, \phi\left(\frac{x\,-\,\mu}{\sigma}\right)\, \Phi\left[a\,\left(\frac{x\,-\,\mu}{\sigma}\right)\right] $$

and the BN density [ T∼BN(α,β,μ,σ)] is (for $x,\,\mu \in \mathbb {R}$ and α,β,σ>0)

$$f(x;\alpha,\beta,\mu,\sigma) \,=\,J\, \phi\left(\frac{x\,-\,\mu}{\sigma}\right)\, \left[\Phi\left(\frac{x\,-\,\mu}{\sigma}\right)\right]^{\alpha-1}\, \left[1\,-\,\Phi\left(\frac{x\,-\,\mu}{\sigma}\right)\right]^{\beta-1}, $$

where J=Γ(α+β)/[Γ(α) Γ(β) σ].

We compare the distributions using three goodness-of-fit (GoF) measures: the Anderson-Darling (A ^∗), Cramér-von Mises (W ^∗) and Kolmogorov-Smirnov (KS) statistics. We compute them with the goodness.fit function in R, using the BFGS method for the optimization. According to the detailed discussion in Quang (1989), these measures are more appropriate than the Akaike information criterion (AIC) and Bayesian information criterion (BIC) or some of their variations, which are more useful for nested models. Table 2 gives the GoF measures for each fitted distribution with respect to both data sets.
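For readers who want to see how such statistics are built from a fitted cdf, the sketch below computes the classical (unmodified) Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling statistics. It is not the goodness.fit implementation, and the starred versions reported in Table 2 may include corrections that are not reproduced here; the function names, data and parameter values are hypothetical.

```python
import math

def gof_statistics(data, cdf):
    """Classical KS, Cramer-von Mises (W2) and Anderson-Darling (A2) statistics."""
    u = sorted(cdf(x) for x in data)  # fitted cdf at the ordered observations
    n = len(u)
    ks = max(max(i / n - ui, ui - (i - 1) / n)
             for i, ui in enumerate(u, start=1))
    w2 = 1.0 / (12.0 * n) + sum((ui - (2 * i - 1) / (2.0 * n)) ** 2
                                for i, ui in enumerate(u, start=1))
    a2 = -n - sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
                  for i in range(1, n + 1)) / n
    return {"KS": ks, "W2": w2, "A2": a2}

def en_cdf(a, b, mu, sigma):
    """Return the EN cdf (5) as a function of x for fixed parameters."""
    def F(x):
        surv = 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))
        return (1.0 - surv ** a) ** b
    return F

# Hypothetical sample and parameter values, for illustration only.
sample = [1.2, 0.8, 1.9, 1.5, 1.1, 1.7, 1.4, 0.9]
stats = gof_statistics(sample, en_cdf(1.0, 1.0, 1.3, 0.4))
```

Smaller values indicate a closer agreement between the fitted cdf and the empirical distribution.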

Table 2 The GoF measures of the fitted EN, SN, normal (N) and beta normal (BN) distributions to two real data sets

The GoF measures for the EN distribution are the lowest among the fitted models (highlighted in Table 2). These results provide evidence that the EN distribution is the most suitable model (among those considered) to describe both data sets.

7.3 Application for regression models

We assess changes on the oral health-related quality of life (OHRQL) of schoolchildren. To that end, a follow-up exam of three years was made to evaluate the impact of caries incidence on the OHRQL of adolescents. The data were obtained from a study (for more details, see Paula et al. 2012) developed by the Department of Community Dentistry, Division of Health Education and Health Promotion, Piracicaba Dental School, University of Campinas-UNICAMP.

The variables employed are (for i=1,…,291):

x_i: overall score of the OHRQL at time of follow-up;
v_i1: number of teeth decayed, missing and filled (TDMF) (0=without TDMF increment; 1=with TDMF increment).

We analyze these data based on the EN regression model

$$X_{i}\,=\,\beta_{0}\,+\,\beta_{1}\,v_{i1}\,+\,\sigma\,Z_{i},\quad i=1,\ldots,291, $$

where the errors Z_i are independent random variables having the EN (a,b,0,1) distribution.

The gamma-normal (GN) (Lima et al. 2015) distribution extends the normal distribution and can be used to fit data that come from a distribution with heavy tails reducing the influence of aberrant observations. The GN density with location parameter $\mu \in \mathbb {R}$, dispersion parameter σ>0 and shape parameter a>0 takes the form

$$\begin{array}{@{}rcl@{}} f(x)=\frac{1}{\sigma\Gamma(a)}\,\phi\left(\frac{x-\mu}{\sigma}\right) \left\{-\log\left[1-\Phi\left(\frac{x-\mu}{\sigma}\right)\right]\right\}^{a-1}. \end{array} $$

Further, the EN regression model is compared with the Exp-N, LTII-N, normal and GN regression models. Table 3 provides the MLEs of the parameters for the EN regression and these models.

Table 3 MLEs, SEs in (·) and p-values between [·] for the EN, Exp-N, LTII-N, normal and GN regressions fitted to the OHRQL data

Iterative maximization of the log-likelihood function (13) starts with initial values for β and σ taken from the fit of the classical regression model (a=b=1). In general, all fitted regression models reveal that v₁ is significant at the 1% level and that there is a significant difference between the levels of the number of teeth decayed, missing and filled. As expected, we find reciprocal relations between $\mu _{i}=\mathbb {E}(X_{i})$ and v_i1 in the EN, LTII-N, GN and normal regression models, but not in the Exp-N regression (which, although well adjusted, does not seem to be a coherent model). On the other hand, based on the estimates of σ, the EN regression model reveals advantages in relation to the other models.

The values of the AIC, Consistent Akaike Information Criterion (CAIC) and BIC to compare the fitted models are given in Table 4.

Table 4 AIC, CAIC and BIC Statistics

It is clear that the EN regression model outperforms the other regressions irrespective of the criteria and then we can conclude that the new regression model can be used effectively in the analysis of the current data set. A comparison of the proposed regression model with some of its sub-models using LR statistics is addressed in Table 5.

Table 5 LR statistics

The figures in this table, especially the p-values, indicate that the EN regression model yields a better fit to these data than its sub-models.

A graphical comparison among the fitted regression models is reported in Figure 4, which plots the empirical cdf together with the estimated cdfs. Based on these plots, the EN regression model provides a superior fit to the current data.

Conclusions

Flexible statistical distributions have been sought for describing data from practical situations in which the use of classical ones is not recommended. In this paper, we propose an extension of the normal distribution based on the exponentiated generalized family defined by Cordeiro et al. (2013), which adds two extra shape parameters to a baseline distribution. We provide some structural properties of the new extended normal (EN) distribution. The model parameters are estimated by maximum likelihood. The usefulness of this distribution is illustrated by means of two applications to real data sets. There is clear evidence that the EN distribution outperforms the skew-normal distribution and can be a competitive alternative to the beta normal distribution. The classical regression model does not produce good results in many real problems, and for this reason several extensions have arisen in recent years. We propose a new regression model based on the EN distribution and illustrate its importance in a real application. This new regression model opens a wide range of research topics following the basic inference concepts of the normal linear regression model.

Appendix A: Proof of Theorem 1

We consider the power series

$$\begin{array}{@{}rcl@{}} (1 - z)^{b} \,=\, \sum_{k=0}^{\infty}(-1)^{k}\,{b \choose k}\,z^{k}, \end{array} $$

which holds for any real non-integer b and |z|<1. Using this generalized binomial expansion twice in Eq. (1), we can write the EG-G cumulative distribution as

$$\begin{array}{@{}rcl@{}} F(x)=\sum_{j=0}^{\infty} \,w_{j+1}\,H_{j+1}(x), \end{array} $$

where $w_{j+1}= \sum _{m=1}^{\infty } (-1)^{j+m+1}\,{b \choose m}\, {m\,a \choose j+1}$ and $H_{j+1}(x)$ is the Exp-G cdf with power parameter j+1. By differentiating the last equation, we obtain (7).
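The expansion above can be checked numerically. The following is a minimal sketch, assuming the EG-G cdf in Eq. (1) is F(x) = {1 − [1 − G(x)]^a}^b and that H_{j+1}(x) = G(x)^{j+1}. The function names (`gen_binom`, `w`, `eg_cdf_series`, `eg_cdf_closed`) are illustrative and do not come from the paper. For the check, b is taken to be a positive integer so that the sum over m in w_{j+1} is finite, and the sum over j is truncated.

```python
def gen_binom(x, k):
    """Generalized binomial coefficient C(x, k) for real x and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (x - i) / (i + 1)
    return out

def w(k, a, b):
    """Weight w_k with k = j + 1; b is a positive integer, so the m-sum is finite."""
    # (-1)^(j + m + 1) equals (-1)^(k + m) when k = j + 1.
    return sum((-1) ** (k + m) * gen_binom(b, m) * gen_binom(m * a, k)
               for m in range(1, b + 1))

def eg_cdf_series(G, a, b, terms=80):
    """Truncated mixture representation: sum over k >= 1 of w_k * G^k."""
    return sum(w(k, a, b) * G ** k for k in range(1, terms + 1))

def eg_cdf_closed(G, a, b):
    """Closed form of the EG-G cdf evaluated at a baseline cdf value G."""
    return (1.0 - (1.0 - G) ** a) ** b

# Illustrative parameter values; G stands for a baseline cdf value in (0, 1).
for G in (0.3, 0.6):
    print(G, eg_cdf_series(G, a=1.5, b=2), eg_cdf_closed(G, a=1.5, b=2))
```

The two columns should agree closely for baseline values well inside (0, 1). Convergence of the truncated sum slows as G approaches 1.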

Appendix B: The Hessian matrix

The elements of the Hessian matrix are:

$$\begin{array}{@{}rcl@{}} U_{aa}&=&-\frac{n}{a^{2}}\,-\,(b\,-\,1)\,\sum_{i=1}^{n}\, \left\{ \frac{ \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\log^{2}\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right] }{ 1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} } \right\}\\ &&-(b\,-\,1)\,\sum_{i=1}^{n}\,\left\{\,\frac{ \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{2a}\,\log^{2}\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right] }{ \left\{\,1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\right\}^{2} } \right\}, \end{array} $$

$$\begin{array}{@{}rcl@{}} {}U_{ab}&=&-\sum_{i=1}^{n}\,\left\{ \frac{ \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\log\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right] }{ 1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} } \right\}, \end{array} $$

$$\begin{array}{@{}rcl@{}} {\kern29pt}U_{a\mu}&=&\frac 1\sigma\,\sum_{i=1}^{n}\,\left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) }{ 1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) }\right\}\\ &&-\frac{(b-1)}{\sigma}\sum_{i=1}^{n}\, \left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\, \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} }{ \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]\left\{\,1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\right\} } \right\}\\ &&-\frac{a\,(b-1)}{\sigma} \,\sum_{i=1}^{n}\left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-1} \,\log\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right] }{ 1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} } \right\}\\ &&+\frac{a\,(b-1)}{\sigma} \,\sum_{i=1}^{n}\left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\, \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{2a-1} \,\log\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right] }{ \left\{\,1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\right\}^{2} } \right\}, \end{array} $$

$$\begin{array}{@{}rcl@{}} {\kern30pt}U_{a\sigma}&=&-\frac \mu{\sigma^{2}}\,\sum_{i=1}^{n}\,\left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) }{ 1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right) }\right\}\\ &&+\frac{\mu\,(b-1)}{\sigma^{2}}\sum_{i=1}^{n}\, \left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\, \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} }{ \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]\left\{\,1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\right\} } \right\} \\ &&+\frac{a\,(b-1)\,\mu}{\sigma^{2}} \,\sum_{i=1}^{n}\left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-1} \,\log\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right] }{ 1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a} } \right\}\\ &&-\frac{a\,(b-1)\,\mu}{\sigma^{2}} \,\sum_{i=1}^{n}\left\{ \frac{ \phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\, \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{2a-1} \,\log\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right] }{ \left\{\,1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\right\}^{2} } \right\}, \end{array} $$

$$\begin{array}{@{}rcl@{}} {}U_{bb}&=&-\frac{n}{b^{2}}, \end{array} $$

$$\begin{array}{@{}rcl@{}} U_{b\mu}&=&-\frac{a}{\sigma}\,\sum_{i=1}^{n}\,\left\{\frac{\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\, \left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-1}} {1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}}\right\}, \end{array} $$

$$\begin{array}{@{}rcl@{}} {}U_{b\sigma}&=&-\frac{a\mu}{\sigma^{2}}\,\sum_{i=1}^{n}\,\left\{\frac{\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-1}} {1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}}\right\}, \end{array} $$

$$\begin{array}{@{}rcl@{}} U_{\mu\mu}&=&-\,\left(\frac{a-1}{\sigma^{2}}\right)\,\sum_{i=1}^{n}\,\left\{\frac{\phi^{\prime}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)} {1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)} \,+\,\frac{\phi^{2}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)} {\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{2}}\right\} \\ &&+\frac{1}{\sigma^{2}}\sum_{i=1}^{n}\,\left\{\frac{\phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\phi^{\prime\prime}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,-\,{\phi}^{\prime{2}}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)} {\phi^{2}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)}\right\} \\ &&-\frac{a(b-1)}{\sigma}\,\sum_{i=1}^{n} \left\{-\frac1\sigma\frac{\phi^{\prime}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-1}} {1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}} \right.\\ &&+\left(\frac{a-1}{\sigma}\right)\frac{\phi^{2}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a-2}} {1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}} \\ &&\left. -\frac{a}{\sigma} \frac{\phi^{2}\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{2a-2}} {\left\{\,1\,-\,\left[1\,-\,\Phi\left(\frac{x_{i}\,-\,\mu}{\sigma}\right)\right]^{a}\,\right\}^{2}}\right\}, \end{array} $$

$$U_{\mu\sigma}\,=\,-\frac {1}{\sigma}\,U_{\mu}\,-\,\frac \mu\sigma\,U_{\mu\mu}\quad \text{ and } \quad U_{\sigma\sigma}\,=\,-\,\frac{n}{\sigma^{2}}\,+\,\frac{\mu}{\sigma^{2}}\,U_{\mu}\,-\,\frac{\mu}{\sigma}\,U_{\mu\sigma}. $$
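Analytic second derivatives of this kind are easy to get wrong, so it can help to compare them with finite differences of the log-likelihood. The sketch below does this for U_{ab} and U_{bμ} only. It assumes the EN log-likelihood is ℓ = Σ {log a + log b − log σ + log φ(z_i) + (a − 1) log[1 − Φ(z_i)] + (b − 1) log(1 − [1 − Φ(z_i)]^a)} with z_i = (x_i − μ)/σ, which is the exponentiated generalized form with a normal baseline. The function names, the data vector `xs` and the parameter values in `theta` are hypothetical.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik(a, b, mu, sigma, xs):
    """Assumed EN log-likelihood (exponentiated generalized normal form)."""
    total = 0.0
    for x in xs:
        z = (x - mu) / sigma
        s = 1.0 - Phi(z)
        total += (math.log(a) + math.log(b) - math.log(sigma) + math.log(phi(z))
                  + (a - 1.0) * math.log(s) + (b - 1.0) * math.log(1.0 - s ** a))
    return total

def U_ab(a, b, mu, sigma, xs):
    """Analytic U_ab as given in Appendix B."""
    out = 0.0
    for x in xs:
        s = 1.0 - Phi((x - mu) / sigma)
        out -= s ** a * math.log(s) / (1.0 - s ** a)
    return out

def U_bmu(a, b, mu, sigma, xs):
    """Analytic U_b_mu as given in Appendix B."""
    out = 0.0
    for x in xs:
        z = (x - mu) / sigma
        s = 1.0 - Phi(z)
        out -= (a / sigma) * phi(z) * s ** (a - 1.0) / (1.0 - s ** a)
    return out

def mixed_fd(f, theta, i, j, h=1e-4):
    """Central finite-difference estimate of the mixed second derivative (i != j)."""
    def shifted(di, dj):
        t = list(theta)
        t[i] += di
        t[j] += dj
        return f(*t)
    return (shifted(h, h) - shifted(h, -h) - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)

# Hypothetical data and parameter values (a, b, mu, sigma), for illustration only.
xs = [-1.2, -0.4, 0.1, 0.7, 1.5, 2.3]
theta = (1.5, 2.0, 0.3, 1.2)
f = lambda a, b, mu, sigma: loglik(a, b, mu, sigma, xs)

print("U_ab :", U_ab(*theta, xs), mixed_fd(f, theta, 0, 1))
print("U_bmu:", U_bmu(*theta, xs), mixed_fd(f, theta, 1, 2))
```

The same comparison can be extended to the remaining entries by adding the corresponding analytic functions. A pure second derivative such as U_{aa} needs the usual three-point stencil instead of `mixed_fd`.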

Availability of data and materials

Interested readers may contact the authors.

Notes

Exton H. Handbook of hypergeometric integrals: theory, applications, tables, computer programs, 1978

References

Azzalini, A: A class of distributions which includes the normal ones. Scand. J. Stat. 12, 171–178 (1984).
Cintra, RJ, Cordeiro, GM, Nascimento, ADC: Beta generalized normal distribution with an application for SAR. Image Process. 48, 1–16 (2013).
Cordeiro, GM, Cunha, DCC, Ortega, EMM: The exponentiated generalized class of distributions. J. Data Sci. 11, 777–803 (2013).
Cordeiro, GM, Ortega, EMM, Silva, GO: The exponentiated generalized gamma distribution with application to lifetime data. J. Stat. Comput. Simul. 81, 827–842 (2011).
Cover, TM, Thomas, JA: Elements of Information Theory. Wiley-Interscience, New York (1991).
Doornik, JA: An Object-Oriented Matrix Language Ox 5. Timberlake Consultants Press, London (2007).
Eugene, N, Lee, C, Famoye, F: Beta-normal distribution and its applications. Commun. Stat.-Theory Methods. 31, 497–512 (2002).
Frery, AC, Nascimento, ADC, Cintra, RJ: Analytic Expressions for Stochastic Distances Between Relaxed Complex Wishart Distributions. IEEE Trans. Geosci. Remote Sens. 52, 1213–1226 (2014).
Jones, M, Faddy, MJ: A skew extension of the t-distribution, with applications. Biom. J. 65, 159–174 (2004).
Lee, C, Famoye, F, Alzaatreh, AY: Methods for generating families of univariate continuous distributions in the recent decades. Wiley Interdiscip. Rev. Comput. Stat. 5, 219–238 (2013).
Lehmann, EL: The power of rank tests. Ann. Math. Statist. 24, 23–43 (1953).
Lima, MCS, Cordeiro, GM, Ortega, EMM: A new extension of the normal distribution. J. Data Sci. 3, 385–408 (2015).
Meeker, WQ, Escobar, L: Statistical Methods for Reliability Data. Wiley, New York (1998).
Mudholkar, GS, Srivastava, DK: Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Trans. Reliab. 42, 299–302 (1993).
Nadarajah, S: Explicit expressions for moments of order statistics. Statistics and Probability Letters. 78, 196–205 (2008).
Nascimento, ADC, Cintra, RJ, Frery, AC: Hypothesis Testing in Speckled Data with Stochastic Distances. IEEE Trans. Geosci. Remote Sens. 48, 373–385 (2010).
Paula, JS, Oliveira, M, Soares, MSP, Chaves, MGAM, Mialhe, FL: Perfil Epidemiológico dos Pacientes Atendidos no Pronto Atendimento da Faculdade de Odontologia da Universidade Federal de Juiz de Fora. Arquivos em Odontologia (UFMG). 48, 257–262 (2012).
Rêgo, LC, Cintra, RJ, Cordeiro, GM: On some properties of the beta normal distribution. Commun. Stat. - Theory Methods. 41, 3722–3738 (2012).
Vuong, QH: Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses. Econometrica. 57, 307–333 (1989).

Acknowledgements

The authors would like to thank CNPq and FACEPE, Brazil, for financial support.

Funding

Not applicable.

Author information

Maria C.S. Lima, Gauss M. Cordeiro, Edwin M.M. Ortega, and Abraão D.C. Nascimento contributed equally to this work.

Authors and Affiliations

Departamento de Estatística, Universidade Federal de Pernambuco, Recife, PE 50740-540, Brazil
Maria C.S. Lima & Gauss M. Cordeiro
Departamento de Ciências Exatas, ESALQ, Universidade de São Paulo, Piracicaba/SP, Brazil
Edwin M.M. Ortega
Departamento de Estatística, Universidade Federal de Pernambuco, Recife, PE 50740-540, Brazil
Abraão D.C. Nascimento

Contributions

The authors (MCSL, GMC, EMMO and ADCN) carried out this work in consultation with one another and drafted the manuscript together. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Maria C.S. Lima.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Lima, M.C., Cordeiro, G.M., Ortega, E.M. et al. A new extended normal regression model: simulations and applications. J Stat Distrib App 6, 7 (2019). https://doi.org/10.1186/s40488-019-0098-y

Received: 30 January 2019
Accepted: 28 May 2019
Published: 08 June 2019
DOI: https://doi.org/10.1186/s40488-019-0098-y
