The cumulative distribution function (CDF) of the T-X family is given by R{W(F(x))}, where R is the CDF of a random variable T, F is the CDF of X and W is an increasing function defined on [0, 1] having the support of T as its range. This family provides a new method of generating univariate distributions. Different choices of the R, F and W functions naturally lead to different families of distributions. This paper proposes the use of quantile functions to define the W function. Some general properties of this T-X system of distributions are studied. It is shown that several existing methods of generating univariate continuous distributions can be derived using this T-X system. Three new distributions of the T-X family are derived, namely, the normal-Weibull based on the quantile function of the Cauchy distribution, the normal-Weibull based on the quantile function of the logistic distribution, and the Weibull-uniform based on the quantile function of the log-logistic distribution. Two real data sets are used to illustrate the flexibility of the distributions.

1. Introduction

Statistical distributions are important for parametric inference and for modeling real world phenomena. Many methods have been developed to generate statistical distributions in the literature. Some well-known early methods for generating univariate continuous distributions include methods based on differential equations developed by Pearson (1895), methods of translation developed by Johnson (1949), and methods based on quantile functions developed by Tukey (1960). The interest in developing new methods for generating new or more flexible distributions has remained active in recent decades. Lee et al. (2013) indicated that the majority of methods developed after the 1980s are methods of ‘combination’, because these new methods are based on the idea of combining two existing distributions or adding extra parameters to an existing distribution to generate a new family of distributions. A brief summary of some methods in the literature that are related to the method proposed in this article is provided below.

McDonald (1984) introduced the generalized beta distributions of the first and second kinds (GB1 and GB2). Subsequently, a further generalization named the generalized beta distribution (GBD) was given by McDonald and Xu (1995), which consists of more than 30 special cases or limiting distributions of GBD, including GB1 and GB2.

Azzalini (1985) introduced a family of skew-normal distributions, SN (λ), defined as g(x; λ) = 2ϕ(x)Φ(λx), where ϕ and Φ are the probability density function (PDF) and cumulative distribution function (CDF) of N(0, 1), respectively. The skewness is characterized by the parameter λ. For a review of skew-symmetric distributions, one may refer to Kotz and Vicari (2005). Ferreira and Steel (2006) introduced a general framework for generating a family of skewed distributions based on a symmetric distribution. The PDF of the new family has the form

g(x) = f(x) r{F(x)}, (1.1)

where f is a symmetric PDF with CDF F, and r is a PDF with support [0, 1].

Marshall and Olkin (1997) introduced a family of life distributions with survival function

\overline{G}(x) = \frac{α\overline{F}(x)}{1 - \overline{α}\overline{F}(x)}, (1.2)

where \overline{α} = 1 - α and \overline{F} = 1 - F is the survival function of the random variable X. For details about life distributions, one may refer to Marshall and Olkin (2010) and Lai (2013).

Eugene et al. (2002) proposed the beta-generated family of distributions, where the beta distribution with PDF b is used as the generator. The CDF of the beta-generated distribution is defined as G(x) = \int_{0}^{F(x)} b(t) dt, where F is the CDF of any random variable. If X is continuous, the corresponding PDF of the beta-generated distribution is

g(x) = {B(α, β)}^{- 1} f(x) F^{α - 1}(x) (1 - F(x))^{β - 1}, (1.3)

where B(α, β) is the beta function. The PDF in (1.3) can be considered as a generalization of the distribution of order statistic (Eugene et al. 2002; Jones 2004). Many researchers have studied the beta generated distributions and their applications by applying different F in Equation (1.3). Examples include Famoye et al. (2004), Akinsete et al. (2008), Cordeiro and Lemonte (2011), and Alshawarbeh et al. (2012).

Jones (2009) and Cordeiro and de Castro (2011) extended the beta-generated family of distributions by using the Kumaraswamy distribution b(t) = αβ t^{α - 1}(1 - t^{α})^{β - 1}, t∈ (0, 1) (Kumaraswamy 1980), instead of the beta distribution. The PDF for the Kumaraswamy-generated (Kw-G) family of distributions is defined by

g(x) = αβ f(x) F^{α - 1}(x) (1 - F^{α}(x))^{β - 1}. (1.4)

Some examples of the Kw-G distributions are the Kw-Weibull (Cordeiro et al. 2010) and Kw-Gumbel (Cordeiro et al. 2011).

Recently, Alexander et al. (2012) studied the generalized beta-X family by considering b(t) = cB(a, b)^{- 1}t^{ac - 1}(1 - t^{c})^{b - 1}, 0 < t < 1, the generalized beta distribution of the first kind introduced by McDonald (1984). The new family is called the generalized beta-generated (GBG) family of distributions. The PDF for the GBG family of distributions is given by

g(x) = c{B(a, b)}^{- 1} f(x) F^{ac - 1}(x) (1 - F^{c}(x))^{b - 1}. (1.5)

When c = 1, the family in (1.5) reduces to the beta-X family in (1.3), and when a = 1, (1.5) reduces to the Kw-G family in (1.4).
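The two reductions can be checked numerically at the level of the generator densities b(t). The following is a minimal sketch using only the Python standard library; the function names and the parameter values are illustrative assumptions, not part of the original paper.

```python
import math


def beta_fn(a, b):
    """Beta function B(a, b) computed from gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def gb1_pdf(t, a, b, c):
    """Generalized beta of the first kind on (0, 1):
    c B(a, b)^{-1} t^{ac-1} (1 - t^c)^{b-1}."""
    return c / beta_fn(a, b) * t ** (a * c - 1) * (1 - t ** c) ** (b - 1)


def beta_pdf(t, a, b):
    """Beta generator density used in the beta-generated family."""
    return t ** (a - 1) * (1 - t) ** (b - 1) / beta_fn(a, b)


def kw_pdf(t, alpha, beta):
    """Kumaraswamy generator density used in the Kw-G family."""
    return alpha * beta * t ** (alpha - 1) * (1 - t ** alpha) ** (beta - 1)


grid = [i / 10 for i in range(1, 10)]

# c = 1: the GB1 generator coincides with the beta generator.
max_err_beta = max(abs(gb1_pdf(t, 2.0, 3.0, 1.0) - beta_pdf(t, 2.0, 3.0)) for t in grid)

# a = 1: the GB1 generator coincides with the Kumaraswamy generator
# with alpha = c and beta = b, because B(1, b) = 1/b.
max_err_kw = max(abs(gb1_pdf(t, 1.0, 3.0, 2.5) - kw_pdf(t, 2.5, 3.0)) for t in grid)

print(max_err_beta, max_err_kw)
```

Both discrepancies should be at the level of floating-point rounding, which is consistent with the stated reductions.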

Alzaatreh et al. (2013b) proposed a general method by replacing the beta PDF with a PDF r of a continuous random variable T and applying a function W(F(x)) that satisfies some conditions (given in (2.1)) to develop the T-X family. The CDF of the T-X family is defined as

G(x) = \int_{a}^{W(F(x))} r(t) dt = R\{W(F(x))\}, (1.6)

where R is the CDF of T.

Different W functions generate different families of T-X distributions. Two continuous distributions of the T-X families that have been studied are Gamma-Pareto distribution (Alzaatreh et al. 2012a) and Weibull-Pareto distribution (Alzaatreh et al. 2013a). When X is discrete, the resulting T-X family is discrete. The T-geometric family generates the discrete analogue to the distribution of any continuous random variable T (Alzaatreh et al. 2012b). For a review of methods for generating univariate continuous distributions, one may refer to Lee et al. (2013).

The T-X family provides a new method to generate distributions by using the function W. A large number of distributions, continuous and discrete, can be generated by applying any two existing univariate distributions based on this method. Alzaatreh et al. (2013b) gave several choices of W(λ), including - log(1 - λ), λ/(1 - λ), log(λ/(1 - λ)) and log(-log(1 - λ)). Clearly, other choices can be defined to generate different T-X families. Is there a systematic approach to define the W function for the T-X family? This question is addressed in this paper.

In Section 2, a method to define the W function for generating T-X families of continuous probability distributions is presented. The W functions defined in Alzaatreh et al. (2013b) are special cases of the general approach. In order to distinguish between the previous T-X family proposed by Alzaatreh et al. (2013b) and the method proposed in this article, we use the abbreviation T-X(W) family for the previous T-X family and the abbreviation T-X{Y} for the family defined in Section 2. In Section 3, some properties of the new families are studied. Relationship between the new families and some existing families is given. Also in Section 3, the normal-Weibull distribution based on the quantile function of the Cauchy distribution, the normal-Weibull distribution based on the quantile function of the logistic distribution and Weibull-uniform distribution based on the quantile function of the log-logistic distribution are defined. Some properties of these three distributions are derived. In Section 4, a general family of life distributions based on survival function using similar methodology of the T-X{Y} family is presented. Some properties of the family are investigated. In Section 5, two real data sets are used to illustrate the flexibility of T-X{Y} family of distributions. Conclusions are given in Section 6.

2. Generating families of continuous probability distributions using quantile function

The T-X(W) family of distributions in (1.6) is generated by using the function W which satisfies the following conditions (Alzaatreh et al. 2013b):

where [a, b] is the support of the random variable T for - ∞ ≤ a < b ≤ ∞.

In this section, a class of W functions wider than the one defined in (2.1) will be considered to define a T-X family. Let W : (0, 1) → (a, b), for - ∞ ≤ a < b ≤ ∞, be a right-continuous and non-decreasing function such that \lim_{λ → 0^{+}} W(λ) = a and \lim_{λ → 1^{-}} W(λ) = b. Then the composition

G(x) = R{W(F(x))}, x∈ (-∞, ∞), (2.2)

is a distribution function, because it satisfies the required conditions for a distribution function: G is non-decreasing and right-continuous, G(x) → 0 as x → -∞, and G(x) → 1 as x → ∞.

Note that if both functions W and F are absolutely continuous, then G in (2.2) is absolutely continuous and has a density function g(x) = \frac{d}{dx}G(x).

A general method to define the W function for generating T-X families is now proposed. It is assumed that the random variable T has support on the interval (a, b). Let P be the CDF of the random variable Y taking values on (a, b), and define the quantile function of the distribution P by

Q_{Y}(λ) = inf{y : P(y) ≥ λ}, 0 < λ < 1.

If P is continuous and strictly increasing, then Q_{Y} = P^{- 1} is continuous and strictly increasing (Shorack and Wellner 1986). We take W to be the quantile function of a strictly increasing distribution function P for the random variable Y, namely, W(λ) = Q_{Y}(λ), λ∈ (0, 1). Then Q_{Y} is continuous and non-decreasing, and the CDF of a T-X{Y} family using the quantile function Q_{Y} is defined as

G(x) = \int_{a}^{Q_{Y}(F(x))} r(t) dt = R\{Q_{Y}(F(x))\}. (2.3)

If we assume further that Y has a density p(y) > 0 for all y in a neighborhood of Q_{Y}(λ) where λ∈ (0, 1), then \frac{d}{dλ}Q_{Y}(λ) exists and equals [p(Q_{Y}(λ))]^{- 1} (Shorack and Wellner 1986), and hence the corresponding PDF associated with (2.3) is

g(x) = \frac{f(x)}{p\{Q_{Y}(F(x))\}} r\{Q_{Y}(F(x))\}. (2.4)

Note that the PDF defined in (2.4) can be easily used to generate a T-X{Y} family of distributions by applying the quantile function of any existing distribution.
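As an illustration of how the quantile-function construction can be put to work, the following minimal sketch builds the CDF R{Q_{Y}(F(x))} and the PDF f(x) r{Q_{Y}(F(x))}/p{Q_{Y}(F(x))} from user-supplied functions and checks that the PDF matches a numerical derivative of the CDF. The particular choice of T (standard normal), Y (standard logistic) and X (unit exponential), as well as all function names, are assumptions made only for this example.

```python
import math
from statistics import NormalDist


def tx_cdf(x, R, Q_Y, F):
    """CDF of a T-X{Y} distribution: G(x) = R(Q_Y(F(x)))."""
    return R(Q_Y(F(x)))


def tx_pdf(x, r, Q_Y, p, F, f):
    """PDF of a T-X{Y} distribution: g(x) = f(x) r(Q_Y(F(x))) / p(Q_Y(F(x)))."""
    q = Q_Y(F(x))
    return f(x) * r(q) / p(q)


# Illustrative ingredients (assumed for this sketch only).
T = NormalDist(0.0, 1.0)                                   # T ~ N(0, 1), support (-inf, inf)
F = lambda x: 1.0 - math.exp(-x)                           # CDF of X ~ exponential(1)
f = lambda x: math.exp(-x)                                 # PDF of X
Q_Y = lambda lam: math.log(lam / (1.0 - lam))              # quantile of the standard logistic
p = lambda y: math.exp(-y) / (1.0 + math.exp(-y)) ** 2     # PDF of the standard logistic

x0, h = 0.8, 1e-5
pdf_value = tx_pdf(x0, T.pdf, Q_Y, p, F, f)
numeric_derivative = (tx_cdf(x0 + h, T.cdf, Q_Y, F) - tx_cdf(x0 - h, T.cdf, Q_Y, F)) / (2 * h)

print(pdf_value, numeric_derivative)
```

Note that the support of Y must match the support of T; here both are the whole real line.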

The notation X sometimes represents the random variable with PDF f and sometimes represents the random variable with PDF g. Where there may be confusion, the notations X_{f} for the random variable X with PDF f and X_{g} for the random variable X with PDF g are used. The term moment refers to non-central moment, unless otherwise specified.

Lemma 1:

(a) If the random variables X_{f} and Y have the same distribution with the same parameters, then G = R.

(b) If the random variables T and Y have the same distribution with the same parameters, then G = F.

Proof: The proofs of (a) and (b) follow from definition (2.4). □

Some properties of the T-X{Y} family:

1. Any PDF f can be represented as the PDF defined in (2.4) by considering Q_{Y} = R^{- 1}.

2. The support of the new random variable defined in (2.4) is the same as the support of the random variable with PDF f.

3. If the support of the random variable Y is [c, d] with [a, b] ⊂ [c, d], then the PDF in (2.4) is defined with support [F^{- 1}(P(a)), F^{- 1}(P(b))].

4. The relationship between the random variable X_{g} with PDF in (2.4) and T is given by T = Q_{Y}(F(X_{g})) and hence, X_{g} = F^{- 1}(P(T)) when F^{- 1} exists, where P is the CDF of Y with the corresponding quantile function Q_{Y}. Using this relation, one can generate the random variable X_{g} by generating the random variable T and then computing X_{g} = F^{- 1}(P(T)). Similarly, one can compute the moments of X_{g} by using E(X_{g}^{n}) = E\{[F^{- 1}(P(T))]^{n}\}.

5. The hazard function, h_{g}(x) = g(x)/\overline{G}(x), for the random variable X_{g} in (2.4) is given by h_{g}(x) = \frac{f(x)}{p\{Q_{Y}(F(x))\}} h_{r}\{Q_{Y}(F(x))\}, where h_{r} is the hazard function for the random variable T with PDF r.

6. The quantile function of the new random variable in (2.4) is given by Q_{X_{g}}(λ) = Q_{X_{f}}\{P(Q_{T}(λ))\}, λ∈ (0, 1), where Q_{X_{f}} and Q_{T} are the quantile functions of the random variables X_{f} and T respectively.
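Properties 4 and 6 translate directly into a sampling routine and a quantile function. The sketch below assumes, purely for illustration, that T is standard normal, Y is standard logistic and X_{f} is Weibull with shape 2 and scale 1; the helper names are hypothetical.

```python
import math
import random
from statistics import NormalDist, median

T = NormalDist(0.0, 1.0)            # generator T (assumed standard normal)


def P(y):
    """CDF of Y (assumed standard logistic)."""
    return 1.0 / (1.0 + math.exp(-y))


def F(x, c=2.0, gam=1.0):
    """Weibull CDF of X_f."""
    return 1.0 - math.exp(-((x / gam) ** c))


def F_inv(u, c=2.0, gam=1.0):
    """Weibull quantile function of X_f."""
    return gam * (-math.log(1.0 - u)) ** (1.0 / c)


def sample_xg(n, seed=0):
    """Property 4: draw T, then set X_g = F^{-1}(P(T))."""
    rng = random.Random(seed)
    return [F_inv(P(rng.gauss(T.mean, T.stdev))) for _ in range(n)]


def quantile_xg(lam):
    """Property 6: Q_{X_g}(lam) = Q_{X_f}(P(Q_T(lam)))."""
    return F_inv(P(T.inv_cdf(lam)))


def cdf_xg(x):
    """G(x) = R(Q_Y(F(x))) with the logistic quantile Q_Y(u) = log(u / (1 - u))."""
    u = F(x)
    return T.cdf(math.log(u / (1.0 - u)))


draws = sample_xg(20001)
print(median(draws), quantile_xg(0.5))
```

The sample median of the simulated values should be close to Q_{X_{g}}(0.5), and applying the CDF to a quantile should return the original probability level.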

3. Some T-X{Y} families and properties

3.1 Some T-X{Y} families based on different quantile functions

The quantile function of a random variable Y may not have an explicit form. However, many existing continuous random variables have one-to-one CDFs with explicit quantile functions. The quantile functions of these random variables can be used to generate new T-X{Y} families. The following example illustrates how to derive the T-X{Y} family of distributions.

Example: T-X{log-logistic} family:

Let the random variable Y follow the log-logistic distribution with parameters α and β. The PDF and quantile function are, respectively, p(y) = (β/α)(y/α)^{β - 1}/(1 + (y/α)^{β})^{2}, y ≥ 0, and Q_{Y}(λ) = α(λ/(1 - λ))^{1/β}, λ∈ (0, 1). Therefore, p(Q_{Y}(λ)) = (β/α)λ^{(β - 1)/β}(1 - λ)^{(β + 1)/β}, and the definition in (2.4) gives the PDF of the T-X{log-logistic} family as

g(x) = \frac{α f(x)}{β F^{(β - 1)/β}(x) (1 - F(x))^{(β + 1)/β}} r\{α(F(x)/(1 - F(x)))^{1/β}\}. (3.1)

When α = β = 1, the family in (3.1) reduces to g(x) = \frac{f(x)}{(1 - F(x))^{2}} r\{F(x)/(1 - F(x))\}. This PDF can be written in terms of hazard and survival functions of X_{f} as g(x) = \frac{h_{f}(x)}{\overline{F}(x)} r\{(1 - \overline{F}(x))/\overline{F}(x)\}, where h_{f} is the hazard function and \overline{F} is the survival function of the random variable X_{f}.
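The key step in this example is the closed form of p(Q_{Y}(λ)) for the log-logistic distribution. A short numerical check is sketched below; the parameter values α = 2 and β = 3 and the function names are arbitrary choices for illustration.

```python
def p_loglogistic(y, alpha, beta):
    """Log-logistic PDF: (beta/alpha)(y/alpha)^(beta-1) / (1 + (y/alpha)^beta)^2."""
    z = y / alpha
    return (beta / alpha) * z ** (beta - 1) / (1 + z ** beta) ** 2


def q_loglogistic(lam, alpha, beta):
    """Log-logistic quantile function: alpha (lam / (1 - lam))^(1/beta)."""
    return alpha * (lam / (1 - lam)) ** (1 / beta)


def p_at_quantile(lam, alpha, beta):
    """Closed form stated in the text for p(Q_Y(lam))."""
    return (beta / alpha) * lam ** ((beta - 1) / beta) * (1 - lam) ** ((beta + 1) / beta)


alpha, beta = 2.0, 3.0
grid = [i / 10 for i in range(1, 10)]
max_rel_err = max(
    abs(p_loglogistic(q_loglogistic(lam, alpha, beta), alpha, beta)
        - p_at_quantile(lam, alpha, beta)) / p_at_quantile(lam, alpha, beta)
    for lam in grid
)
print(max_rel_err)
```

The relative error should be at rounding level, confirming that composing the PDF with the quantile function gives the stated expression.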

Table 1 lists the probability density functions of some T-X{Y} families based on different quantile functions. Each family g is based on a given quantile function that defines many subfamilies of distributions.

Common supports of random variables X_{f} and T are [0, 1], (0, ∞), or (-∞, ∞). Beta-X, Kw-G and GBG are T-X{uniform} families with T being defined on [0, 1]. The W functions given in Alzaatreh et al. (2013b) can be defined by the quantile functions of random variable Y as follows: W(λ) = - log(1 - λ), λ∈ (0, 1), is the quantile function of the standard exponential distribution, W(λ) = λ/(1 - λ) is the quantile function of the log-logistic distribution with parameters α = β = 1, and W(λ) = log(λ/(1 - λ)) is the quantile function of the logistic distribution with scale parameter b = 1 and location parameter a = 0. Many other W functions can be defined by using the quantile function approach. The T-X(W) families defined in Alzaatreh et al. (2013b) derive their parameters from the random variables T and X and none from the W function. The T-X(W) can be derived through the T-X{Y} framework by noting that the W function is the quantile function for the random variable Y. One advantage of using the T-X{Y} framework is that one can keep one or more parameters from the distribution of Y. In particular, keeping a shape parameter from Y can add more flexibility to the new distribution.
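The correspondence between the three W functions and quantile functions can be verified by composing each W with the CDF of the matching distribution, which should return λ. The dictionary layout and names in this sketch are illustrative assumptions.

```python
import math

# Each entry pairs a W function with the CDF P of the distribution
# whose quantile function it is claimed to be.
pairs = {
    "exponential(1)": (lambda lam: -math.log(1 - lam),
                       lambda y: 1 - math.exp(-y)),
    "log-logistic(1, 1)": (lambda lam: lam / (1 - lam),
                           lambda y: y / (1 + y)),
    "logistic(0, 1)": (lambda lam: math.log(lam / (1 - lam)),
                       lambda y: 1 / (1 + math.exp(-y))),
}

grid = [i / 20 for i in range(1, 20)]

# P(W(lam)) should equal lam when W is the quantile function of P.
max_err = max(abs(P(W(lam)) - lam) for W, P in pairs.values() for lam in grid)
print(max_err)
```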

3.2 Some properties of T-X{Y} families

In the following, we assume (if necessary) the mentioned expectations exist and are finite.

Theorem 1: Let X_{f} be a non-negative random variable with PDF f(x), and let E(X_{f}^{n}) denote the n^{th} moment of X_{f}, then

E(X_{g}^{n}) ≤ E(X_{f}^{n}) E\{(\overline{P}(T))^{- 1}\},

where E(X_{g}^{n}) is the n^{th} moment of the random variable with density in (2.4), \overline{P} = 1 - P is the survival function of the CDF P, and T is the random variable with PDF r.

The entropy of a random variable is a measure of the variation of uncertainty. Shannon’s (1948) entropy of the random variable X with density g is defined as E{-log(g(X))}.

Theorem 2: Shannon’s entropy η_{X_{g}} of X_{g} is given by

η_{X_{g}} = E\{log q_{X_{f}}(P(T))\} + E\{log p(T)\} + η_{T},

where η_{T} is Shannon’s entropy for the random variable T with PDF r, and q_{X_{f}} is the quantile density function of X_{f}.

3.3 Relationship between T-X{Y} family and some existing families of distributions

Many existing families of distributions can be generated by using the quantile function approach defined in (2.4). Four examples are given in the following.

Generalized beta-generated (GBG) family introduced by Alexander et al. (2012):

The GBG family is the generalized beta-X{uniform} family. It can also be derived as follows: By setting α = β = 1 and p = 1/c in the Dagum quantile function in Table 1, the PDF of the T-X{Dagum} family is given by

By taking r in (3.3) to be r(t) = {B(a, b)}^{- 1}t^{a - 1}(1 + t)^{- b - a}, t > 0, which is the PDF of the inverted beta random variable, the family of probability density functions is obtained as

The family in (3.4) is the generalized beta-generated (GBG) family in (1.5).

Family of skewed distributions defined in Ferreira and Steel (2006):

The family of skewed distributions in (1.1) defined by Ferreira and Steel (2006) can be represented in the form of the T-X{Y} system by considering the quantile function of a standard uniform distribution, Q_{Y}(λ) = λ, where F is the CDF of a symmetric PDF f and r is a skewed PDF having support [0, 1].

Alzaatreh et al. (2013b) studied the T-X(W) family using W(F(x)) = - log(1 - F(x)). By using the quantile function approach, let λ = F(x), then W(λ) = - log (1 - λ) is the quantile function of the standard exponential distribution. Hence, the T-X family studied by Alzaatreh et al. (2013b) is the T-X{exponential} family. According to the authors, the T-X{exponential} is a family of distributions arising from the hazard function of X_{f}. If h_{f} is the hazard function and H_{f} is the cumulative hazard function of the random variable X_{f}, and the exponential distribution has mean b, the PDF of the T-X{exponential} family is g(x) = bh_{f}(x)r{bH_{f}(x)}. In a similar way, the T-X{Weibull} and T-X{Rayleigh} families can be considered as families of distributions arising from hazard functions of X_{f}. The PDF of the T-X{Weibull} family is g(x) = (γ/c)h_{f}(x)r{γ(H_{f}(x))^{1/c}}/(H_{f}(x))^{(c - 1)/c} and the PDF of the T-X{Rayleigh} family is g(x) = bh_{f}(x)r{(2b^{2}H_{f}(x))^{1/2}}/(2H_{f}(x))^{1/2}.

Generalized beta distribution introduced by McDonald and Xu (1995):

Setting α = β = 1 in the PDF of the T-X{log-logistic} family in (3.1) and taking r(t) = {B(p, q)}^{- 1}t^{p - 1}(1 + t)^{- p - q}, t > 0, the PDF of the inverted beta random variable with parameters p and q, yields

Note that (3.5) is the beta-generated family. By taking F(x) = (x/b)^{a}/(1 + c(x/b)^{a}), which is the CDF of a truncated log-logistic distribution with 0 < x^{a} < b^{a}/(1 - c), 0 ≤ c < 1, a and b positive in (3.5), the generalized beta distribution introduced by McDonald and Xu (1995) is obtained as

Taking F(x) in (3.5) to be F(x) = e^{(x - δ)/σ}/(1 + ce^{(x - δ)/σ}), which is the CDF of a truncated logistic distribution with - ∞ < (x - δ)/σ < ln(1/(1 - c)), 0 ≤ c < 1 and σ > 0, yields

The PDF in (3.6) is the exponential generalized beta distribution in McDonald and Xu (1995).

3.4 Three examples of new distributions derived from T-X{Y} family

Table 1 contains many T-X families based on different quantile functions. Three new distributions, normal-Weibull{Cauchy}, normal-Weibull{logistic} and Weibull-uniform{log-logistic} distributions are introduced, and some properties of these distributions are studied.

Normal-Weibull{Cauchy} distribution:

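Setting a = 0 and b = 1 in the T-X{Cauchy} family in Table 1, and letting r be the PDF of N(μ, σ^{2}), the normal-X{Cauchy} sub-family is given by

g(x) = \frac{\sqrt{π/2}}{σ} f(x) sec^{2}\{π(F(x) - 0.5)\} exp\{-[tan\{π(F(x) - 0.5)\} - μ]^{2}/(2σ^{2})\}.

Taking F to be the CDF of the Weibull distribution, F(x) = 1 - exp{-(x/γ)^{c}}, with PDF f(x) = (c/γ)(x/γ)^{c - 1} exp{-(x/γ)^{c}}, gives

g(x) = \frac{c\sqrt{π/2}}{γσ} (x/γ)^{c - 1} exp\{-(x/γ)^{c}\} sec^{2}\{π(F(x) - 0.5)\} exp\{-[tan\{π(F(x) - 0.5)\} - μ]^{2}/(2σ^{2})\} (3.8)

for x > 0, and σ, c, γ > 0. The random variable with the PDF in (3.8) is said to follow a four-parameter normal-Weibull{Cauchy} (NW{C}) distribution. A location parameter δ can be included in (3.8) by writing x as (x - δ), leading to a five-parameter distribution.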
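A possible way to evaluate the four-parameter NW{C} distribution numerically is sketched below. It is obtained by plugging the standard Cauchy quantile function tan{π(λ - 0.5)} and density 1/{π(1 + y^{2})} into the general T-X{Y} form (2.4), with T normal and X_{f} Weibull. The function names and parameter values are assumptions for illustration, and the code is a sketch rather than the implementation used for the fits reported later.

```python
import math
from statistics import NormalDist


def nwc_cdf(x, mu, sigma, c, gam):
    """G(x) = R(Q_Y(F(x))) with R normal, Q_Y standard Cauchy quantile, F Weibull."""
    F = 1.0 - math.exp(-((x / gam) ** c))
    return NormalDist(mu, sigma).cdf(math.tan(math.pi * (F - 0.5)))


def nwc_pdf(x, mu, sigma, c, gam):
    """g(x) = f(x) r(Q_Y(F(x))) / p(Q_Y(F(x))) for the same ingredients."""
    u = (x / gam) ** c
    F = 1.0 - math.exp(-u)
    f = (c / gam) * (x / gam) ** (c - 1) * math.exp(-u)
    t = math.tan(math.pi * (F - 0.5))
    # For the standard Cauchy density p(y) = 1 / (pi (1 + y^2)),
    # 1 / p(t) = pi (1 + t^2) = pi sec^2(pi (F - 0.5)).
    return f * math.pi * (1.0 + t * t) * NormalDist(mu, sigma).pdf(t)


params = dict(mu=0.5, sigma=1.0, c=2.0, gam=1.5)   # illustrative values only
x0, h = 1.2, 1e-5
pdf_value = nwc_pdf(x0, **params)
numeric_derivative = (nwc_cdf(x0 + h, **params) - nwc_cdf(x0 - h, **params)) / (2 * h)
print(pdf_value, numeric_derivative)
```

Agreement between the density and the numerical derivative of the CDF gives a quick sanity check before the density is used in a likelihood.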

Plots of the NW{C} density function for different parameter values are given in Figure 1. The graphs in Figure 1 show that the NW{C} distribution can be right skewed, left skewed, unimodal or bimodal.

Lemma 2: The n^{th} moment of the NW{C} random variable with PDF in (3.8) exists for any μ, σ > 0, c > 0, γ > 0 and satisfies the inequality

where Φ is the CDF of a normal distribution with parameters μ and σ.

Proof: The n^{th} moment for the Weibull random variable is E(X_{f}^{n}) = γ^{n} Γ(1 + n/c). The CDF of the standard Cauchy distribution is P(y) = 1/2 + (1/π)tan^{- 1}(y), so 1 - P(T) = 1/2 - (1/π)tan^{- 1}(T), where - ∞ < T < ∞. When T ≤ 1, 1 - P(T) ≥ 1/4 and hence (1 - P(T))^{- 1} ≤ 4. When T > 1 and by using the series tan^{- 1}(T) = \frac{π}{2} + \sum_{n = 1}^{∞} \frac{(-1)^{n}}{(2n - 1)T^{2n - 1}} (Polyanin and Manzhirov 2008),

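Normal-Weibull{logistic} distribution:

Setting a = 0 and b = 1 in the PDF of the T-X{logistic} family in Table 1, and taking r to be the PDF of the normal distribution N(μ, σ^{2}) and F to be the CDF of the Weibull distribution, F(x) = 1 - exp{-(x/γ)^{c}}, the normal-Weibull{logistic} (NW{L}) distribution is obtained as

g(x) = \frac{c}{γσ\sqrt{2π}} \frac{(x/γ)^{c - 1}}{1 - exp\{-(x/γ)^{c}\}} exp\{-[log(exp\{(x/γ)^{c}\} - 1) - μ]^{2}/(2σ^{2})\}, x > 0.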

Plots of the NW{L} density function for different parameter values are given in Figure 2. The graphs in Figure 2 show that the NW{L} distribution can be reversed J-shaped, right skewed, left skewed or bimodal.

Lemma 3: The n^{th} moment of the NW{L} random variable exists for any σ > 0, c > 0, γ > 0 and satisfies the inequality

E(X_{g}^{n}) ≤ γ^{n} Γ(1 + n/c) \{1 + exp(μ + σ^{2}/2)\}. (3.10)

Proof: The n^{th} moment for the Weibull random variable is E(X_{f}^{n}) = γ^{n} Γ(1 + n/c). The CDF of the standard logistic random variable is P(y) = exp(y)/{1 + exp(y)}. Since the random variable T has a normal distribution with parameters μ and σ, E({1 - P(T)}^{- 1}) = E(1 + exp(T)) = 1 + exp(μ + 0.5σ^{2}). By using Theorem 1, the result in (3.10) follows. □
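The expectation used in the proof, and the resulting moment bound, can be checked numerically. The sketch below uses a trapezoidal rule over μ ± 12σ and the representation X_{g} = F^{- 1}(P(T)) = γ{log(1 + exp(T))}^{1/c} for the NW{L} random variable. The parameter values, the integration settings and the function names are assumptions for illustration.

```python
import math
from statistics import NormalDist


def normal_expectation(h, mu, sigma, steps=20000, width=12.0):
    """Trapezoidal approximation of E[h(T)] for T ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    lo, hi = mu - width * sigma, mu + width * sigma
    step = (hi - lo) / steps
    total = 0.5 * (h(lo) * dist.pdf(lo) + h(hi) * dist.pdf(hi))
    for i in range(1, steps):
        t = lo + i * step
        total += h(t) * dist.pdf(t)
    return total * step


mu, sigma, c, gam, order = 0.3, 0.8, 1.5, 2.0, 2   # illustrative values only

# E{(1 - P(T))^{-1}} = E(1 + exp(T)) for the standard logistic CDF P.
numeric = normal_expectation(lambda t: 1.0 + math.exp(t), mu, sigma)
closed_form = 1.0 + math.exp(mu + 0.5 * sigma ** 2)

# n-th moment of X_g = gam * (log(1 + exp(T)))^(1/c), computed numerically.
moment = normal_expectation(
    lambda t: (gam * math.log1p(math.exp(t)) ** (1.0 / c)) ** order, mu, sigma
)
bound = gam ** order * math.gamma(1.0 + order / c) * closed_form
print(numeric, closed_form, moment, bound)
```

For these values the numerically computed moment lies below the bound, as the lemma requires; the bound is not expected to be tight.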

Weibull-uniform{log-logistic} distribution:

Setting α = β = 1 in the PDF of the T-X{log-logistic} family in (3.1), and taking r to be the PDF of the Weibull distribution, r(t) = (c/γ)(t/γ)^{c - 1} exp{-(t/γ)^{c}}, and F to be the CDF of the uniform distribution, F(x) = (x - a)/(b - a), the Weibull-uniform{log-logistic} (WU{LL}) distribution is obtained as

g(x) = \frac{c(b - a)}{γ(b - x)^{2}} (\frac{x - a}{γ(b - x)})^{c - 1} exp\{-(\frac{x - a}{γ(b - x)})^{c}\}, a < x < b.

Plots of the WU{LL} density function for different parameter values are given in Figure 3. The graphs in Figure 3 show that the WU{LL} distribution can be reversed J-shaped, right skewed, left skewed or bimodal.

Lemma 4: The n^{th} moment of the WU{LL} random variable exists for any b > a, c > 0, γ > 0 and satisfies the inequality

E(X_{g}^{n}) ≤ \frac{b^{n + 1} - a^{n + 1}}{(n + 1)(b - a)} \{1 + γΓ(1 + 1/c)\}. (3.11)

Proof: The n^{th} moment for the uniform random variable is E(X_{f}^{n}) = \frac{b^{n + 1} - a^{n + 1}}{(n + 1)(b - a)}. The CDF of the standard log-logistic random variable is P(y) = y/(1 + y). Since the random variable T has a Weibull distribution with parameters c and γ, E({1 - P(T)}^{- 1}) = E(1 + T) = 1 + γ Γ(1 + 1/c). By using Theorem 1, the result in (3.11) follows. □
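The WU{LL} construction can be sketched numerically in the same way. With α = β = 1 the log-logistic quantile is λ/(1 - λ), so Q_{Y}(F(x)) = (x - a)/(b - x) and X_{g} = a + (b - a)T/(1 + T). The code below checks the resulting density against a numerical derivative of the CDF and compares a numerically computed moment with the bound of Lemma 4. Parameter values (with a ≥ 0, as Theorem 1 requires a non-negative X_{f}) and function names are illustrative assumptions.

```python
import math


def wull_cdf(x, a, b, c, gam):
    """G(x) = R((x - a) / (b - x)) with R the Weibull(c, gam) CDF."""
    t = (x - a) / (b - x)
    return 1.0 - math.exp(-((t / gam) ** c))


def wull_pdf(x, a, b, c, gam):
    """g(x) = f(x) r(F/(1-F)) / (1 - F(x))^2 with F uniform on (a, b)."""
    t = (x - a) / (b - x)
    r = (c / gam) * (t / gam) ** (c - 1) * math.exp(-((t / gam) ** c))
    return (b - a) / (b - x) ** 2 * r


def wull_moment(order, a, b, c, gam, m=100000):
    """Midpoint-rule approximation of E[(a + (b - a) T / (1 + T))^order],
    integrating over u in (0, 1) with T = Q_T(u) the Weibull quantile."""
    total = 0.0
    for i in range(m):
        u = (i + 0.5) / m
        t = gam * (-math.log(1.0 - u)) ** (1.0 / c)
        total += (a + (b - a) * t / (1.0 + t)) ** order
    return total / m


a, b, c, gam, order = 1.0, 3.0, 2.0, 1.5, 2        # illustrative values only
x0, h = 2.2, 1e-5
pdf_value = wull_pdf(x0, a, b, c, gam)
numeric_derivative = (wull_cdf(x0 + h, a, b, c, gam) - wull_cdf(x0 - h, a, b, c, gam)) / (2 * h)

moment = wull_moment(order, a, b, c, gam)
bound = (b ** (order + 1) - a ** (order + 1)) / ((order + 1) * (b - a)) \
    * (1.0 + gam * math.gamma(1.0 + 1.0 / c))
print(pdf_value, numeric_derivative, moment, bound)
```

Because X_{g} is confined to (a, b), its moments are trivially finite; the comparison simply illustrates that the inequality holds for one parameter setting.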

4. The family of T-X{Y} distributions based on survival functions

Instead of using the CDF F in (2.2), one can use the survival function \overline{F} and apply a similar method to generate a new family of distributions in terms of survival functions.

If \overline{P} and Q_{Y} are the survival and quantile functions of the random variable Y, then \overline{P}^{- 1}(F(x)) = Q_{Y}(1 - F(x)) = Q_{Y}(\overline{F}(x)). A new family T-X{Y} of distributions in terms of the survival function of X is defined as

G(x) = 1 - \int_{a}^{Q_{Y}(\overline{F}(x))} r(t) dt = 1 - R\{Q_{Y}(\overline{F}(x))\}, (4.1)

with corresponding PDF

g(x) = \frac{f(x)}{p\{Q_{Y}(\overline{F}(x))\}} r\{Q_{Y}(\overline{F}(x))\}. (4.2)

The family of life distributions introduced by Marshall and Olkin (1997) can be derived using (4.1) as follows. By using Q_{Y}(λ) = a(λ/(1 - λ))^{1/b}, the quantile function of the log-logistic distribution, and R(t) = ηt/(1 + ηt), the CDF of the log-logistic distribution with scale parameter 1/η, (4.1) can be written as

\overline{G}(x) = \frac{ηa\overline{F}^{1/b}(x)}{F^{1/b}(x) + ηa\overline{F}^{1/b}(x)}. (4.3)

Letting ηa = α and 1/b = β, (4.3) becomes \overline{G}(x) = \frac{α\overline{F}^{β}(x)}{F^{β}(x) + α\overline{F}^{β}(x)}, which reduces to Marshall-Olkin’s family in (1.2) when β = 1.
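The algebra behind this reduction can be checked numerically by composing R with Q_{Y} at the survival function and comparing with the closed forms. In the sketch below the survival function of the new family is taken to be R{Q_{Y}(\overline{F}(x))}, which is what the stated result implies; the function names and parameter values are assumptions for illustration.

```python
def survival_tx(Fx, eta, a, b):
    """R(Q_Y(1 - F)) with Q_Y(lam) = a (lam / (1 - lam))^(1/b) and R(t) = eta t / (1 + eta t)."""
    Fbar = 1.0 - Fx
    q = a * (Fbar / (1.0 - Fbar)) ** (1.0 / b)
    return eta * q / (1.0 + eta * q)


def survival_closed(Fx, alpha, beta):
    """alpha Fbar^beta / (F^beta + alpha Fbar^beta)."""
    Fbar = 1.0 - Fx
    return alpha * Fbar ** beta / (Fx ** beta + alpha * Fbar ** beta)


def survival_marshall_olkin(Fx, alpha):
    """Marshall-Olkin form: alpha Fbar / (1 - (1 - alpha) Fbar)."""
    Fbar = 1.0 - Fx
    return alpha * Fbar / (1.0 - (1.0 - alpha) * Fbar)


eta, a, b = 0.7, 2.0, 0.5                 # illustrative values only
alpha, beta = eta * a, 1.0 / b
grid = [i / 10 for i in range(1, 10)]     # values of F(x) in (0, 1)

err_general = max(abs(survival_tx(F, eta, a, b) - survival_closed(F, alpha, beta)) for F in grid)
err_mo = max(abs(survival_closed(F, alpha, 1.0) - survival_marshall_olkin(F, alpha)) for F in grid)
print(err_general, err_mo)
```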

The following theorem gives the relation between the moments of the random variables defined in (2.4) and (4.2) when the PDF f is symmetric.

Theorem 3: Let E_{1}(X_{g}^{n}) and E_{2}(X_{g}^{n}) denote the n^{th} moments of the random variables in (2.4) and (4.2) respectively. If f is symmetric, then

E_{2}(X_{g}^{n}) = \sum_{k = 0}^{n} (-1)^{k} \binom{n}{k} (2m_{f})^{n - k} E_{1}(X_{g}^{k}),

where m_{f} is the median of the random variable X_{f} with PDF f.

Proof: The n^{th} moments of (2.4) and (4.2) are E_{1}(X_{g}^{n}) = \int_{0}^{1} \frac{\{F^{- 1}(v)\}^{n}}{p\{Q_{Y}(v)\}} r\{Q_{Y}(v)\} dv and E_{2}(X_{g}^{n}) = \int_{0}^{1} \frac{\{F^{- 1}(1 - u)\}^{n}}{p\{Q_{Y}(u)\}} r\{Q_{Y}(u)\} du, where the substitutions v = F(x) and u = 1 - F(x) are applied to (2.4) and (4.2) respectively. When f is symmetric, F^{- 1}(1 - u) = 2m_{f} - F^{- 1}(u), u∈ [0, 1]. By using the binomial theorem,

E_{2}(X_{g}^{n}) = \int_{0}^{1} \frac{\{2m_{f} - F^{- 1}(u)\}^{n}}{p\{Q_{Y}(u)\}} r\{Q_{Y}(u)\} du = \sum_{k = 0}^{n} (-1)^{k} \binom{n}{k} (2m_{f})^{n - k} E_{1}(X_{g}^{k}). □

Theorem 4: Let X_{f} be a non-negative random variable with PDF f, and let E(X_{f}^{n}) denote the n^{th} moment of X_{f}, then

E(X_{g}^{n}) ≤ E(X_{f}^{n}) E\{(P(T))^{- 1}\},

where E(X_{g}^{n}) is the n^{th} moment of the random variable with density in (4.2), P is the CDF of Y, and T is the random variable with PDF r.

Proof: The proof is similar to the proof of Theorem 1 after noting that the relation between the random variables X_{g} in (4.2) and T is given by X_{g} = F^{- 1}(\overline{P}(T)). □

Similar to Theorem 2, Shannon’s entropy η_{X_{g}} for the random variable X_{g} with PDF in (4.2) is given in the following theorem.

Theorem 5: Let X_{g} be a random variable with density in (4.2). Shannon’s entropy η_{X_{g}} is given by η_{X_{g}} = E\{log q_{X_{f}}(\overline{P}(T))\} + E\{log p(T)\} + η_{T}.

Proof: The proof is similar to that of Theorem 2. □

5. Application

In this section, we apply the NW{C} distribution to fit two data sets. The first data set is the famous Old Faithful Geyser eruption data (n = 272) obtained from Härdle (1991, p. 201). The data are the durations of eruptions (in minutes) recorded from August 1 to August 15, 1985 (Dekking et al. 2005). The second data set is the USS Halfbeak diesel engine data (n = 71) studied by Ascher and Feingold (1984, p. 75) and Meeker and Escobar (1998, p. 415). The data are the times of unscheduled maintenance actions for the USS Halfbeak number 4 main propulsion diesel engine over 25.518 operating hours (Meeker and Escobar 1998, p. 415).

5.1 The famous Old Faithful Geyser eruption data

As shown in Figure 4, the data have two distinct modes. A common approach for fitting such bimodal data is to use mixture distributions. Arellano-Valle et al. (2010) applied the flexible epsilon-skew-normal distribution to fit the data and their fit is the same as that of the mixture-normal distribution. Four distributions, a four-parameter NW{C} in (3.8), a five-parameter NW{C}, mixture normal, and beta-normal, are fitted to the data by maximum likelihood. Table 2 contains the estimates, standard errors of the estimates, log-likelihood values, AIC, K-S test statistics and the corresponding p-values.

The results in Table 2 indicate that the five-parameter NW{C} provides the best fit followed by mixture normal based on all three measures, log-likelihood, AIC and K-S statistic. When bimodality is a population characteristic, it may be more appropriate to fit the data with one distribution, instead of fitting the data by the mixture of two distributions. The NW{C} distribution can fit well a wide variety of distribution shapes, including bimodal data such as Old Faithful Geyser eruption data. Figure 4 displays the estimated PDF of the distributions that provide adequate fit to the data.

From Figure 4, the addition of the fifth parameter, which is a measure of location, has some effect on the fit. By using either a likelihood ratio test or the Wald test for the significance of the parameter δ, we observe that the parameter is significantly different from zero. Thus, a five-parameter NW{C} (and not a four-parameter NW{C}) should be used to fit the bimodal data. According to Johnson et al. (1994, p. 12), four-parameter distributions should be sufficient for most practical purposes. The authors went on to state: “… but it is doubtful whether the improvement obtained by including a fifth or sixth parameter is commensurate with the extra labor involved”. For this application, adding the fifth parameter to the NW{C} improves the fit with an increase of more than 22 points in the log-likelihood value. Furthermore, the fifth parameter δ is significantly different from zero.
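To make the size of this improvement concrete, a likelihood ratio statistic can be formed from the reported gain in log-likelihood. The sketch below assumes the usual large-sample chi-square approximation with one degree of freedom and uses 22 as a lower bound for the gain, since only "more than 22 points" is stated; it is not a reproduction of the test computed for Table 2.

```python
import math


def chi2_sf_1df(x):
    """Upper tail probability of a chi-square variable with 1 degree of freedom.
    Uses P(chi2_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


loglik_gain = 22.0                 # lower bound on the log-likelihood increase (from the text)
lr_stat = 2.0 * loglik_gain        # likelihood ratio statistic for H0: delta = 0
p_upper = chi2_sf_1df(lr_stat)     # upper bound on the asymptotic p-value
print(lr_stat, p_upper)
```

Under these assumptions the p-value is far below conventional significance levels, in line with the conclusion that δ differs significantly from zero.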

5.2 USS Halfbeak diesel engine data

Marciano et al. (2012) used the data to illustrate the application of the McDonald-gamma distribution (Mc-ΓD). The distribution of the data is highly skewed to the left and platykurtic (skewness = -1.576 and kurtosis = 1.653). The MLEs (with corresponding standard errors in parentheses) of the parameters of the NW{C} distribution, together with the AIC, the log-likelihood value, the K-S statistic and the corresponding p-value, are given in Table 3. The values of AIC, log-likelihood and K-S statistics for the Mc-ΓD and the Kumaraswamy-gamma distribution (Kw-ΓD) are taken from Marciano et al. (2012). The other results in Table 3 are obtained by using the NLMIXED procedure in SAS and the MATLAB software.

The NW{C} distribution has the smallest AIC and K-S statistics, and the largest log-likelihood value, which indicates NW{C} is superior to the other distributions in Table 3. Figure 5 displays the estimated PDF of the NW{C}, Mc-Γ and Kw-Γ distributions. The figure shows that the NW{C} distribution provides the best fit to the data compared to other distributions.

6. Conclusions

This paper presents a method to generate the T-X(W) families of distributions introduced in Alzaatreh et al. (2013b) by defining the W function using the quantile function of another random variable Y. Table 1 contains some T-X{Y} families based on different quantile functions. The T-X{Y} framework provides an easy way of generating distributions of the T-X(W) family. Existing methods for generating univariate continuous distributions, such as the methods of combination reviewed in Lee et al. (2013), can be derived using the T-X{Y} framework. T-X{exponential}, T-X{Weibull} and T-X{Rayleigh} can be viewed as families of distributions arising from hazard functions. The T-X{Y} family is extended by using the survival function of X. The family of life distributions derived by Marshall and Olkin (1997) can be derived using the T-X{Y} family based on survival functions. Three new distributions in the family are defined: the normal-Weibull{Cauchy}, normal-Weibull{logistic} and Weibull-uniform{log-logistic} distributions. These distributions are very flexible and are capable of fitting various types of data. The Old Faithful Geyser eruption data are used to illustrate that the NW{C} distribution fits bimodal data very well; such data typically can only be adequately fitted using mixture distributions.
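The construction summarized here, a CDF of the form R{Q_Y(F(x))}, can be sketched for the normal-Weibull{Cauchy} case. The sketch assumes a standard Cauchy quantile function and standard parameterizations of the normal and Weibull distributions; the function and parameter names (`shape`, `scale`, `mu`, `sigma`) are illustrative and need not match the five-parameter form fitted in Section 5.

```python
import math

def weibull_cdf(x, shape, scale):
    """CDF F of X ~ Weibull(shape, scale); zero for x <= 0."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def cauchy_quantile(p):
    """Quantile function Q_Y of the standard Cauchy distribution."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return math.tan(math.pi * (p - 0.5))

def normal_cdf(t, mu, sigma):
    """CDF R of T ~ Normal(mu, sigma)."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

def nw_cauchy_cdf(x, shape, scale, mu, sigma):
    """G(x) = R{Q_Y(F(x))} with T normal, X Weibull, Y standard Cauchy."""
    return normal_cdf(cauchy_quantile(weibull_cdf(x, shape, scale)), mu, sigma)

# Evaluate the CDF on a few points for arbitrary illustrative parameters.
for x in (0.5, 1.0, 2.0):
    print(x, nw_cauchy_cdf(x, shape=2.0, scale=1.0, mu=0.0, sigma=1.0))
```

The Cauchy quantile maps (0, 1) onto the whole real line, which matches the support of the normal T; this is the role the W function plays in the T-X(W) family.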

References

Akinsete A, Famoye F, Lee C: The beta-Pareto distribution. Statistics 2008, 42: 547–563. 10.1080/02331880801983876

Alzaatreh A, Lee C, Famoye F: On the discrete analogues of continuous distributions. Statistical Methodology 2012b, 9: 589–603. 10.1016/j.stamet.2012.03.003

Alzaatreh A, Famoye F, Lee C: Weibull-Pareto distribution and its applications. Communications in Statistics-Theory and Methods 2013a, 42: 1673–1691. 10.1080/03610926.2011.599002

Arellano-Valle RB, Cortés MA, Gómez HW: An extension of the epsilon-skew-normal distribution. Communications in Statistics-Theory and Methods 2010, 39(5):912–922. 10.1080/03610920902807903

Cordeiro GM, Lemonte AJ: The β-Birnbaum-Saunders distribution: an improved distribution for fatigue life modeling. Computational Statistics and Data Analysis 2011, 55(3):1445–1461. 10.1016/j.csda.2010.10.007

Cordeiro GM, Ortega EMM, Nadarajah S: The Kumaraswamy Weibull distribution with application to failure data. J Franklin Inst 2010, 347: 1399–1429. 10.1016/j.jfranklin.2010.06.010

Eugene N, Lee C, Famoye F: The beta-normal distribution and its applications. Communications in Statistics-Theory and Methods 2002, 31(4):497–512. 10.1081/STA-120003130

Famoye F, Lee C, Eugene N: Beta-normal distribution: bimodality properties and applications. Journal of Modern Applied Statistical Methods 2004, 3(1):85–103.

Ferreira JTAS, Steel MFJ: A constructive representation of univariate skewed distributions. J Am Stat Assoc 2006, 101: 823–829. 10.1198/016214505000001212

Jones MC: Kumaraswamy’s distribution: a beta-type distribution with tractability advantages. Statistical Methodology 2009, 6: 70–81. 10.1016/j.stamet.2008.04.001

Lee C, Famoye F, Alzaatreh A: Methods for generating families of univariate continuous distributions in the recent decades. WIREs Computational Statistics 2013, 5: 219–238. 10.1002/wics.1255

Marciano FWP, Nascimento ADC, Santos-Neto M, Cordeiro GM: The Mc-Γ distribution and its statistical properties: an application to reliability data. International Journal of Statistics and Probability 2012, 1(1):53–71.

Marshall AW, Olkin I: A new method for adding a parameter to a family of distributions with applications to the exponential and Weibull families. Biometrika 1997, 84: 641–652. 10.1093/biomet/84.3.641

Pearson K: Contributions to the mathematical theory of evolution. II. Skew variation in homogeneous material. Philos Trans R Soc Lond A 1895, 186: 343–414. 10.1098/rsta.1895.0010

Tukey JW: The Practical Relationship Between the Common Transformations of Percentages of Counts and Amounts. Technical Report 36. Princeton University, Princeton, NJ, Statistical Techniques Research Group; 1960.

We are grateful for many constructive comments and suggestions from the associate editor and the two referees. These comments and suggestions have greatly improved the presentation of the paper.

Author information

Authors and Affiliations

Department of Mathematics, Central Michigan University, Mt. Pleasant, MI, 48859, USA

The authors declare that they have no competing interests.

Authors’ contributions

The authors, viz MAA, CL and FF with the consultation of each other carried out this work and drafted the manuscript together. All authors read and approved the final manuscript.

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Aljarrah, M.A., Lee, C. & Famoye, F. On generating T-X family of distributions using quantile functions. J Stat Distrib App 1, 2 (2014). https://doi.org/10.1186/2195-5832-1-2