On generating T-X family of distributions using quantile functions
Journal of Statistical Distributions and Applications volume 1, Article number: 2 (2014)
Abstract
The cumulative distribution function (CDF) of the T-X family is given by R{W(F(x))}, where R is the CDF of a random variable T, F is the CDF of X and W is an increasing function defined on [0, 1] having the support of T as its range. This family provides a new method of generating univariate distributions. Different choices of the R, F and W functions naturally lead to different families of distributions. This paper proposes the use of quantile functions to define the W function. Some general properties of this T-X system of distributions are studied. It is shown that several existing methods of generating univariate continuous distributions can be derived using this T-X system. Three new distributions of the T-X family are derived, namely, the normal-Weibull based on the quantile of the Cauchy distribution, the normal-Weibull based on the quantile of the logistic distribution, and the Weibull-uniform based on the quantile of the log-logistic distribution. Two real data sets are used to illustrate the flexibility of the distributions.
1. Introduction
Statistical distributions are important for parametric inferences and applications to fit real world phenomena. Many methods have been developed to generate statistical distributions in the literature. Some well-known early methods for generating univariate continuous distributions include methods based on differential equations developed by Pearson (1895), methods of translation developed by Johnson (1949), and the methods based on quantile functions developed by Tukey (1960). Interest in developing new methods for generating new or more flexible distributions has remained active in recent decades. Lee et al. (2013) indicated that the majority of methods developed after the 1980s are methods of ‘combination’, because they are based on the idea of combining two existing distributions or adding extra parameters to an existing distribution to generate a new family of distributions. A brief summary of some methods in the literature that are related to the method proposed in this article is provided.
McDonald (1984) introduced the generalized beta distributions of the first and second kinds (GB1 and GB2). Subsequently, a further generalization named the generalized beta distribution (GBD) was given by McDonald and Xu (1995), which consists of more than 30 special cases or limiting distributions of GBD, including GB1 and GB2.
Azzalini (1985) introduced a family of skew-normal distributions, SN(λ), defined as g(x; λ) = 2ϕ(x)Φ(λx), where ϕ and Φ are the probability density function (PDF) and cumulative distribution function (CDF) of N(0, 1), respectively. The skewness is characterized by the parameter λ. For a review of skew-symmetric distributions, one may refer to Kotz and Vicari (2005). Ferreira and Steel (2006) introduced a general framework for generating a family of skewed distributions based on a symmetric distribution. The PDF of the new family has the form
$$g(x) = f(x)\,p(F(x)), \tag{1.1}$$
where F is the CDF of a symmetric PDF f and p is a skewed PDF defined on [0, 1].
Marshall and Olkin (1997) proposed a general method for generating a new family of life distributions defined in terms of the survival function as
$$\overline{G}(x) = \frac{\alpha\,\overline{F}(x)}{1 - \overline{\alpha}\,\overline{F}(x)}, \quad \alpha > 0, \tag{1.2}$$
where $\overline{\alpha} = 1 - \alpha$ and $\overline{F} = 1 - F$ is the survival function of the random variable X. For details about life distributions, one may refer to Marshall and Olkin (2010) and Lai (2013).
Eugene et al. (2002) proposed the beta-generated family of distributions, where the beta distribution with PDF b is used as the generator. The CDF of the beta-generated distribution is defined as $G(x) = \int_{0}^{F(x)} b(t)\,dt$, where F is the CDF of any random variable. If X is continuous, the corresponding PDF of the beta-generated distribution is
$$g(x) = \frac{f(x)}{B(\alpha, \beta)}\,F^{\alpha - 1}(x)\,\{1 - F(x)\}^{\beta - 1}, \tag{1.3}$$
where B(α, β) is the beta function. The PDF in (1.3) can be considered as a generalization of the distribution of order statistics (Eugene et al. 2002; Jones 2004). Many researchers have studied the beta-generated distributions and their applications by applying different F in Equation (1.3). Examples include Famoye et al. (2004), Akinsete et al. (2008), Cordeiro and Lemonte (2011), and Alshawarbeh et al. (2012).
Jones (2009) and Cordeiro and de Castro (2011) extended the beta-generated family of distributions by using the Kumaraswamy distribution b(t) = αβ t^{α − 1}(1 − t^{α})^{β − 1}, t ∈ (0, 1) (Kumaraswamy 1980), instead of the beta distribution. The PDF for the Kumaraswamy-generated (Kw-G) family of distributions is defined by
$$g(x) = \alpha\beta\,f(x)\,F^{\alpha - 1}(x)\,\{1 - F^{\alpha}(x)\}^{\beta - 1}. \tag{1.4}$$
Some examples of the Kw-G distributions are the Kw-Weibull (Cordeiro et al. 2010) and Kw-Gumbel (Cordeiro et al. 2011).
Recently, Alexander et al. (2012) studied the generalized beta-X family by considering b(t) = cB(a, b)^{−1}t^{ac − 1}(1 − t^{c})^{b − 1}, 0 < t < 1, the generalized beta distribution of the first kind introduced by McDonald (1984). The new family is called the generalized beta-generated (GBG) family of distributions. The PDF for the GBG family of distributions is given by
$$g(x) = \frac{c}{B(a, b)}\,f(x)\,F^{ac - 1}(x)\,\{1 - F^{c}(x)\}^{b - 1}. \tag{1.5}$$
When c = 1, the family in (1.5) reduces to the beta-X family in (1.3), and when a = 1, (1.5) reduces to the Kw-G family in (1.4).
Alzaatreh et al. (2013b) proposed a general method by replacing the beta PDF with a PDF r of a continuous random variable and applying a function W(F(x)) that satisfies some conditions (given in (2.1)) to develop the T-X family. The CDF of the T-X family is defined as
$$G(x) = \int_{a}^{W(F(x))} r(t)\,dt = R\{W(F(x))\}, \tag{1.6}$$
where R is the CDF of T. The corresponding PDF (if it exists) of the T-X family of distributions is
$$g(x) = \left\{\frac{d}{dx} W(F(x))\right\} r\{W(F(x))\}. \tag{1.7}$$
Different W functions generate different families of T-X distributions. Two continuous distributions of the T-X families that have been studied are the Gamma-Pareto distribution (Alzaatreh et al. 2012a) and the Weibull-Pareto distribution (Alzaatreh et al. 2013a). When X is discrete, the resulting T-X family is discrete. The T-geometric family generates the discrete analogue to the distribution of any continuous random variable T (Alzaatreh et al. 2012b). For a review of methods for generating univariate continuous distributions, one may refer to Lee et al. (2013).
The T-X family provides a new method to generate distributions by using the function W. A large number of distributions, continuous and discrete, can be generated by applying any two existing univariate distributions based on this method. Alzaatreh et al. (2013b) gave several choices of W(λ), including −log(1 − λ), λ/(1 − λ), log(λ/(1 − λ)) and log(−log(1 − λ)). It is clear that there are other choices that can be defined to generate different T-X families. Is there a systematic approach to define the W function for the T-X family? This question will be addressed in this paper.
In Section 2, a method to define the W function for generating T-X families of continuous probability distributions is presented. The W functions defined in Alzaatreh et al. (2013b) are special cases of the general approach. In order to distinguish between the previous T-X family proposed by Alzaatreh et al. (2013b) and the method proposed in this article, we use the abbreviation T-X(W) family for the previous T-X family and the abbreviation T-X{Y} for the family defined in Section 2. In Section 3, some properties of the new families are studied. The relationship between the new families and some existing families is given. Also in Section 3, the normal-Weibull distribution based on the quantile function of the Cauchy distribution, the normal-Weibull distribution based on the quantile function of the logistic distribution and the Weibull-uniform distribution based on the quantile function of the log-logistic distribution are defined. Some properties of these three distributions are derived. In Section 4, a general family of life distributions based on the survival function, using a methodology similar to that of the T-X{Y} family, is presented. Some properties of the family are investigated. In Section 5, two real data sets are used to illustrate the flexibility of the T-X{Y} family of distributions. Conclusions are given in Section 6.
2. Generating families of continuous probability distributions using quantile function
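The T-X(W) family of distributions in (1.6) is generated by using the function W which satisfies the following conditions (Alzaatreh et al. 2013b):
$$\begin{cases} W(F(x)) \in [a, b], \\ W(F(x)) \text{ is differentiable and monotonically non-decreasing}, \\ W(F(x)) \to a \text{ as } x \to -\infty \text{ and } W(F(x)) \to b \text{ as } x \to \infty, \end{cases} \tag{2.1}$$
where [a, b] is the support of the random variable T for −∞ ≤ a < b ≤ ∞.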
In this section, a class of W functions wider than the one defined in (2.1) will be considered to define a T-X family. Let W : (0, 1) → (a, b), for −∞ ≤ a < b ≤ ∞, be a right-continuous and non-decreasing function such that $\lim_{\lambda \to 0^{+}} W(\lambda) = a$ and $\lim_{\lambda \to 1^{-}} W(\lambda) = b$. Then the composition G(x) = R{W(F(x))}, x ∈ (−∞, ∞), is a distribution function, because it satisfies the following required conditions for a distribution function:

(a) G is non-decreasing,

(b) G is right-continuous,

(c) G(x) → 0 as x → −∞ and G(x) → 1 as x → ∞.
If T has a PDF r with support (a, b), then
$$G(x) = \int_{a}^{W(F(x))} r(t)\,dt. \tag{2.2}$$
Note that if both functions W and F are absolutely continuous, then G in (2.2) is absolutely continuous and has a density function $g(x) = \frac{d}{dx} G(x)$.
A general method to define the W function for generating T-X families is now proposed. It is assumed that the random variable T has support on the interval (a, b). Let P be the CDF of the random variable Y taking values on (a, b), and define the quantile function of the distribution P by
$$Q_Y(\lambda) = \inf\{y : P(y) \ge \lambda\}, \quad 0 < \lambda < 1.$$
If P is continuous and strictly increasing then Q_Y = P^{−1} is continuous and strictly increasing (Shorack and Wellner 1986). We take W to be the quantile function of a strictly increasing distribution function P for the random variable Y, namely, W(λ) = Q_Y(λ), λ ∈ (0, 1). Then Q_Y is continuous and non-decreasing, and the CDF of a T-X{Y} family using the quantile function Q_Y is defined as
$$G(x) = \int_{a}^{Q_Y(F(x))} r(t)\,dt = R\{Q_Y(F(x))\}. \tag{2.3}$$
If we assume further that Y has a density p(y) > 0 for all y in a neighborhood of Q_Y(λ) where λ ∈ (0, 1), then $\frac{d}{d\lambda} Q_Y(\lambda)$ exists and equals [p(Q_Y(λ))]^{−1} (Shorack and Wellner 1986), and hence the corresponding PDF associated with (2.3) is
$$g(x) = \frac{f(x)}{p\{Q_Y(F(x))\}}\; r\{Q_Y(F(x))\}. \tag{2.4}$$
Note that the PDF defined in (2.4) can be easily used to generate a T-X{Y} family of distributions by applying the quantile function of any existing distribution.
The notation X sometimes represents the random variable with PDF f and sometimes represents the random variable with PDF g. Where there may be confusion, the notations X_f for the random variable X with PDF f and X_g for the random variable X with PDF g are used. The term moment refers to non-central moment, unless otherwise specified.
Lemma 1:

(a) If the random variables X_f and Y have the same distribution with the same parameters, then G = R.

(b) If the random variables T and Y have the same distribution with the same parameters, then G = F.
Proof: The proofs of (a) and (b) follow from definition (2.4). □
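The construction can be illustrated with a short numerical sketch. It assumes the T-X{Y} density has the form g(x) = f(x) r{Q_Y(F(x))} / p{Q_Y(F(x))}, which is what differentiating R{Q_Y(F(x))} gives; the helper names (`tx_pdf`, `integrate`) and all parameter values are illustrative choices, not part of the paper. The sketch checks Lemma 1(b) with T and Y both standard exponential, and checks that a gamma-distributed T yields a density that integrates to one.

```python
import math

def tx_pdf(f, F, r, p, Q):
    """Return g(x) = f(x) * r(Q(F(x))) / p(Q(F(x))), the assumed T-X{Y} density."""
    def g(x):
        t = Q(F(x))
        return f(x) * r(t) / p(t)
    return g

# X_f ~ Weibull with shape c and scale gam (illustrative values)
c, gam = 2.0, 1.5
F = lambda x: 1.0 - math.exp(-((x / gam) ** c))
f = lambda x: (c / gam) * (x / gam) ** (c - 1) * math.exp(-((x / gam) ** c))

# Y ~ standard exponential: density and quantile function
p_exp = lambda y: math.exp(-y)
Q_exp = lambda lam: -math.log(1.0 - lam)

# Case 1: T has the same distribution as Y, so Lemma 1(b) predicts g = f
g_same = tx_pdf(f, F, p_exp, p_exp, Q_exp)

# Case 2: T ~ gamma(shape 2, scale 1) gives a different density on (0, inf)
r_gamma = lambda t: t * math.exp(-t)
g_new = tx_pdf(f, F, r_gamma, p_exp, Q_exp)

def integrate(h, lo, hi, n=20000):
    """Midpoint rule on [lo, hi] with n equal cells."""
    step = (hi - lo) / n
    return step * sum(h(lo + (i + 0.5) * step) for i in range(n))

# The mass beyond x = 8 is negligible for these parameter values
total_mass = integrate(g_new, 0.0, 8.0)
```

Passing the five ingredients as plain functions keeps the sketch independent of any particular statistics library; any other choice of T, X and Y can be tried by swapping the lambdas.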
Some properties of the T-X{Y} family:

1. Any PDF f can be represented as the PDF defined in (2.4) by considering Q_Y = R^{−1}.

2. The support of the new random variable defined in (2.4) is the same as the support of the random variable with PDF f.

3. If the support of the random variable Y is [c, d] with [a, b] ⊂ [c, d], then the PDF in (2.4) is defined with support [F^{−1}(P(a)), F^{−1}(P(b))].

4. The relationship between the random variable X_g with PDF in (2.4) and T is given by T = Q_Y(F(X_g)) and hence, X_g = F^{−1}(P(T)) when F^{−1} exists, where P is the CDF of Y with the corresponding quantile function Q_Y. Using this relation, one can generate the random variable X_g by generating the random variable T and then computing X_g = F^{−1}(P(T)). Similarly, one can compute the moments of X_g by using $E(X_g^n) = E\{[F^{-1}(P(T))]^n\}$.

5. The hazard function, $h_g(x) = g(x)/\overline{G}(x)$, for the random variable X_g in (2.4) is given by $h_g(x) = \frac{f(x)}{p\{Q_Y(F(x))\}}\, h_r\{Q_Y(F(x))\}$, where h_r is the hazard function for the random variable T with PDF r.

6. The quantile function of the new random variable in (2.4) is given by $Q_{X_g}(\lambda) = Q_{X_f}\{P(Q_T(\lambda))\}$, λ ∈ (0, 1), where $Q_{X_f}$ and Q_T are the quantile functions of the random variables X_f and T respectively.
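Properties 4 and 6 translate directly into a simulation recipe. The following is a minimal sketch, assuming T is normal, Y is standard Cauchy and X_f is Weibull; the function names, the seed and the parameter values are hypothetical choices made for illustration. It draws X_g = F^{−1}(P(T)) and compares the sample median with the quantile formula of property 6.

```python
import math
import random
from statistics import NormalDist, median

def sample_nw_cauchy(n, mu, sigma, c, gamma, seed=0):
    """Draw n values of X_g = F^{-1}(P(T)) with T ~ N(mu, sigma^2),
    P the standard Cauchy CDF and F the Weibull(shape c, scale gamma) CDF."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        t = rng.gauss(mu, sigma)
        u = 0.5 + math.atan(t) / math.pi                       # P(T)
        draws.append(gamma * (-math.log1p(-u)) ** (1.0 / c))   # Weibull quantile at u
    return draws

def quantile_nw_cauchy(lam, mu, sigma, c, gamma):
    """Property 6: Q_{X_g}(lam) = Q_{X_f}(P(Q_T(lam)))."""
    t = NormalDist(mu, sigma).inv_cdf(lam)
    u = 0.5 + math.atan(t) / math.pi
    return gamma * (-math.log1p(-u)) ** (1.0 / c)

xs = sample_nw_cauchy(20000, mu=0.0, sigma=1.0, c=2.0, gamma=1.0)
empirical_median = median(xs)
theoretical_median = quantile_nw_cauchy(0.5, 0.0, 1.0, 2.0, 1.0)
```

With μ = 0 the Cauchy CDF at the normal median equals 1/2, so the theoretical median reduces to the Weibull median γ(log 2)^{1/c}, which gives a quick sanity check on the sampler.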
3. Some T-X{Y} families and properties
3.1 Some T-X{Y} families based on different quantile functions
The quantile function of a random variable Y may not have an explicit representation. However, many existing continuous random variables have CDFs that are one-to-one functions with explicit quantile functions. The quantile functions of these random variables can be used to generate new T-X{Y} families. The following example illustrates how to derive the T-X{Y} family of distributions.
Example: T-X{log-logistic} family:
Let the random variable Y follow the log-logistic distribution with parameters α and β. The PDF and quantile function are, respectively, p(y) = (β/α)(y/α)^{β − 1}/(1 + (y/α)^{β})^{2}, y ≥ 0, and Q_Y(λ) = α(λ/(1 − λ))^{1/β}, λ ∈ (0, 1). Therefore, p(Q_Y(λ)) = (β/α)λ^{(β − 1)/β}(1 − λ)^{(β + 1)/β}, and the definition in (2.4) gives the PDF of the T-X{log-logistic} family as
$$g(x) = \frac{\alpha}{\beta}\,\frac{f(x)}{F^{(\beta - 1)/\beta}(x)\,\{1 - F(x)\}^{(\beta + 1)/\beta}}\; r\!\left\{\alpha\left(\frac{F(x)}{1 - F(x)}\right)^{1/\beta}\right\}. \tag{3.1}$$
When α = β = 1, the family in (3.1) reduces to $g(x) = \frac{f(x)}{(1 - F(x))^{2}}\, r\{F(x)/(1 - F(x))\}$. This PDF can be written in terms of the hazard and survival functions of X_f as $g(x) = \frac{h_f(x)}{\overline{F}(x)}\, r\{(1 - \overline{F}(x))/\overline{F}(x)\}$, where h_f is the hazard function and $\overline{F}$ is the survival function of the random variable X_f.
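The key algebraic step in this example, evaluating the log-logistic density at its own quantile function, is easy to confirm numerically. The sketch below uses illustrative parameter values and hypothetical function names; it compares p(Q_Y(λ)) computed by direct composition with the closed form stated in the example.

```python
def loglogistic_pdf(y, alpha, beta):
    """p(y) = (beta/alpha) (y/alpha)^(beta-1) / (1 + (y/alpha)^beta)^2."""
    z = (y / alpha) ** beta
    return (beta / alpha) * (y / alpha) ** (beta - 1) / (1.0 + z) ** 2

def loglogistic_quantile(lam, alpha, beta):
    """Q_Y(lam) = alpha (lam / (1 - lam))^(1/beta)."""
    return alpha * (lam / (1.0 - lam)) ** (1.0 / beta)

def pdf_at_quantile_closed_form(lam, alpha, beta):
    """(beta/alpha) lam^((beta-1)/beta) (1-lam)^((beta+1)/beta)."""
    return (beta / alpha) * lam ** ((beta - 1) / beta) * (1.0 - lam) ** ((beta + 1) / beta)

alpha, beta = 2.0, 3.0  # illustrative values only
grid = (0.05, 0.25, 0.5, 0.75, 0.95)
max_gap = max(
    abs(loglogistic_pdf(loglogistic_quantile(lam, alpha, beta), alpha, beta)
        - pdf_at_quantile_closed_form(lam, alpha, beta))
    for lam in grid
)
```

For α = β = 1 the closed form collapses to (1 − λ)^2, which is the factor appearing in the denominator of the reduced density.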
Table 1 lists the probability density functions of some T-X{Y} families based on different quantile functions. Each family g is based on a given quantile function that defines many subfamilies of distributions.
Common supports of random variables X_f and T are [0, 1], (0, ∞), or (−∞, ∞). Beta-X, Kw-G and GBG are T-X{uniform} families with T being defined on [0, 1]. The W functions given in Alzaatreh et al. (2013b) can be defined by the quantile functions of the random variable Y as follows: W(λ) = −log(1 − λ), λ ∈ (0, 1), is the quantile function of the standard exponential distribution, W(λ) = λ/(1 − λ) is the quantile function of the log-logistic distribution with parameters α = β = 1, and W(λ) = log(λ/(1 − λ)) is the quantile function of the logistic distribution with scale parameter b = 1 and location parameter a = 0. Many other W functions can be defined by using the quantile function approach. The T-X(W) families defined in Alzaatreh et al. (2013b) derive their parameters from the random variables T and X and none from the W function. The T-X(W) can be derived through the T-X{Y} framework by noting that the W function is the quantile function for the random variable Y. One advantage of using the T-X{Y} framework is that one can keep one or more parameters from the distribution of Y. In particular, keeping a shape parameter from Y can add more flexibility to the new distribution.
3.2 Some properties of T-X{Y} families
In the following, we assume (if necessary) the mentioned expectations exist and are finite.
Theorem 1: Let X_f be a non-negative random variable with PDF f(x), and let $E(X_f^n)$ denote the nth moment of X_f. Then
$$E(X_g^n) \le E(X_f^n)\, E\{(\overline{P}(T))^{-1}\},$$
where $E(X_g^n)$ is the nth moment of the random variable with density in (2.4), $\overline{P} = 1 - P$ is the survival function of the CDF P, and T is the random variable with PDF r.
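Proof: By definition,
$$E(X_f^n) = \int_{0}^{\infty} u^n f(u)\,du \ge \int_{x}^{\infty} u^n f(u)\,du \ge x^n \{1 - F(x)\}, \quad x \ge 0.$$
Hence,
$$x^n \le \frac{E(X_f^n)}{1 - F(x)}. \tag{3.2}$$
Using property (4) in Section 2 and (3.2) yields
$$E(X_g^n) \le E(X_f^n)\, E\{(1 - F(X_g))^{-1}\} = E(X_f^n)\, E\{(1 - P(T))^{-1}\}. \; \square$$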
The entropy of a random variable is a measure of the variation of uncertainty. Shannon’s (1948) entropy of the random variable X with density g is defined as E{−log(g(X))}.
Theorem 2: Shannon’s entropy $\eta_{X_g}$ of X_g is given by
$$\eta_{X_g} = E\{\log q_{X_f}(P(T))\} + E\{\log p(T)\} + \eta_T,$$
where η_T is Shannon’s entropy for the random variable T with PDF r, and $q_{X_f}$ is the quantile density function of X_f.
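Proof: By definition,
$$\eta_{X_g} = E\{-\log g(X_g)\} = E\{-\log f(X_g)\} + E\{\log p(Q_Y(F(X_g)))\} + E\{-\log r(Q_Y(F(X_g)))\}.$$
Note that the random variable T = Q_Y{F(X_g)} has the PDF r, and X_g = F^{−1}{P(T)}. Thus,
$$\eta_{X_g} = E\{-\log f(F^{-1}(P(T)))\} + E\{\log p(T)\} + \eta_T,$$
and the result follows because the quantile density function satisfies $q_{X_f}(\lambda) = 1/f(F^{-1}(\lambda))$. □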
3.3 Relationship between T-X{Y} family and some existing families of distributions
Many existing families of distributions can be generated by using the quantile function approach defined in (2.4). Four examples are given in the following.
Generalized beta-generated (GBG) family introduced by Alexander et al. (2012):
The GBG family is the generalized beta-X{uniform} family. It can also be derived as follows: By setting α = β = 1 and p = 1/c in the Dagum quantile function in Table 1, the PDF of the T-X{Dagum} family is given by
$$g(x) = \frac{c\,f(x)\,F^{c - 1}(x)}{\{1 - F^{c}(x)\}^{2}}\; r\!\left\{\frac{F^{c}(x)}{1 - F^{c}(x)}\right\}. \tag{3.3}$$
By taking r in (3.3) to be r(t) = {B(a, b)}^{−1}t^{a − 1}(1 + t)^{−b − a}, t > 0, which is the PDF of the inverted beta random variable, the family of probability density functions is obtained as
$$g(x) = \frac{c}{B(a, b)}\,f(x)\,F^{ac - 1}(x)\,\{1 - F^{c}(x)\}^{b - 1}. \tag{3.4}$$
The family in (3.4) is the generalized beta-generated (GBG) family in (1.5).
Family of skewed distributions defined in Ferreira and Steel (2006):
The family of skewed distributions in (1.1) defined by Ferreira and Steel (2006) can be represented in the form of the T-X{Y} system by considering the quantile function of a standard uniform distribution, Q_Y(λ) = λ, where F is the CDF of a symmetric PDF f and r is a skewed PDF having support [0, 1].
T-X family defined in Alzaatreh et al. (2013b):
Alzaatreh et al. (2013b) studied the T-X(W) family using W(F(x)) = −log(1 − F(x)). By using the quantile function approach, let λ = F(x), then W(λ) = −log(1 − λ) is the quantile function of the standard exponential distribution. Hence, the T-X family studied by Alzaatreh et al. (2013b) is the T-X{exponential}. According to the authors, the T-X{exponential} is a family of distributions arising from the hazard function of X_f. If h_f is the hazard function and H_f is the cumulative hazard function of the random variable X_f, and the exponential distribution has mean b, the PDF of the T-X{exponential} family is g(x) = b h_f(x) r{b H_f(x)}. In a similar way, the T-X{Weibull} and T-X{Rayleigh} families can be considered as families of distributions arising from hazard functions of X_f. The PDF of the T-X{Weibull} family is g(x) = (γ/c) h_f(x) r{γ(H_f(x))^{1/c}}/(H_f(x))^{(c − 1)/c} and the PDF of T-X{Rayleigh} is g(x) = b h_f(x) r{(2b^{2}H_f(x))^{1/2}}/(2H_f(x))^{1/2}.
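The hazard-function form of the T-X{Weibull} density can be checked against the general quantile-based construction g(x) = f(x) r{Q_Y(F(x))} / p{Q_Y(F(x))}. The sketch below is only a numerical sanity check under assumed ingredients: X_f is taken to be standard log-logistic (chosen because its hazard and cumulative hazard are simple), T is gamma with shape 2, and the Weibull parameters are arbitrary.

```python
import math

# X_f: standard log-logistic, so F, f, h_f and H_f all have closed forms
F = lambda x: x / (1.0 + x)
f = lambda x: 1.0 / (1.0 + x) ** 2
h = lambda x: 1.0 / (1.0 + x)       # hazard f / (1 - F)
H = lambda x: math.log1p(x)         # cumulative hazard -log(1 - F)

# T: gamma with shape 2 and scale 1
r = lambda t: t * math.exp(-t)

# Y: Weibull with shape c and scale gam (illustrative values)
c, gam = 1.7, 0.8
p = lambda y: (c / gam) * (y / gam) ** (c - 1) * math.exp(-((y / gam) ** c))
Q = lambda lam: gam * (-math.log1p(-lam)) ** (1.0 / c)

def g_general(x):
    """Density built from the quantile function of Y."""
    t = Q(F(x))
    return f(x) * r(t) / p(t)

def g_hazard(x):
    """Density written with the hazard and cumulative hazard of X_f."""
    return (gam / c) * h(x) * r(gam * H(x) ** (1.0 / c)) / H(x) ** ((c - 1.0) / c)

points = (0.1, 0.5, 1.0, 3.0, 10.0)
max_rel_gap = max(abs(g_general(x) - g_hazard(x)) / g_hazard(x) for x in points)
```

The two expressions agree to floating-point accuracy at every test point, which is what one expects since Q_Y(F(x)) = γ H_f(x)^{1/c} for a Weibull Y.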
Generalized beta distribution introduced by McDonald and Xu (1995):
Setting α = β = 1 in the PDF of the T-X{log-logistic} family in (3.1) and taking r(t) = {B(p, q)}^{−1}t^{p − 1}(1 + t)^{−p − q}, t > 0, the PDF of the inverted beta random variable with parameters p and q, yields
$$g(x) = \frac{f(x)}{B(p, q)}\,F^{p - 1}(x)\,\{1 - F(x)\}^{q - 1}. \tag{3.5}$$
Note that (3.5) is the beta-generated family. By taking F(x) = (x/b)^{a}/(1 + c(x/b)^{a}), which is the CDF of a truncated log-logistic distribution with 0 < x^{a} < b^{a}/(1 − c), 0 ≤ c < 1, a and b positive in (3.5), the generalized beta distribution introduced by McDonald and Xu (1995) is obtained as
$$g(x) = \frac{a\,x^{ap - 1}\,\{1 - (1 - c)(x/b)^{a}\}^{q - 1}}{b^{ap}\,B(p, q)\,\{1 + c(x/b)^{a}\}^{p + q}}, \quad 0 < x^{a} < \frac{b^{a}}{1 - c},$$
with 0 ≤ c < 1, and a, b, p and q positive.
Taking F(x) in (3.5) to be F(x) = e^{(x − δ)/σ}/(1 + ce^{(x − δ)/σ}), which is the CDF of a truncated logistic distribution with −∞ < (x − δ)/σ < ln(1/(1 − c)), 0 ≤ c < 1 and σ > 0, yields
$$g(x) = \frac{e^{p(x - \delta)/\sigma}\,\{1 - (1 - c)e^{(x - \delta)/\sigma}\}^{q - 1}}{\sigma\,B(p, q)\,\{1 + c\,e^{(x - \delta)/\sigma}\}^{p + q}}. \tag{3.6}$$
The PDF in (3.6) is the exponential generalized beta distribution in McDonald and Xu (1995).
3.4 Three examples of new distributions derived from T-X{Y} family
Table 1 contains many T-X families based on different quantile functions. Three new distributions, the normal-Weibull{Cauchy}, normal-Weibull{logistic} and Weibull-uniform{log-logistic} distributions, are introduced, and some properties of these distributions are studied.
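Normal-Weibull{Cauchy} distribution:
Setting a = 0 and b = 1 in the T-X{Cauchy} family in Table 1, and letting r be N(μ, σ^{2}), the normal-X{Cauchy} subfamily is given by
$$g(x) = \frac{\pi f(x)}{\sigma}\,\sec^{2}\{\pi(F(x) - 0.5)\}\;\phi\!\left(\frac{\tan\{\pi(F(x) - 0.5)\} - \mu}{\sigma}\right), \tag{3.7}$$
where ϕ is the standard normal PDF. Substituting F(x) = 1 − exp{−(x/γ)^{c}}, the CDF of the Weibull distribution, in (3.7) yields
$$g(x) = \frac{\pi c}{\sigma\gamma}\left(\frac{x}{\gamma}\right)^{c - 1} e^{-(x/\gamma)^{c}}\,\sec^{2}\{\pi(0.5 - e^{-(x/\gamma)^{c}})\}\;\phi\!\left(\frac{\tan\{\pi(0.5 - e^{-(x/\gamma)^{c}})\} - \mu}{\sigma}\right), \tag{3.8}$$
for x > 0, and σ, c, γ > 0. The random variable with the PDF in (3.8) is said to follow a four-parameter normal-Weibull{Cauchy} (NW{C}) distribution. A location parameter δ can be included in (3.8) by writing x as (x − δ), leading to a five-parameter distribution.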
Plots of the NW{C} density function for different parameter values are given in Figure 1. The graphs in Figure 1 show that the NW{C} distribution can be right skewed, left skewed, unimodal or bimodal.
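Lemma 2: The nth moment of the NW{C} random variable with PDF in (3.8) exists for any μ, σ > 0, c > 0, γ > 0 and satisfies the inequality
$$E(X^n) \le \gamma^{n}\,\Gamma(1 + n/c)\left\{4\Phi(1) + \frac{3\pi}{2}\left[\mu\{1 - \Phi(1)\} + \frac{\sigma}{\sqrt{2\pi}}\,e^{-(1 - \mu)^{2}/(2\sigma^{2})}\right]\right\}, \tag{3.9}$$
where Φ is the CDF of a normal distribution with parameters μ and σ.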
Proof: The nth moment for the Weibull random variable is $E(X_f^n) = \gamma^{n}\,\Gamma(1 + n/c)$. The CDF of the standard Cauchy distribution is P(y) = 1/2 + (1/π)tan^{−1}(y), so 1 − P(T) = 1/2 − (1/π)tan^{−1}(T), where −∞ < T < ∞. When T ≤ 1, 1 − P(T) ≥ 1/4 and hence (1 − P(T))^{−1} ≤ 4. When T > 1 and by using the series $\tan^{-1}(T) = \frac{\pi}{2} + \sum_{n=1}^{\infty} \frac{(-1)^{n}}{(2n - 1)\,T^{2n - 1}}$ (Polyanin and Manzhirov 2008),
$$1 - P(T) = \frac{1}{\pi}\left(\frac{1}{T} - \frac{1}{3T^{3}} + \frac{1}{5T^{5}} - \cdots\right) > \frac{1}{\pi}\left(\frac{1}{T} - \frac{1}{3T^{3}}\right) > \frac{2}{3\pi T}.$$
Hence, (1 − P(T))^{−1} < (3π/2)T. Since the random variable T with PDF r has a normal distribution with parameters μ and σ, then by using Theorem 1
$$E(X_g^n) \le \gamma^{n}\,\Gamma(1 + n/c)\,E\{(1 - P(T))^{-1}\} \le \gamma^{n}\,\Gamma(1 + n/c)\left\{4\Phi(1) + \frac{3\pi}{2}\int_{1}^{\infty} t\,r(t)\,dt\right\},$$
where $\int_{1}^{\infty} t\,r(t)\,dt = \mu\{1 - \Phi(1)\} + \frac{\sigma}{\sqrt{2\pi}}\,e^{-(1 - \mu)^{2}/(2\sigma^{2})}$, and the result in (3.9) follows. □
Normal-Weibull{logistic} distribution:
Setting a = 0 and b = 1 in the PDF of the T-X{logistic} family in Table 1, and taking r to be the PDF of the N(μ, σ^{2}) distribution and F to be the CDF of the Weibull distribution, F(x) = 1 − exp{−(x/γ)^{c}}, the normal-Weibull{logistic} (NW{L}) distribution is obtained as
$$g(x) = \frac{c}{\sigma\gamma}\left(\frac{x}{\gamma}\right)^{c - 1}\frac{1}{1 - e^{-(x/\gamma)^{c}}}\;\phi\!\left(\frac{\log\{e^{(x/\gamma)^{c}} - 1\} - \mu}{\sigma}\right), \quad x > 0,$$
where ϕ is the standard normal PDF.
Plots of the NW{L} density function for different parameter values are given in Figure 2. The graphs in Figure 2 show that the NW{L} distribution can be reversed J-shaped, skewed to the right, skewed to the left or bimodal.
Lemma 3: The nth moment of the NW{L} random variable exists for any σ > 0, c > 0, γ > 0 and satisfies the inequality
$$E(X^n) \le \gamma^{n}\,\Gamma(1 + n/c)\,\{1 + \exp(\mu + \sigma^{2}/2)\}. \tag{3.10}$$
Proof: The nth moment for the Weibull random variable is $E(X_f^n) = \gamma^{n}\,\Gamma(1 + n/c)$. The CDF of the standard logistic random variable is P(y) = exp(y)/{1 + exp(y)}. Since the random variable T has a normal distribution with parameters μ and σ, then E({1 − P(T)}^{−1}) = E(1 + exp(T)) = 1 + exp(μ + 0.5σ^{2}). By using Theorem 1, the result in (3.10) follows. □
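The bound obtained by combining Theorem 1 with the two expectations in this proof, namely γ^n Γ(1 + n/c){1 + exp(μ + σ²/2)}, can be compared with simulated moments. The sketch below is a Monte Carlo illustration only: it assumes the representation X_g = F^{−1}(P(T)) from Section 2, which for a Weibull F and standard logistic P simplifies to γ{log(1 + e^T)}^{1/c}; the function names, seed and parameter values are hypothetical.

```python
import math
import random

def sample_nw_logistic(n, mu, sigma, c, gamma, seed=0):
    """Draw X_g = F^{-1}(P(T)) with T ~ N(mu, sigma^2), P standard logistic, F Weibull."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        t = rng.gauss(mu, sigma)
        # -log(1 - P(t)) equals log(1 + exp(t)) for the standard logistic CDF P
        out.append(gamma * math.log1p(math.exp(t)) ** (1.0 / c))
    return out

def moment_bound(order, mu, sigma, c, gamma):
    """gamma^n * Gamma(1 + n/c) * (1 + exp(mu + sigma^2 / 2))."""
    return gamma ** order * math.gamma(1.0 + order / c) * (1.0 + math.exp(mu + 0.5 * sigma ** 2))

mu, sigma, c, gamma = 0.0, 1.0, 2.0, 1.0
xs = sample_nw_logistic(50000, mu, sigma, c, gamma)
empirical = {k: sum(x ** k for x in xs) / len(xs) for k in (1, 2, 3)}
bounds = {k: moment_bound(k, mu, sigma, c, gamma) for k in (1, 2, 3)}
```

For these parameter values the simulated moments sit well below the bounds, which is consistent with the inequality being an existence result rather than a sharp estimate.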
Weibull-uniform{log-logistic} distribution:
Setting α = β = 1 in the PDF of the T-X{log-logistic} family in (3.1), and taking r to be the PDF of the Weibull distribution, r(t) = (c/γ)(t/γ)^{c − 1} exp{−(t/γ)^{c}}, and F to be the CDF of the uniform distribution, F(x) = (x − a)/(b − a), the Weibull-uniform{log-logistic} (WU{LL}) distribution is obtained as
$$g(x) = \frac{c\,(b - a)}{\gamma\,(b - x)^{2}}\left(\frac{x - a}{\gamma(b - x)}\right)^{c - 1}\exp\!\left\{-\left(\frac{x - a}{\gamma(b - x)}\right)^{c}\right\}, \quad a < x < b.$$
Plots of the WU{LL} density function for different parameter values are given in Figure 3. The graphs in Figure 3 show that the WU{LL} distribution can be reversed J-shaped, skewed to the right, skewed to the left or bimodal.
Lemma 4: The nth moment of the WU{LL} random variable exists for any b > a, c > 0, γ > 0 and satisfies the inequality
$$E(X^n) \le \frac{b^{n + 1} - a^{n + 1}}{(n + 1)(b - a)}\,\{1 + \gamma\,\Gamma(1 + 1/c)\}. \tag{3.11}$$
Proof: The nth moment for the uniform random variable is $E(X_f^n) = \frac{b^{n + 1} - a^{n + 1}}{(n + 1)(b - a)}$. The CDF of the standard log-logistic random variable is P(y) = y/(1 + y). Since the random variable T has a Weibull distribution with parameters c and γ, then E({1 − P(T)}^{−1}) = E(1 + T) = 1 + γ Γ(1 + 1/c). By using Theorem 1, the result in (3.11) follows. □
4. The family of T-X{Y} distributions based on survival functions
Instead of using the CDF F in (2.2), one can use the survival function $\overline{F}$ and apply a similar method to generate a new family of distributions in terms of survival functions.
If $\overline{P}$ and Q_Y are the survival and quantile functions of the random variable Y, then $\overline{P}^{-1}(F(x)) = Q_Y(1 - F(x)) = Q_Y(\overline{F}(x))$. A new family T-X{Y} of distributions in terms of the survival function of X is defined as
$$\overline{G}(x) = \int_{a}^{Q_Y(\overline{F}(x))} r(t)\,dt = R\{Q_Y(\overline{F}(x))\}. \tag{4.1}$$
The corresponding PDF associated with (4.1) is
$$g(x) = \frac{f(x)}{p\{Q_Y(\overline{F}(x))\}}\; r\{Q_Y(\overline{F}(x))\}. \tag{4.2}$$
The family of life distributions introduced by Marshall and Olkin (1997) can be derived using (4.1) as follows. By using Q_Y(λ) = a(λ/(1 − λ))^{1/b}, the quantile function of the log-logistic distribution, and R(t) = ηt/(1 + ηt), the CDF of the log-logistic distribution with scale parameter 1/η, (4.1) can be written as
$$\overline{G}(x) = \frac{\eta a\,\overline{F}^{1/b}(x)}{F^{1/b}(x) + \eta a\,\overline{F}^{1/b}(x)}. \tag{4.3}$$
Letting ηa = α and 1/b = β, (4.3) becomes $\overline{G}(x) = \frac{\alpha\,\overline{F}^{\beta}(x)}{F^{\beta}(x) + \alpha\,\overline{F}^{\beta}(x)}$, which reduces to Marshall-Olkin’s family in (1.2) when β = 1.
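This reduction can be verified numerically. The sketch below assumes that the survival-function construction reads Ḡ(x) = R{Q_Y(F̄(x))}, which is the form consistent with the expression stated above, and that the Marshall-Olkin survival function is αF̄/(1 − (1 − α)F̄); the function names and parameter values are illustrative.

```python
def marshall_olkin_survival(sf, alpha):
    """Marshall-Olkin survival function evaluated at a baseline survival value sf."""
    return alpha * sf / (1.0 - (1.0 - alpha) * sf)

def tx_survival(sf, a, b, eta):
    """R(Q_Y(sf)) with Q_Y the log-logistic quantile and R(t) = eta t / (1 + eta t)."""
    t = a * (sf / (1.0 - sf)) ** (1.0 / b)   # Q_Y applied to the baseline survival value
    return eta * t / (1.0 + eta * t)         # R(t)

a, eta = 2.0, 0.75                 # so that alpha = eta * a = 1.5
grid = [i / 10 for i in range(1, 10)]
max_gap = max(
    abs(tx_survival(s, a, 1.0, eta) - marshall_olkin_survival(s, eta * a))
    for s in grid
)
```

Working on a grid of baseline survival values rather than on x keeps the check independent of the choice of F, since both expressions depend on x only through F̄(x).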
The following theorem gives the relation between the moments of the random variables defined in (2.4) and (4.2) when the PDF f is symmetric.
Theorem 3: Let $E_1(X_g^n)$ and $E_2(X_g^n)$ denote the nth moments of the random variables in (2.4) and (4.2) respectively. If f is symmetric, then
$$E_2(X_g^n) = \sum_{k=0}^{n} (-1)^{k} \binom{n}{k} (2m_f)^{n - k}\, E_1(X_g^k),$$
where m_f is the median of the random variable X_f with PDF f.
Proof: The nth moments of (2.4) and (4.2) are $E_1(X_g^n) = \int_{0}^{1} \frac{\{F^{-1}(v)\}^{n}}{p\{Q_Y(v)\}}\, r\{Q_Y(v)\}\,dv$ and $E_2(X_g^n) = \int_{0}^{1} \frac{\{F^{-1}(1 - u)\}^{n}}{p\{Q_Y(u)\}}\, r\{Q_Y(u)\}\,du$, where the substitutions v = F(x) and u = 1 − F(x) are applied to (2.4) and (4.2) respectively. When f is symmetric, F^{−1}(1 − u) = 2m_f − F^{−1}(u), u ∈ [0, 1]. By using the binomial theorem,
$$E_2(X_g^n) = \int_{0}^{1} \frac{\{2m_f - F^{-1}(u)\}^{n}}{p\{Q_Y(u)\}}\, r\{Q_Y(u)\}\,du = \sum_{k=0}^{n} (-1)^{k} \binom{n}{k} (2m_f)^{n - k}\, E_1(X_g^k). \; \square$$
Theorem 4: Let X_f be a non-negative random variable with PDF f, and let $E(X_f^n)$ denote the nth moment of X_f. Then
$$E(X_g^n) \le E(X_f^n)\, E\{(P(T))^{-1}\},$$
where $E(X_g^n)$ is the nth moment of the random variable with density in (4.2), and T is the random variable with PDF r.
Proof: The proof is similar to the proof of Theorem 1 after noting that the relation between the random variables X_g in (4.2) and T is given by $X_g = F^{-1}(\overline{P}(T))$. □
Similar to Theorem 2, Shannon’s entropy $\eta_{X_g}$ for the random variable X_g with PDF in (4.2) is given in the following theorem.
Theorem 5: Let X_g be a random variable with density in (4.2). Shannon’s entropy $\eta_{X_g}$ is given by $\eta_{X_g} = E\{\log q_{X_f}(\overline{P}(T))\} + E\{\log p(T)\} + \eta_T$.
Proof: The proof is similar to that of Theorem 2. □
5. Application
In this section, we apply the NW{C} distribution to fit two data sets. The first data set is the famous Old Faithful Geyser eruption data (n = 272) obtained from Härdle (1991, p. 201). The data are the eruption durations (in minutes) recorded from August 1 to August 15, 1985 (Dekking et al. 2005). The second data set is the USS Halfbeak diesel engine data (n = 71) studied by Ascher and Feingold (1984, p. 75) and Meeker and Escobar (1998, p. 415). The data are the times of unscheduled maintenance actions for the USS Halfbeak number 4 main propulsion diesel engine over 25.518 operating hours (Meeker and Escobar 1998, p. 415).
5.1 The famous Old Faithful Geyser eruption data
As shown in Figure 4, the data has two distinct modes. A common approach for fitting such bimodal data is to use mixture distributions. Arellano-Valle et al. (2010) applied the flexible epsilon-skew-normal distribution to fit the data and their fit is the same as that of the mixture-normal distribution. Four distributions, a four-parameter NW{C} in (3.8), a five-parameter NW{C}, mixture normal, and beta-normal are applied to fit the data using the maximum likelihood technique. Table 2 contains the estimates, standard errors of the estimates, log-likelihood values, AIC, K-S test statistics and the corresponding p-values.
The results in Table 2 indicate that the five-parameter NW{C} provides the best fit followed by mixture normal based on all three measures, log-likelihood, AIC and K-S statistic. When bimodality is a population characteristic, it may be more appropriate to fit the data with one distribution, instead of fitting the data by the mixture of two distributions. The NW{C} distribution can fit well a wide variety of distribution shapes, including bimodal data such as the Old Faithful Geyser eruption data. Figure 4 displays the estimated PDF of the distributions that provide adequate fit to the data.
From Figure 4, the addition of the fifth parameter, which is a measure of location, has some effect on the fit. By using either a likelihood ratio test or the Wald test for the significance of the parameter δ, we observe that the parameter is significantly different from zero. Thus, a five-parameter NW{C} (and not a four-parameter NW{C}) should be used to fit the bimodal data. According to Johnson et al. (1994, p. 12), four-parameter distributions should be sufficient for most practical purposes. The authors went on to state and we quote “… but it is doubtful whether the improvement obtained by including a fifth or sixth parameter is commensurate with the extra labor involved”. For this application, adding the fifth parameter to the NW{C} improves the fit with an increase of more than 22 points in the log-likelihood value. Furthermore, the fifth parameter δ is significantly different from zero.
5.2 USS Halfbeak diesel engine data
Marciano et al. (2012) used these data to illustrate the application of the McDonald-gamma distribution (McΓD). The distribution of the data is highly skewed to the left and platykurtic (skewness = −1.576 and kurtosis = 1.653). The MLEs of the parameters of the NW{C} distribution (with corresponding standard errors in parentheses), together with the AIC, the log-likelihood value, the KS statistic and the corresponding p-value, are given in Table 3. The values of AIC, log-likelihood and KS statistic for the McΓD and the Kumaraswamy-gamma distribution (KwΓD) are taken from Marciano et al. (2012). The other results in Table 3 are obtained by using the NLMIXED procedure in SAS and the MATLAB software.
The NW{C} distribution has the smallest AIC and KS statistics and the largest log-likelihood value, which indicates that NW{C} is superior to the other distributions in Table 3. Figure 5 displays the estimated PDFs of the NW{C}, McΓ and KwΓ distributions. The figure shows that the NW{C} distribution provides a better fit to the data than the other distributions.
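For readers who wish to reproduce a KS comparison of this kind, the one-sample statistic can be computed directly from a fitted CDF. The sketch below is an illustration only: `ks_statistic` is a hypothetical helper, and the sample and the uniform CDF are toy inputs rather than the Halfbeak data or a fitted NW{C} model.

```python
def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D_n = sup_x |F_n(x) - F(x)|.

    `sample` is an iterable of observations; `cdf` is the hypothesised
    (for example, fitted) cumulative distribution function.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x, so the largest
        # gap near x is attained just before or exactly at the jump.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (toy reference model)."""
    return min(max(x, 0.0), 1.0)


toy_sample = [0.1, 0.4, 0.7]
d_n = ks_statistic(toy_sample, uniform_cdf)
print(round(d_n, 4))
```

A smaller value of the statistic means that the fitted CDF stays closer to the empirical CDF, which is the sense in which the KS column of Table 3 ranks the candidate distributions.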
6. Conclusions
This paper presents a method to generate the T-X(W) families of distributions introduced in Alzaatreh et al. (2013b) by defining the W function as the quantile function of another random variable Y. Table 1 contains some T-X{Y} families based on different quantile functions. The T-X{Y} framework provides an easy way of generating distributions of the T-X(W) family. Existing methods for generating univariate continuous distributions, such as the methods of combination reviewed in Lee et al. (2013), can be derived using the T-X{Y} framework. T-X{exponential}, T-X{Weibull} and T-X{Rayleigh} can be viewed as families of distributions arising from hazard functions. The T-X{Y} family is extended by using the survival function of X, and the family of life distributions derived by Marshall and Olkin (1997) can be obtained from the T-X{Y} family based on survival functions. Three new distributions in the family, the normal-Weibull{Cauchy}, normal-Weibull{logistic} and Weibull-uniform{log-logistic} distributions, are defined. These distributions are very flexible and are capable of fitting various types of data. The Old Faithful Geyser eruption data are used to illustrate that the NW{C} distribution fits bimodal data very well, whereas such data can typically be fitted adequately only by mixture distributions.
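The construction summarized above, a CDF of the form R{Q_Y(F(x))} with Q_Y the quantile function of Y, can be sketched in a few lines. The code below is an illustration under stated assumptions and not the authors' implementation: it assumes a normal T with mean `mu` and standard deviation `sigma`, a standard Cauchy Y, and a Weibull X with CDF 1 − exp(−(x/scale)^shape); the parameterization used in the paper's equation (3.8) may differ. All function names are hypothetical.

```python
import math


def normal_cdf(t, mu=0.0, sigma=1.0):
    """R: CDF of a normal random variable T."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))


def cauchy_quantile(p):
    """Q_Y: quantile function of the standard Cauchy distribution."""
    return math.tan(math.pi * (p - 0.5))


def weibull_cdf(x, shape, scale):
    """F: CDF of a Weibull random variable X (assumed parameterization)."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def tx_cdf(x, r_cdf, q_y, f_cdf):
    """CDF of a T-X{Y} distribution, R(Q_Y(F(x))).

    The end points are handled separately because Q_Y maps 0 and 1 to the
    ends of the support of T, where R equals 0 and 1 respectively.
    """
    p = f_cdf(x)
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return r_cdf(q_y(p))


def nwc_cdf(x, shape=2.0, scale=1.0, mu=0.0, sigma=1.0):
    """Normal-Weibull{Cauchy} CDF under the assumptions stated above."""
    return tx_cdf(
        x,
        lambda t: normal_cdf(t, mu, sigma),
        cauchy_quantile,
        lambda v: weibull_cdf(v, shape, scale),
    )


# At the Weibull median, F(x) = 0.5, the Cauchy quantile is 0, and with
# mu = 0 the resulting CDF value is therefore 0.5.
x_median = math.sqrt(math.log(2.0))
print(round(nwc_cdf(x_median), 6))
```

Swapping `cauchy_quantile` for the quantile function of another distribution with the same support as T, such as the logistic, gives a different member of the family from the same three building blocks.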
References
Akinsete A, Famoye F, Lee C: The beta-Pareto distribution. Statistics 2008, 42: 547–563. 10.1080/02331880801983876
Alexander C, Cordeiro GM, Ortega EMM, Sarabia JM: Generalized beta-generated distributions. Computational Statistics and Data Analysis 2012, 56(6):1880–1896. 10.1016/j.csda.2011.11.015
Alshawarbeh A, Lee C, Famoye F: The beta-Cauchy distribution. Journal of Probability and Statistical Science 2012, 10: 41–58.
Alzaatreh A, Famoye F, Lee C: Gamma-Pareto distribution and its applications. Journal of Modern Applied Statistical Methods 2012a, 11(1):78–94.
Alzaatreh A, Lee C, Famoye F: On the discrete analogues of continuous distributions. Statistical Methodology 2012b, 9: 589–603. 10.1016/j.stamet.2012.03.003
Alzaatreh A, Famoye F, Lee C: Weibull-Pareto distribution and its applications. Communications in Statistics-Theory and Methods 2013a, 42: 1673–1691. 10.1080/03610926.2011.599002
Alzaatreh A, Lee C, Famoye F: A new method for generating families of continuous distributions. Metron 2013b, 71(1):63–79. 10.1007/s40300-013-0007-y
Arellano-Valle RB, Cortés MA, Gómez HW: An extension of the epsilon-skew-normal distribution. Communications in Statistics-Theory and Methods 2010, 39(5):912–922. 10.1080/03610920902807903
Ascher H, Feingold H: Repairable Systems Reliability. Marcel Dekker, New York; 1984.
Azzalini A: A class of distributions which includes the normal ones. Scand J Stat 1985, 12: 171–178.
Cordeiro GM, de Castro M: A new family of generalized distributions. J Stat Comput Simul 2011, 81(7):883–898. 10.1080/00949650903530745
Cordeiro GM, Lemonte AJ: The β-Birnbaum-Saunders distribution: an improved distribution for fatigue life modeling. Computational Statistics and Data Analysis 2011, 55(3):1445–1461. 10.1016/j.csda.2010.10.007
Cordeiro GM, Ortega EMM, Nadarajah S: The Kumaraswamy Weibull distribution with application to failure data. J Franklin Inst 2010, 347: 1399–1429. 10.1016/j.jfranklin.2010.06.010
Cordeiro GM, Nadarajah S, Ortega EMM: The Kumaraswamy Gumbel distribution. Statistical Methods and Applications 2012, 21(2):139–168.
Dekking FM, Kraaikamp C, Lopuhaä HP, Meester LE: A Modern Introduction to Probability and Statistics. Springer, New York; 2005.
Eugene N, Lee C, Famoye F: The beta-normal distribution and its applications. Communications in Statistics-Theory and Methods 2002, 31(4):497–512. 10.1081/STA-120003130
Famoye F, Lee C, Eugene N: Beta-normal distribution: bimodality properties and applications. Journal of Modern Applied Statistical Methods 2004, 3(1):85–103.
Ferreira JTAS, Steel MFJ: A constructive representation of univariate skewed distributions. J Am Stat Assoc 2006, 101: 823–829. 10.1198/016214505000001212
Härdle W: Smoothing Techniques with Implementation in S. Springer, New York; 1991.
Johnson NL: Systems of frequency curves generated by methods of translation. Biometrika 1949, 36: 149–176. 10.1093/biomet/36.1-2.149
Johnson NL, Kotz S, Balakrishnan N: Continuous Univariate Distributions, Vol. 1. 2nd edition. John Wiley and Sons, Inc., New York; 1994.
Jones MC: Families of distributions arising from distributions of order statistics. Test 2004, 13: 1–43. 10.1007/BF02602999
Jones MC: Kumaraswamy’s distribution: a beta-type distribution with tractability advantages. Statistical Methodology 2009, 6: 70–81. 10.1016/j.stamet.2008.04.001
Kotz S, Vicari D: Survey of developments in the theory of continuous skewed distributions. Metron 2005, LXIII: 225–261.
Kumaraswamy P: A generalized probability density function for double-bounded random processes. J Hydrol 1980, 46: 79–88. 10.1016/0022-1694(80)90036-0
Lai CD: Constructions and applications of lifetime distributions. Appl Stoch Model Bus Ind 2013, 29: 127–140. 10.1002/asmb.948
Lee C, Famoye F, Alzaatreh A: Methods for generating families of univariate continuous distributions in the recent decades. WIREs Computational Statistics 2013, 5: 219–238. 10.1002/wics.1255
Marciano FWP, Nascimento ADC, Santos-Neto M, Cordeiro GM: The McΓ distribution and its statistical properties: an application to reliability data. International Journal of Statistics and Probability 2012, 1(1):53–71.
Marshall AW, Olkin I: A new method for adding a parameter to a family of distributions with applications to the exponential and Weibull families. Biometrika 1997, 84: 641–652. 10.1093/biomet/84.3.641
Marshall AW, Olkin I: Life Distributions. Springer, New York; 2010.
McDonald JB: Some generalized functions for the size distribution of income. Econometrica 1984, 52: 647–663. 10.2307/1913469
McDonald JB, Xu YJ: A generalization of the beta distribution with applications. J Econ 1995, 66: 133–152. 10.1016/0304-4076(94)01612-4
Meeker WQ, Escobar LA: Statistical methods for reliability data. John Wiley & Sons, New York; 1998.
Pearson K: Contributions to the mathematical theory of evolution. II. Skew variation in homogeneous material. Philos Trans R Soc Lond A 1895, 186: 343–414. 10.1098/rsta.1895.0010
Polyanin AD, Manzhirov A: Handbook of Integral Equations. 2nd edition. Chapman & Hall/CRC, New York; 2008.
Shannon CE: A mathematical theory of communication. Bell System Technical Journal 1948, 27: 379–423. 10.1002/j.1538-7305.1948.tb01338.x
Shorack GR, Wellner JA: Empirical Processes with Applications to Statistics. John Wiley & Sons, New York; 1986.
Tukey JW: The Practical Relationship Between the Common Transformations of Percentages of Counts and Amounts. Technical Report 36. Statistical Techniques Research Group, Princeton University, Princeton, NJ; 1960.
Acknowledgments
We are grateful for many constructive comments and suggestions from the associate editor and the two referees. These comments and suggestions have greatly improved the presentation of the paper.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
The authors, MAA, CL and FF, carried out this work in consultation with one another and drafted the manuscript together. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Aljarrah, M.A., Lee, C. & Famoye, F. On generating T-X family of distributions using quantile functions. J Stat Distrib App 1, 2 (2014). https://doi.org/10.1186/2195-5832-1-2
DOI: https://doi.org/10.1186/2195-5832-1-2