Research  Open  Published:
Generalized loglogistic proportional hazard model with applications in survival analysis
Journal of Statistical Distributions and Applicationsvolume 3, Article number: 16 (2016)
Abstract
Proportional hazard (PH) models can be formulated with or without assuming a probability distribution for survival times. The former assumption leads to parametric models, whereas the latter leads to the semiparametric Cox model which is by far the most popular in survival analysis. However, a parametric model may lead to more efficient estimates than the Cox model under certain conditions. Only a few parametric models are closed under the PH assumption, the most common of which is the Weibull that accommodates only monotone hazard functions. We propose a generalization of the loglogistic distribution that belongs to the PH family. It has properties similar to those of loglogistic, and approaches the Weibull in the limit. These features enable it to handle both monotone and nonmonotone hazard functions. Application to four data sets and a simulation study revealed that the model could potentially be very useful in adequately describing different types of timetoevent data.
Introduction
Proportional hazard (PH) models play a vital role in analyzing timetoevent data. A key assumption in the PH model is that the hazard ratio comparing any two specifications of covariates is constant over time (commonly known as PH assumption). Although the PH assumption may not hold for one or more covariates over the entire study period, it may hold in shorter time intervals. Therefore, violation of the PH assumption may be handled using timedependent covariates (Kleinbaum and Klein 2012). One of the appealing features of PH models is that the regression coefficients have relative risk interpretation, which is preferred by many clinicians.
The Cox PH model (Cox 1972) is the most popular in survival analysis mainly because of two reasons: (a) no assumption is required about the probability distribution of survival times (i.e., a semiparametric model), and (b) it usually fits the data well no matter which parametric model is appropriate. In contrast, distributional assumption is required for a fully parametric PH model (Kalbfleisch and Prentice 2002; Lawless 2002). This also leads to the added requirement of checking the appropriateness of the chosen distribution. Nevertheless, as demonstrated by Efron (1977) and Oakes (1977), parametric models lead to more efficient estimates than Cox’s model under certain conditions. More specifically, if the distributional assumption is valid, a parametric model leads to smaller standard errors of the estimates than would be in the absence of a distributional assumption (Collett 2003). Moreover, the use of Cox PH in joint modeling of timetoevent and longitudinal data (Wulfsohn and Tsiatis 1997) usually leads to an underestimation of the standard errors of the parameter estimates (Hsieh et al. 2006; Rizopoulos 2012), and therefore most methods for joint modeling are based on parametric response distributions (Hwang and Pennell 2014). Regarding the choice between a parametric and Cox’s PH model, Nardi and Schemper (2003) suggested to use a richer parametric model or simply the Cox’s model in case of an unsatisfactory fit of the chosen probability distribution.
The most commonly used parametric timetoevent models are the Weibull, loglogistic and lognormal distributions. The loglogistic and lognormal distributions belong to the accelerated failure time (AFT) family, and are useful in modeling nonmonotone hazard rates (Lawless 2002). Note that the loglogistic also accommodates decreasing hazard functions. Only a few parametric models are closed under PH assumption, the most common of which is the Weibull that accommodates only monotone hazard functions. In fact, Weibull is the only distribution that is closed under both AFT and PH families (Kalbfleisch and Prentice 2002). Mudholkar et al. (1996) proposed a generalization of the Weibull distribution which permits parametric PH regression modeling. It is a threeparameter distribution and is capable of modeling both monotone and nonmonotone hazard functions. One difficulty with this model is that it is nonregular (the support depends on some parameters) in the case of increasing hazard functions, and therefore the standard maximum likelihood asymptotics do not hold. In this paper, we propose a simple extension of the loglogistic model which is closed under the PH relationship. The proposed generalized loglogistic model is a threeparameter distribution, and has characteristics similar to those of the loglogistic model. Moreover, it approaches the Weibull in the limit. These features enable it to satisfactorily handle both monotone (increasing and decreasing) and nonmonotone (unimodal) hazard functions. In Section 1, we introduce the generalized loglogistic model and discuss estimation and testing of the parameters using the maximum likelihood method. The proposed method is then illustrated with applications to four data sets, one of which involves joint modeling of timetoevent and longitudinal data (Section Examples). In Section Simulations, a simulation study is presented to evaluate the performance of generalized loglogistic in comparison with other commonly used PH models to describe different types of timetoevent data. We conclude in Section Conclusion by summarizing our findings.
The generalized loglogistic model
The generalized loglogistic distribution for a nonnegative random variable T can be conveniently specified in terms of the hazard function as follows:
where ρ>0, κ>0 and γ>0 are parameters and α=(κ,γ,ρ)^{′}. If γ depends on ρ via γ=ρ and γ=ρ η ^{−1/κ} with η>0, then (1) reduces to the hazard function of the loglogistic (Lawless 2002) and Burr XII (Wang et al. 2008) distributions, respectively. Taking γ not dependent on ρ, it is easy to verify that (1) is closed under PH relationship (see below). The hazard function is monotone decreasing when κ≤1, and unimodal when κ>1 (i.e., h(t;α)=0 at t=0, increases to a maximum at t=[(κ−1)/γ ^{κ}]^{1/κ}, and then approaches zero monotonically as t→∞). Note that (1) approaches the Weibull hazard function as γ ^{κ}→0. This particular feature of the generalized loglogistic model enables it to handle monotone increasing hazard satisfactorily via κ>1 and γ small (close to zero).
The survivor function, probability density function and cumulative hazard function of the generalized loglogistic distribution are, respectively,
The median of the distribution is $\frac {\left (2^{\frac {\gamma ^{\kappa }}{\rho ^{\kappa }}}1\right)^{\frac {1}{\kappa }}}{\gamma }$, and the r ^{th} moment is
In particular, the mean is $E(T)=\frac {\rho ^{\kappa }}{\gamma ^{\kappa }} \frac {\Gamma \left (\frac {\rho ^{\kappa }}{\gamma ^{\kappa }}\frac {1}{\kappa }\right) \Gamma \left (\frac {1}{\kappa }+1\right)}{\Gamma \left (\frac {\rho ^{\kappa }}{\gamma ^{\kappa }}+1\right)}$ provided $\frac {\kappa \rho ^{\kappa }}{\gamma ^{\kappa }}>1$.
For the family of PH models with covariates z=(z _{1},z _{2},…,z _{ p })^{′}, the hazard function for T can be expressed as
where h _{0}(t;α) is the baseline hazard function (i.e., the hazard function when z=0) characterized by the vector of parameters α, and β=(β _{1},β _{2},…,β _{ p })^{′} is the vector of regression coefficients. A fully parametric PH model can be formulated by specifying h _{0}(t;α) parametrically. If h _{0}(t;α) is specified by the generalized loglogistic hazard function (1), then (5) takes the form
where $\phantom {\dot {i}\!}\rho ^{*}=e^{\mathbf {z}'\boldsymbol {\beta }/\kappa }$. Thus the generalized loglogistic is closed under proportionality of hazards. Another widely used parametric PH family is the Weibull, for which h _{0}(t;α)=κ ρ(ρ t)^{κ}. Note that the Cox PH model is semiparametric, for which the baseline hazard function in (5) is left arbitrary and is denoted by h _{0}(t).
Estimation
Suppose that a censored random sample consisting of data (t _{ i },δ _{ i },z _{ i }), i=1,2,…,n, is available, where t _{ i } is a lifetime or censoring time according to whether δ _{ i }=1 or 0, respectively, and z _{ i }=(z _{ i1},z _{ i2},…,z _{ ip })^{′} is the vector of covariates for the i ^{th} individual. Letting $m=\sum _{i=1}^{n} \delta _{i}$, a _{ i }= exp(z i′β) and b _{ i }=(γ t _{ i })^{κ}, the loglikelihood function for the generalized loglogistic PH can be written as
where θ=(α ^{′},β ^{′})^{′}. The first derivatives of the loglikelihood function are
where c _{ i }= logb _{ i }/(1+b _{ i }) and d _{ i }=b _{ i }/(1+b _{ i }) (see Appendix). To improve the convergence of iterative procedures for maximum likelihood estimation and the accuracy of largesample methods, we remove range restrictions on parameters through the parameterizations α ^{∗}=(κ ^{∗},γ ^{∗},ρ ^{∗})^{′}, where κ ^{∗}= logκ, γ ^{∗}= logγ and ρ ^{∗}= logρ. The maximum likelihood estimate of θ ^{∗}=(α ^{∗} ^{′},β ^{′})^{′} can then be obtained by solving the equations ∂ ℓ(θ ^{∗})/∂ κ ^{∗}=0, ∂ ℓ(θ ^{∗})/∂ γ ^{∗}=0, ∂ ℓ(θ ^{∗})/∂ ρ ^{∗}=0 and ∂ ℓ(θ ^{∗})/∂ β _{ j }=0 iteratively, where (see Appendix)
Many software packages have reliable optimization procedures to maximize loglikelihood functions. We wrote our computer code in R (R Core Team 2016), and used the function nlminb for optimization (see the Additional file 1).
Initial values
We may use Weibull, loglogistic and Cox PH fits to generate initial values in solving the equations ∂ ℓ(θ ^{∗})/∂ κ ^{∗}=0, ∂ ℓ(θ ^{∗})/∂ γ ^{∗}=0, ∂ ℓ(θ ^{∗})/∂ ρ ^{∗}=0 and ∂ ℓ(θ ^{∗})/∂ β _{ j }=0. Let $\hat {\kappa }_{1}$ and $\hat {\rho }_{1}$ be the maximum likelihood estimates of the Weibull shape and scale parameters, respectively, $\hat {\kappa }_{2}$ and $\hat {\rho }_{2}$ the maximum likelihood estimates of the loglogistic shape and scale parameters, respectively, and $\hat {\boldsymbol {\beta }^{*}}$ the estimates of the regression coefficients for the Cox PH model. Note that maximum likelihood methods for the Weibull, loglogistic and Cox PH models are available in many statistical softwares, including R (R Core Team 2016). We propose to use $\log {\hat {\kappa }_{1}}$, $\log {\hat {\kappa }_{1}\hat {\kappa }_{2}}$, $\log {\hat {\rho }_{1}}$ and $\hat {\boldsymbol {\beta }^{*}}$ as initial values for κ ^{∗}, γ ^{∗}, ρ ^{∗} and β, respectively. If convergence is not achieved with these initial values, we propose to replace $\log {\hat {\kappa }_{1}}$ and $\log {\hat {\rho }_{1}}$ by $\log {\hat {\kappa }_{2}}$ and $\log {\hat {\rho }_{2}}$, respectively. In fitting the generalized loglogistic model to many data sets, we have not experienced any difficulty in obtaining convergence with this technique.
Tests and confidence intervals
Tests and interval estimates for the model parameters are based on the approximate normality of the maximum likelihood estimators. The asymptotic distribution of $\boldsymbol {\hat {\theta }}^{*}$ is approximately a (p+3)variate normal distribution with mean θ ^{∗} and covariance matrix $\Sigma =I(\boldsymbol {\hat {\theta }}^{*})^{1}$, where
is the (p+3)×(p+3) observed information matrix (second derivatives of ℓ(θ ^{∗}) are given in Appendix: Derivatives of the loglikelihood function). By the multivariate delta method, the asymptotic distribution of $\boldsymbol {\hat {\theta }}$ is also approximately normal with mean θ and covariance matrix D Σ D ^{′}, where D is the (p+3)×(p+3) diagonal matrix $diag(\boldsymbol {\hat {\alpha }},1,1,\ldots,1)$ and $\boldsymbol {\hat {\alpha }}=\exp {(\boldsymbol {\hat {\alpha }}^{*})}$.
Generalized loglogistic distribution in joint modeling
Joint models are used to quantify association between an internal timedependent covariate and time until an event of interest occurs (Wulfsohn and Tsiatis 1997). It involves two separate models: a model that takes into account measurement error in the timedependent covariate to estimate its true values (longitudinal model), and another model that uses these estimated values to quantify the association between this covariate and the time to the occurrence of the event (timetoevent model). The idea behind the joint modeling technique is to couple the timetoevent model with the longitudinal model. The general framework of the maximum likelihood method and large sample theory can be found in Rizopoulos (2012). Maximization of the loglikelihood function for joint modeling is computationally challenging, as it involves evaluating multiple integrals that do not have an analytical solution, except in very special cases. The R package JM has been developed by Rizopoulos (2010) to fit joint models using Weibull baseline hazard, piecewiseconstant baseline hazard, spline approximation of the baseline hazard and unspecified baseline hazard functions. We have modified the source codes for Weibull to fit joint models using the generalized loglogistic baseline hazard function. The application of the generalized loglogistic distribution in joint modeling is illustrated with an example in Section 1.
Goodness of fit
The nonparametric estimates are useful for assessing the quality of fit of a particular parametric timetoevent model (Lawless 2002). For a model without covariate, we use the approach to simultaneously examine plots of parametric and nonparametric estimates of the survival function, superimposed on the same graph. Let $S(t;\hat {\boldsymbol {\theta }})$ and $\hat {S}(t)$ be the estimates of the survivor functions based on the parametric model of interest and the KaplanMeier method (Kaplan and Meier 1958), respectively. The estimates $S(t;\hat {\boldsymbol {\theta }})$ as a function of t should be close to $\hat {S}(t)$ if the parametric model is adequate. For a model with covariates, we consider residual diagnostic plots, where the residuals are defined based on the cumulative hazard function H(t;θ). If $\hat {S}(H(t;\boldsymbol {\hat {\theta }}))$ is the KaplanMeier estimate of $H(t;\boldsymbol {\hat {\theta }})$, then a plot of $\log \hat {S}(H(t;\boldsymbol {\hat {\theta }}))$ versus $H(t;\boldsymbol {\hat {\theta }})$ should be roughly a straight line with unit slope when the model is adequate (Lawless 2002).
We also use the Akaike’s information criterion (AIC) (Akaike 1974) to compare the fits of different models. The AIC is defined by
where p is the number of covariates and k is the number of parameters of the assumed probability distribution (k=3 for the generalized loglogistic model). In general, when comparing two or more models, we prefer the one with the lowest AIC value. A rule of thumb is that if Δ _{M}=AIC_{M}−AIC_{min}>2, then there is considerably less support for Model M compared to the model with minimum AIC (Burnham and Anderson 2002).
Examples
Three data sets are taken from the literature to demonstrate the ability of the generalized loglogistic distribution in modeling timetoevent data. The application of the generalized loglogistic PH in joint modeling is illustrated using another data set of AIDS patients. We first use the scaled TTT transform of failure times to detect the shape of the hazard function (Mudholkar et al. 1996). The scaled TTT transform is given by $\phi (v/n)=\left [\sum _{i=1}^{v}{T_{(i)}}+(nv)T_{(v)}\right ]/\left (\sum _{i=1}^{n}{T_{(i)}}\right)$, where T _{(i)} represent the order statistics of the sample, and v=1,2,…,n. The hazard function is increasing, decreasing and unimodal if the plot of (v/n,ϕ(v/n)) is concave, convex, and concave followed by convex, respectively. For the first three examples (Sections Example 1: Head and neck cancer dataExample 3: Vaginal cancer mortality in rats), we first fit the generalized loglogistic, Weibull and loglogistic models (without covariate) and check the appropriateness of the distributional assumption using diagnostic plots. Then, we analyze the data using regression models, and compare the fits via residual plots. Note that the regression model based on the loglogistic distribution is given by logT=β _{0}+β _{1} z _{1}+…+β _{ p } z _{ p }+τ W where τ=1/κ, β _{0}=− logρ and W has the logistic distribution with density f(w)=e ^{w}/(1+e ^{w})^{2}, – ∞<w<∞. This model has an accelerated life interpretation (Lawless 2002), whereas the generalized loglogistic and Weibull PH models have relative risk interpretation. In the fourth example (Section Example 4: AIDS data), we consider joint models based on the generalized loglogistic, Weibull and piecewiseconstant baseline hazard functions.
Example 1: Head and neck cancer data
Data description, hazard shape and distributional assumption
Efron (1988) described a randomized clinical trial to compare radiation therapy alone (arm A) versus radiation plus chemotherapy (arm B) in treating head and neck cancer patients. Survival times (in days) for 51 patients in arm A (9 observations were censored) and 45 in arm B (14 were censored) were reported. The TTT plot in Fig. 1(a) suggests a unimodal hazard shape of the survival times. Plots of $S(t;\hat {\boldsymbol {\theta }})$ and $\hat {S}(t)$ (Fig. 2(a–c)) indicate more support for the generalized loglogistic distribution in comparison with the Weibull and loglogistic distributions in describing the head and neck cancer data.
Regression analysis
Letting z _{ i }= I(treatment = radiation therapy) that equals 1 if the treatment involves radiation therapy alone and 0 otherwise, we fit the generalized loglogistic PH, Weibull PH and loglogistic AFT models to the head and neck cancer data (numerical results are summarized in Table 1). The standard error of $\hat {\beta }$ for the generalized loglogistic model is smaller that those for the Weibull and loglogistic models, and therefore the generalized loglogistic would be preferred on grounds of efficiency. We also see that the generalized loglogistic has the lowest AIC value, which is supported by the residual plots (Fig. 2(d–f)): residuals lying closely to the unitslope line for generalized loglogistic indicate its superiority over the Weibull and loglogistic models. In summary, the generalized loglogistic fits the data adequately and is the best among the three models under consideration.
Example 2: Autologous and allogeneic bone marrow transplants
Data description, hazard shape and distributional assumption
Klein and Moeschberger (2003) described a study involving a sample of 101 patients with advanced acute myelogenous leukemia. Fiftyone of these patients had received an autologous (auto) bone marrow transplant, whereas 50 an allogeneic (allo) transplant. Survival times (in months) for 28 auto transplant and 22 allo transplant patients were censored. Careful inspection of the TTT plot in Fig. 1(b) reveals an indication of the unimodality of the hazard function. A comparison of the diagnostic plots (without covariate) in Fig. 3(a–c) suggests that the assumption of generalized loglogistic is more appropriate than the assumption of Weibull or loglogistic in describing these data.
Regression analysis
For regression analysis, we consider the covariate z _{ i }= I(transplant = allo). The fits via the generalized loglogistic PH, Weibull PH and loglogistic AFT are summarized in Table 2. The generalized loglogistic has the lowest AIC value, suggesting it produced the bestfitting model. The residual plots (Fig. 3(d–f)) also support this fact. It is interesting to note here that both the Weibull and loglogistic suggest a decreasing hazard function (estimate of the shape parameter is less than 1), whereas the generalized loglogistic captures the unimodal shape of the hazard function ($\hat {\kappa }=e^{0.2148}=1.24>1$).
Example 3: Vaginal cancer mortality in rats
Data description, hazard shape and distributional assumption
Pike (1966) described a laboratory experiment involving the development of vaginal cancer in rats insulted with the carcinogen DMBA. There were 19 rats in group 1 and 21 in group 2. Seventeen rats in group 1 and 19 in group 2 had developed tumours at the time the data were collected (i.e., two observations in each group were censored). There were reasonable scientific grounds for believing that there might be a threshold value before which no tumour could be detected. For this reason, Lawless (2002) considered the values t ^{′}=t−100 to analyze these data. We also consider here the transformed version of the original observations. The TTT plot in Fig. 1(c) suggests an increasing hazard function for T ^{′}. Figure 4(a–c) shows diagnostic plots for the generalized loglogistic, Weibull and loglogistic fits (without covariate). We see that the generalized loglogistic and Weibull fits are similar, and provide slightly better description of the data compared to the loglogistic model.
Regression analysis
For regression analysis, we consider the covariate z _{ i }= I(group = group 1). Table 3 gives the estimates of the parameters and associated standard errors from the generalized loglogistic, Weibull and loglogistic fits. As demonstrated by the TTT plot, the Weibull PH suggests an increasing hazard rate ($\hat {\kappa }=e^{1.1308}=3.098$). Note that a small value of $\hat {\gamma }$ ($\hat {\gamma }=e^{5.0190}=0.007$) and $\hat {\kappa }=e^{1.2568}=3.514>1$ for generalized loglogistic also support this fact. Although the AIC values (Table 3) suggest no obvious preference of one model over the other, the residual plots (Fig. 4(d–f)) clearly indicate more support for the generalized loglogistic and Weibull models. This example demonstrates that the generalized loglogistic has the ability to satisfactorily fit data which exhibit increasing hazard rates.
Example 4: AIDS data
Data description and hazard shape
This example illustrates the use of the generalized loglogistic distribution in joint modeling. Rizopoulos (2012) described a study involving 467 human immunodeficiency virus (HIV) infected patients who had failed or were intolerant to zidovudine therapy (ZT). The main objective was to compare two antiretroviral drugs to prevent the progression of HIV infections: didanosine (ddI) and zalcitabine (ddC). Patients were randomly assigned to receive either ddI or ddC and followed until death or the end of the study, resulting in 188 complete and 279 censored observations. It was also of interest to quantify the association between CD4 cell counts (internal timedependent covariate) measured at t=0, 2, 6, 12 and 18 months, and time to death. The TTT plot in Fig. 1(d) indicates an increasing hazard shape.
Regression analysis
For regression analysis, Rizopoulos (2012) considered joint models of the form
where (12) is the timetoevent model with drug_{ i }=I(drug = ddI), sex_{ i }=I(sex = male) and ZT_{ i }=I(ZT = failure); and (13) is the longitudinal model with b _{0}, b _{1} and b _{2} being the fixedeffects parameters, b _{0i } and b _{1i } the randomeffects parameters, and ε _{ i }(t) the random error component. We have reanalyzed the data here using generalized loglogistic, Weibull and piecewiseconstant (six knots placed at equally spaced percentiles of the observed event times (Rizopoulos 2012)) baseline hazard functions in (12). Note that h _{0}(t;α)=κ t ^{κ−1}/[1+(γ t)^{κ}] and κ t ^{κ−1} for generalized loglogistic and Weibull, respectively, and so β _{0}=κ logρ for both these models. For piecewiseconstant baseline hazard, $h_{0}{(t;\boldsymbol {\alpha })}=\sum _{q=1}^{7}\xi _{q}\mathrm {I}(v_{q1}<t\leq v_{q})$ and β _{0}=0 in (12), where 0=v _{0}<v _{1}<…<v _{7} is the split of the time scale and ξ _{ q } is the value of the hazard in the interval (v _{ q−1},v _{ q }]. The estimates of the parameters and standard errors for the timetoevent process are presented in Table 4. We see that the estimates of the coefficients (i.e., $\hat {\beta }_{1}$, $\hat {\beta }_{2}$, $\hat {\beta }_{3}$ and $\hat {\beta }_{4}$) and their standard errors are broadly similar under the three competing models. The AIC values and residual plots (Fig. 5) also suggest no obvious preference of one model over the other. Although we see no obvious preference of the generalized loglogistic model for this example for which the hazard function is monotone increasing, generalized loglogistic could be useful in joint modeling where the shape of the hazard function is unimodal.
Simulations
Four covariates in a PH regression framework were considered in all simulations: two continuous covariates (z _{1} and z _{2}), each generated from the standard normal distribution; and two binary covariates (z _{3} and z _{4}), each generated from the Bernoulli(0.5) distribution. Regression parameter values were chosen to be β=(0.50,−0.50,0.75,−0.75)^{′} corresponding to the covariate vector z=(z _{1},z _{2},z _{3},z _{4})^{′}. To evaluate the performance of the generalized loglogistic model, we considered three simulation scenarios based on the shape of the hazard function. For each scenario (see below), lifetime data were generated from the generalized Weibull distribution with probability density function
where ρ>0, κ>0 and −∞<γ<∞ are distributional parameters and α=(κ,γ,ρ)^{′}; the support of the distribution is t>0 for γ≤0 and $0<t<\frac {1}{\rho \gamma ^{\kappa }}$ for γ>0. Note that the hazard function of the generalized Weibull distribution is (a) monotone increasing for κ≥1 and γ≥1, (b) monotone decreasing for 0<κ≤1 and γ≤1, and (c) unimodal for κ>1 and γ<0. The simulation scenarios are then specified as follows.

Scenario 1: Decreasing hazard. Lifetimes were generated from generalized Weibull with κ=0.5, γ=−0.1 and ρ=0.1, and censoring times were generated from the exponential distribution with rate parameter λ=0.045.

Scenario 2: Increasing hazard. Lifetimes were generated from generalized Weibull with κ=2, γ=0.1 and ρ=0.1, and censoring times were generated from the exponential distribution with rate parameter λ=0.060.

Scenario 3: Unimodal hazard. Lifetimes were generated from generalized Weibull with κ=2, γ=−0.1 and ρ=0.1, and censoring times were generated from the exponential distribution with rate parameter λ=0.060.
Our choice of the parameter values led to, on average, 39.99, 40.69 and 42.99% censored observations for Scenarios 13, respectively. Given the covariates and censoring indicator, we then fit the generalized loglogistic, Weibull and Cox PH models to the simulated lifetimes. Note that since the Cox model is robust (usually fits the data well no matter which parametric model is appropriate), we consider this in our simulation study to compare model performance. For each scenario, 500 data sets (each of size n=100) were generated, and the average of each of the estimated model parameters across these data sets was calculated. Absolute bias (AB) and mean square error (MSE) were then computed for model comparison (numerical results are summarized in Table 5).
Results for scenario 1. For the continuous covariates (z _{1} and z _{2}), all three models produced estimates with similar MSE, whereas for the binary covariates (z _{3} and z _{4}), the generalized loglogistic demonstrated the smallest MSE. In terms of bias, generalized loglogistic and Cox PH were roughly equivalent, and both were superior to Weibull.
Results for scenario 2. For the regression coefficients, the generalized loglogistic produced estimates with the smallest bias. We also see that the generalized loglogistic and Weibull produced estimates with similar MSE, and both were superior to the Cox PH model. Note that the generalized loglogistic estimates for κ and γ were 2.269 and 0.006, respectively (i.e., the estimate of γ ^{κ} is close to zero), supporting the fact that the hazard function is monotone increasing.
Results for scenario 3. In terms of bias, the generalized loglogistic and Cox PH produced comparable estimates of the regression coefficients. However, the generalized loglogistic produced the most accurate estimates in terms of MSE, mostly as a consequence of smaller standard deviations of the estimates. As expected, the Weibull produced the least accurate estimates in terms of both bias and MSE for Scenario 3 (i.e., unimodal hazard).
A simulation study with about 20% censored observations per data set also led to similar conclusions (data not shown). In summary, our simulation study has demonstrated that the generalized loglogistic could potentially be a very useful parametric model to adequately describe different types of timetoevent data.
Conclusion
In this paper, we proposed a simple extension of the loglogistic distribution to a PH model by appending an additional parameter. As described in Section 1, the proposed model naturally accommodates decreasing and unimodal hazard functions. The loglogistic distribution is known to be useful to describe unimodal hazard functions (Lawless 2002). As demonstrated in Examples 1 and 2, it turns out that the generalized loglogistic may provide better fits in describing unimodal hazard functions compared to the loglogistic distribution. Moreover, our simulation study revealed that the generalized loglogistic could produce more accurate results compared to the Weibull and Cox PH models in describing monotone decreasing and unimodal hazard functions. In summary, the flexibility provided by the generalized loglogistic model could be very useful in adequately describing different types of timetoevent data.
Appendix: Derivatives of the loglikelihood function
Let $m=\sum _{i=1}^{n} \delta _{i}$, a _{ i }= exp(z i′β), b _{ i }=(γ t _{ i })^{κ}, c _{ i }= logb _{ i }/(1+b _{ i }) and d _{ i }=b _{ i }/(1+b _{ i }). We have

∙
$$ {\log{(\gamma t_{i})}= \frac{\log{b_{i}}}{\kappa},} $$(15) 
∙
$$ {(\gamma t_{i})^{\kappa} \log{(\gamma t_{i})}= \frac{b_{i}\log{b_{i}}}{\kappa},} $$(16) 
∙
$$ {\frac{\partial b_{i}}{\partial \kappa}=\frac{\partial}{\partial \kappa} (\gamma t_{i})^{\kappa} = (\gamma t_{i})^{\kappa} \log{(\gamma t_{i})}= \frac{b_{i}\log{b_{i}}}{\kappa},} $$(17) 
∙
$$ {\frac{\partial \log{b_{i}}}{\partial \kappa}= \frac{\log{b_{i}}}{\kappa},} $$(18) 
∙
$$ {\frac{\partial \log{(1+b_{i})}}{\partial \kappa}= \frac{b_{i}\log{b_{i}}}{\kappa(1+b_{i})}=\frac{b_{i}c_{i}}{\kappa},} $$(19) 
∙
$$ {\frac{\partial b_{i}\log{b_{i}}}{\partial \kappa}=\frac{b_{i}\log{b_{i}}}{\kappa}\log{b_{i}}+b_{i}\frac{\log{b_{i}}}{\kappa}=\frac{b_{i}(\log{b_{i}})(1+\log{b_{i}})}{\kappa},} $$(20) 
∙
$$ {\frac{\partial c_{i}}{\partial \kappa}=\frac{\partial}{\partial \kappa} \frac{\log{b_{i}}}{1+b_{i}} =\frac{\log{b_{i}}}{\kappa(1+b_{i})}\left(1\frac{b_{i}\log{b_{i}}}{1+b_{i}}\right) =\frac{c_{i}(1b_{i}c_{i})}{\kappa},} $$(21) 
∙
$$ {\frac{\partial b_{i}c_{i}}{\partial \kappa}=\frac{b_{i}c_{i}(1b_{i}c_{i}+\log{b_{i}})}{\kappa} = \frac{b_{i}c_{i}(1+c_{i})}{\kappa},} $$(22) 
∙
$$ {\frac{\partial d_{i}}{\partial \kappa}=\frac{\partial}{\partial \kappa}\frac{b_{i}}{1+b_{i}}=\frac{b_{i}\log{b_{i}}}{\kappa(1+b_{i})}\left(1\frac{b_{i}}{1+b_{i}}\right)= \frac{c_{i}d_{i}}{\kappa},} $$(23) 
∙
$$ {\frac{\partial \log{(1d_{i})}}{\partial \kappa}= \frac{\partial}{\partial \kappa} \log{(1+b_{i})^{1}}=\frac{\partial}{\partial \kappa} \log{(1+b_{i})}=\frac{b_{i}c_{i}}{\kappa},} $$(24) 
∙
$$ {\frac{\partial b_{i}}{\partial \gamma}=\frac{\partial}{\partial \gamma} (\gamma t_{i})^{\kappa} = \kappa\gamma^{\kappa1}t_{i}^{\kappa}=\frac{\kappa}{\gamma}b_{i},} $$(25) 
∙
$$ {\frac{\partial d_{i}}{\partial \gamma}=\frac{\partial}{\partial \gamma} \frac{b_{i}}{1+b_{i}}=\frac{\kappa}{\gamma}\frac{b_{i}}{1+b_{i}}\left(1\frac{b_{i}}{1+b_{i}}\right)=\frac{\kappa}{\gamma}~d_{i}(1d_{i}),} $$(26) 
∙
$$ {\frac{\partial}{\partial \gamma} \log{(1d_{i})} =\frac{\partial}{\partial \gamma}\log{(1+b_{i})}=\frac{\kappa}{\gamma}~\frac{b_{i}}{1+b_{i}} = \frac{\kappa}{\gamma}~d_{i}.} $$(27)
Using (7) and (15)(27), we can derive the first and second derivatives of the loglikelihood function as follows.
The maximum likelihood estimate of θ ^{∗}=(α ^{∗} ^{′},β ^{′})^{′} is obtained by solving the equations ∂ ℓ(θ ^{∗})/∂ κ ^{∗}=0, ∂ ℓ(θ ^{∗})/∂ γ ^{∗}=0, ∂ ℓ(θ ^{∗})/∂ ρ ^{∗}=0 and ∂ ℓ(θ ^{∗})/∂ β _{ j }=0 iteratively. The first and second derivatives of ℓ(θ ^{∗}) can be derived by noting that
Using (29)(30), the first and second derivatives of ℓ(θ ^{∗}) can be expressed as
References
Akaike, H: A new look at the statistical model identification. IEEE Trans. Autom. Control. 19, 716–723 (1974).
Burnham, KP, Anderson, DR: Model Selection and Multimodel Inference: A Practical InformationTheoretic Approach. Springer, New York (2002).
Collett, D: Modelling Survival Data in Medical Research. Chapman and Hall/CRC, Florida (2003).
Cox, DR: Regression models and lifetables. J. R. Stat. Soc. Ser. B. 34, 187–220 (1972).
Efron, B: The efficiency of Cox’s likelihood function for censored data. J. Am. Stat. Assoc. 72, 557–565 (1977).
Efron, B: Logistic regression, survival analysis, and the kaplanmeier curve. J. Am. Stat. Assoc. 83, 414–425 (1988).
Hsieh, F, Tseng, YK, Wang, JL: Joint modeling of survival and longitudinal data: Likelihood approach revisited. Biometrics. 62, 1037–1043 (2006).
Hwang, BS, Pennell, ML: Semiparametric bayesian joint modeling of a binary and continuous outcome with applications in toxicological risk assessment. Stat. Med. 33, 1162–1175 (2014).
Kalbfleisch, JD, Prentice, RL: The Statistical Analysis of Failure Time Data. John Wiley & Sons, New Jersey (2002).
Kaplan, EL, Meier, P: Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 53, 457–481 (1958).
Klein, JP, Moeschberger, ML: Survival Analysis: Techniques for Censored and Truncated Data. Springer, New York (2003).
Kleinbaum, DG, Klein, M: Survival Analysis: A SelfLearning Text. Springer, New York (2012).
Lawless, JF: Statistical Models and Methods for Lifetime Data. John Wiley & Sons, New Jersey (2002).
Mudholkar, GS, Srivastava, DK, Kollia, GD: A generalization of the weibull distribution with application to the analysis of survival data. J. Am. Stat. Assoc. 91, 1575–1583 (1996).
Nardi, A, Schemper, M: Comparing Cox and parametric models in clinical studies. Stat. Med. 22, 3597–3610 (2003).
Oakes, D: The asymptotic information in censored survival data. Biometrika. 64, 441–448 (1977).
Pike, MC: A method of analysis of certain classes of experiments in carcinogenesis. Biometrics. 22, 142–161 (1966).
R Core Team: R: A Language and Environment for Statistical Computing. Foundation for Statistical Computing, R. R Core Team, Vienna (2016).
Rizopoulos, D: Jm: An r package for the joint modelling of longitudinal and timetoevent data. J. Stat. Softw. 35, 1–33 (2010).
Rizopoulos, D: Joint Models for Longitudinal and TimetoEvent Data With Applications in R. Chapman and Hall/CRC, Florida (2012).
Wang, Y, Hossain, AM, Zimmer, WJ: Useful properties of the threeparameter Burr XII distribution. In: Ahsanullah, M (ed.)Applied Statistics Research Progress, pp. 11–20. Nova Science Publishers, New York (2008).
Wulfsohn, MS, Tsiatis, AA: A joint model for survival and longitudinal data measured with errror. Biometrics. 53, 330–339 (1997).
Acknowledgements
The authors acknowledge the comments and suggestions of the editor and the reviewers. This work was partially supported by NSERC through Discovery Grant (#368532) to SA Khan, and the University of Saskatchewan through New Faculty Startup Operating Fund to SA Khan.
Authors’ contributions
SAK, the principal investigator, conceptually developed the proposed distribution with related mathematical results and R code for computation, and drafted the manuscript. SKK contributed in developing mathematical results and writing Sections 13. Both authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Author information
Additional file
Additional file 1
Supporting document – R code to fit the models. (PDF 106 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Received
Accepted
Published
DOI
Keywords
 Cox PH
 Loglogistic distribution
 Parametric model
 Proportional hazard
 Semiparametric model
 Timetoevent data
 Weibull distribution
AMS Subject Classification
 Primary 62N01; Secondary 62P10