# Nonlinear mixed-effects HIV dynamic models with considering left-censored measurements

## Abstract

HIV dynamic models offer a different perspective for studying HIV pathogenesis and developing treatment strategies for AIDS patients. Many HIV dynamic models have recently been developed to characterize short-term AIDS treatment, whereas in long-term HIV dynamics, viral load often rebounds in the later stage of treatment, primarily due to reduced drug efficacy. Although time-varying drug efficacy can be incorporated into the ordinary differential equation (ODE) model, such a system has no analytical solution, and the measurement of viral load is usually censored at the detection limit due to technological constraints. We consider a nonlinear mixed-effects ODE model coupled with the stochastic approximation EM algorithm to overcome these difficulties. The performance of the proposed method is illustrated by means of a simulation study and a real-data application. Numerical evidence shows that the estimated HIV infection is generally more severe when left-censored data are taken into account. The T cell production rate from the body source varies among patients, but the death rate of infected T cells, the infection rate of the virus, and the other dynamic parameters do not differ much among patients. We hope these findings inspire more research on clarifying the biological mechanism of HIV infection and developing better treatments.

## 1 Introduction

The HIV dynamic model, a set of ordinary differential equations (ODE) that describes the interaction between HIV and human body cells, has proven useful for understanding the pathogenesis of HIV infection and developing treatment strategies. In order to estimate biologically/clinically meaningful parameters in the HIV dynamic model, many statistical models have been developed in the last decade, ranging from the simple nonlinear least squares (NLS) approach to more general nonlinear mixed-effects modeling approaches (Nowak and May 2000; Perelson and Nelson 1999; Perelson et al. 1996; Perelson et al. 1997; Wu and Ding 1999; Wu et al. 1998; Wu et al. 2005). However, in more complex scenarios, the ODE system has no closed-form solution and thus needs to be solved numerically. Note that data from viral dynamic studies usually consist of repeated viral load measurements taken over time for each subject. In addition, the viral dynamic processes share certain similar patterns among patients while still having distinct individual characteristics. These properties indicate that nonlinear mixed-effects (NLME) models are reasonable for modeling HIV dynamics.

In short-term (a few hours) AIDS treatment, viral load remains sustained, whereas in long-term treatment (a few weeks to years), viral load usually decreases at the beginning of treatment and then rebounds later, which leads to treatment failure. Various reasons could contribute to the failure, one of which is non-constant drug efficacy due to time-varying drug resistance, drug adherence, and/or pharmacokinetic factors such as the median inhibitory concentration (IC50). Huang et al. (2006) incorporated these clinical/PK factors into a time-varying drug efficacy model to characterize the viral load resurgence featured in the long-term (24 weeks) AIDS treatment trajectory. This approach is capable of capturing the long-term treatment effect, but difficulty arises when the solution of the ODE cannot be linearized, because no analytical formula exists. Accordingly, the classical nonlinear mixed-effects modeling approach is not applicable in this scenario.

Notably, only a few methods have recently been proposed to tackle this problem in the longitudinal setting. For example, Huang et al. (2006) combined the Bayesian approach with the mixed-effects model to estimate both population and individual parameters within a framework of hierarchical Bayesian nonlinear models. From a frequentist point of view, Guedj et al. (2007) adapted a Newton-like approach using only first derivatives to deal with the problem. Later on, in the presence of left-censoring, Huang and Getachew (2012) investigated the NLME model with a skew-t (ST) distribution for the response process and the ST nonparametric mixed-effects model for the covariate measurement error process. However, the first approach does not model viral load data below the detection limit and simply imputes the missing data with the detection limit. The second approach requires repeatedly evaluating integrals through adaptive Gaussian quadrature, which quickly becomes intractable when the dimension of the random effects is high. In addition, the computation of this approach is very complicated. The third approach is only applicable to short-term HIV dynamics and cannot be directly applied to long-term HIV dynamics. In particular, we may face severe bias in parameter estimation if we simply ignore censoring (below detection limit) or do not handle it carefully. As an illustration, in the simulation study performed later, we show that either imputing censored viral load with the detection limit or completely removing the censored data leads to large bias.

A classical way to deal with the incomplete or missing data problem is the expectation-maximization (EM) algorithm, originally proposed by Dempster et al. (1977). In the E-step of the EM algorithm, one computes the expectation of the complete log-likelihood with respect to the distribution of the missing data at the current estimates. The M-step maximizes the expected log-likelihood function to update the unknown parameters. The two steps are then iterated until convergence is achieved. Within the mixed-effects modeling framework, the random effects can be treated as missing data, so that the EM algorithm is capable of estimating the fixed effects after averaging over the random effects in the E-step. The below detection limit data are also missing and can thus be averaged along with the random effects at the E-step. Nevertheless, the E-step usually becomes infeasible when the number of censored data and/or the dimension of the random effects increases, since numerical integration becomes intractable in high-dimensional situations.

To circumvent this difficulty, within linear mixed-effects model setting, Hughes (1999) used Monte Carlo integration to evaluate the E-step by repeatedly sampling from the conditional distribution of below detection data. Wu (2002,2004) extended the Monte Carlo version of EM (MCEM) for the nonlinear mixed-effects model. However, sampling the missing data is often time consuming and the algorithm is also slow to converge. Delyon et al. (1999) proposed a stochastic approximation version of EM (SAEM) to obtain the maximum likelihood estimate (MLE) of the unknown parameter. Kuhn and Lavielle (2005) coupled the Markov chain Monte Carlo (MCMC) procedure to the SAEM so that only one simulation of missing data at each iteration is required. The algorithm has been proven to converge quickly toward the MLE under general conditions. For theoretical proof, we refer readers to Delyon et al. (1999) and Kuhn and Lavielle (2005). In this paper, we couple the SAEM algorithm with the nonlinear mixed-effects model under the scenarios in which the ODE system has no analytical solution and the data from long-term AIDS treatment are left-censored. The performance of the proposed method is illustrated by means of a simulation study and a real-data application. Numerical evidence shows that the HIV infection is generally more severe when considering left-censored data.

The remainder of the paper is organized as follows. In Section 2, we introduce HIV dynamic model and time-varying drug efficacy model, which are used to model long-term AIDS treatment. In Section 3, the statistical model and method of estimating parameters are presented. In Section 4, we carry out a simulation study to examine the performance of the proposed method and in Section 5, we evaluate the performance of our method via analysis of the real AIDS clinical trial study. We conclude with a brief discussion in Section 6.

## 2 Model specification

### 2.1 Antiviral response model

One of the commonly used HIV dynamic models that describes the interaction between human body cells and virus is given by (Huang et al. 2006):

$\begin{aligned} \frac{d}{dt}T(t) &= \lambda - \rho T(t) - [1-\gamma(t)]\,k\,T(t)V(t), \\ \frac{d}{dt}T^{*}(t) &= [1-\gamma(t)]\,k\,T(t)V(t) - \delta T^{*}(t), \\ \frac{d}{dt}V(t) &= N\delta T^{*}(t) - cV(t), \end{aligned}$
(1)

where $T(t)$, $T^{*}(t)$, and $V(t)$ stand for uninfected T cells, infected T cells, and virus at time t, respectively, λ represents the rate at which new T cells are generated from a body source such as the thymus, ρ is the death rate of uninfected T cells, k is the infection rate of T cells, δ is the death rate of infected T cells, N is the number of new virions produced by each infected T cell during its lifetime, and c is the clearance rate of free virions. The time-varying parameter γ(t) is the antiviral drug efficacy as defined below. In this paper, we assume that the system of equations (1) is in a steady state before antiretroviral treatment is initiated, so the initial conditions for (1) are given by

$T_0 = \frac{c}{kN}, \quad T_0^{*} = \frac{cV_0}{\delta N}, \quad \text{and} \quad V_0 = \frac{\lambda N}{c} - \frac{\rho}{k}.$
(2)

The above ordinary differential equations have no analytical solution and thus need to be solved numerically. One feature that distinguishes this dynamic model from others is that, by considering time-varying drug efficacy, we are capable of characterizing the viral rebound in long-term antiviral treatment, which is typically not seen in the short-term period. On the other hand, more complexity may be added to the model by incorporating various clinical and/or pharmacokinetic (PK) factors such as drug susceptibility, drug resistance, and drug adherence. As a result, statistical inference becomes more challenging.
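As a rough illustration of system (1) with the steady-state initial conditions (2), the following sketch integrates the equations over one day of treatment. Several things here are assumptions made only for illustration: the efficacy is held constant at 0.9 instead of following a time-varying γ(t), λ = 100 is an arbitrary value, the function names are hypothetical, and a fixed-step classical Runge-Kutta scheme is used for brevity (a stiff solver would generally be more appropriate). The values ρ = 0.1, N = 980, k = 0.0001, log c = 1.1, and log δ = -1 match those used in the simulation section.

```python
import math

def steady_state(lam, rho, k, delta, N, c):
    """Pre-treatment steady state (T0, T0*, V0) from equation (2)."""
    T0 = c / (k * N)
    V0 = lam * N / c - rho / k
    Ts0 = c * V0 / (delta * N)
    return (T0, Ts0, V0)

def rhs(state, gamma, lam, rho, k, delta, N, c):
    """Right-hand side of system (1) for a given efficacy value gamma."""
    T, Ts, V = state
    infection = (1.0 - gamma) * k * T * V
    return (lam - rho * T - infection,
            infection - delta * Ts,
            N * delta * Ts - c * V)

def integrate(state, t_end, h, gamma_fn, **p):
    """Fixed-step classical RK4 from t = 0 to t_end (illustration only)."""
    def shifted(s, d, w):
        # State s moved by w times the derivative d.
        return tuple(si + w * di for si, di in zip(s, d))
    t = 0.0
    for _ in range(int(round(t_end / h))):
        k1 = rhs(state, gamma_fn(t), **p)
        k2 = rhs(shifted(state, k1, 0.5 * h), gamma_fn(t + 0.5 * h), **p)
        k3 = rhs(shifted(state, k2, 0.5 * h), gamma_fn(t + 0.5 * h), **p)
        k4 = rhs(shifted(state, k3, h), gamma_fn(t + h), **p)
        state = tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c_ + d)
                      for s, a, b, c_, d in zip(state, k1, k2, k3, k4))
        t += h
    return state

# Illustrative parameter values (lam and the constant efficacy are assumptions).
params = dict(lam=100.0, rho=0.1, k=1e-4,
              delta=math.exp(-1.0), N=980.0, c=math.exp(1.1))
start = steady_state(**params)
after_one_day = integrate(start, t_end=1.0, h=0.01,
                          gamma_fn=lambda t: 0.9, **params)
```

With zero efficacy the right-hand side vanishes at `start`, which is a quick check on (2); once treatment begins, the viral load component of `after_one_day` falls below its starting value while uninfected T cells recover.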

### 2.2 Antiviral drug efficacy model

In many studies, drug efficacy is assumed to be either constant or perfect over treatment time; see, for example, Perelson et al. (1997), Perelson and Nelson (1999), and Ding and Wu (2000), among others. Such an assumption may be reasonable over a short-term period, whereas in long-term treatment, drug efficacy varies across time due to drug resistance, imperfect drug adherence, and/or other clinical/pharmacological factors, which could lead to treatment failure. To deal with this issue, we advocate the modified $E_{\max}$ model to account for time-varying drug efficacy (Huang et al. 2006). In this paper, we consider a regimen of two antiretroviral drugs, whose efficacy is given by

$\gamma(t) = \frac{C_1 A_1(t)/IC_{50}^{1}(t) + C_2 A_2(t)/IC_{50}^{2}(t)}{\phi + C_1 A_1(t)/IC_{50}^{1}(t) + C_2 A_2(t)/IC_{50}^{2}(t)},$
(3)

where $C_1$ and $C_2$ represent the drug concentrations or other PK parameters such as $C_{12h}$. The median inhibitory concentrations $IC_{50}^{1}(t)$ and $IC_{50}^{2}(t)$ are used to quantify drug susceptibility (Molla et al. 1996) and are written as

$IC_{50}(t) = \begin{cases} I_0 + \dfrac{I_r - I_0}{t_r}\,t & \text{for } 0 < t < t_r, \\ I_r & \text{for } t \ge t_r, \end{cases}$
(4)

where $I_0$ and $I_r$ are the respective values of $IC_{50}(t)$ at baseline and at time point $t_r$, which is the time of virological failure. Let ϕ be a conversion factor between the in vitro and in vivo $IC_{50}(t)$; it can be estimated from data. Here, $A_1(t)$ and $A_2(t)$ are the adherence profiles of the two drugs, which are measured by pill counts and are modeled as

$A(t) = \begin{cases} 1 & \text{if all doses are taken in } (T_k, T_{k+1}], \\ R_k & \text{if } 100R_k\% \text{ of doses are taken in } (T_k, T_{k+1}], \end{cases}$
(5)

where $T_k$ is the time of the $k$th clinical visit and $R_k$ represents the proportion of the prescribed doses taken by the patient during the interval between two visits. In particular, if all of the prescribed drugs are taken, $R_k$ is one. Poor adherence has been suggested as one of the major causes of treatment failure, and it is thus necessary to take this factor into account; see, for example, Besch (1995).
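Equations (3) to (5) can be sketched in a few lines of code. The function names, the dictionary layout used to describe each drug, and all numeric values below are hypothetical and chosen only so that the arithmetic is easy to follow; they are not taken from the clinical data.

```python
from bisect import bisect_left

def ic50(t, i0, ir, tr):
    """Equation (4): linear change from I0 towards Ir before tr, then constant at Ir."""
    if t >= tr:
        return ir
    return i0 + (ir - i0) / tr * t

def adherence(t, visit_times, rates):
    """Equation (5): rates[k] applies on the interval (visit_times[k], visit_times[k+1]]."""
    k = bisect_left(visit_times, t) - 1
    k = min(max(k, 0), len(rates) - 1)  # clamp times outside the visit range
    return rates[k]

def efficacy(t, drugs, phi):
    """Equation (3): combined efficacy gamma(t) of the drugs in the regimen."""
    total = sum(
        d["C"] * adherence(t, d["visits"], d["rates"])
        / ic50(t, d["I0"], d["Ir"], d["tr"])
        for d in drugs
    )
    return total / (phi + total)

# Hypothetical two-drug regimen; adherence of drug 1 drops to 50% after day 7.
drugs = [
    dict(C=4.0, visits=[0, 7, 14], rates=[1.0, 0.5], I0=1.0, Ir=3.0, tr=10.0),
    dict(C=6.0, visits=[0, 7, 14], rates=[1.0, 1.0], I0=2.0, Ir=4.0, tr=10.0),
]
gamma_day5 = efficacy(5.0, drugs, phi=4.0)
gamma_day12 = efficacy(12.0, drugs, phi=4.0)
```

At day 5 each drug contributes 2 to the numerator, so `gamma_day5` is 4 / (4 + 4) = 0.5. By day 12 both the rise in IC50 and the drop in adherence lower the efficacy, which is the mechanism behind the rebound described above.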

## 3 Statistical model and parameter estimation

### 3.1 Nonlinear mixed-effects ODE model

Viral load data from an HIV/AIDS clinical trial consist of repeated measurements on a group of patients, so a hierarchical modeling approach is necessary to account for within-subject and between-subject variation simultaneously. We are interested in estimating the biologically/clinically meaningful parameters in (1) and conducting statistical inference while taking the below detection limit measurements of all patients into account. In longitudinal data analysis, the mixed-effects model is often used to characterize both within- and between-subject variation. Let $y_{ij}$ be the logarithmic measurement of viral load for subject $i$ at time $t_{ij}$, for $i = 1, \ldots, n$ and $j = 1, \ldots, n_i$; let the population parameter be $\mu = (\log c, \log\delta, \log\lambda, \log\rho, \log N, \log k, \log\phi)^T$ and the individual parameter be $\theta_i = (\log c_i, \log\delta_i, \log\lambda_i, \log\rho_i, \log N_i, \log k_i, \log\phi_i)^T$. Let $g(\theta_i, t_{ij}) = \log_{10}(V_{ij}(\theta_i, t_{ij}))$, with $V_{ij}(\theta_i, t_{ij})$ being the true viral load based on (1) for subject $i$ at time $t_{ij}$. Following Davidian and Giltinan (1995), a natural nonlinear mixed-effects ODE (NLME-ODE) model is specified as

(i) Within-subject variation: $y_{ij} = g(\theta_i, t_{ij}) + \varepsilon_{ij}$. The measurement error $\varepsilon_{ij}$ is assumed to follow a normal distribution with mean zero and variance $\sigma^2$.

(ii) Between-subject variation: $\theta_i = \mu + b_i$. The random effect $b_i$ characterizes the deviation of the individual parameters from the population level, and we assume $b_i \sim N(0, D)$.

It is worth mentioning that the model described here differs from the classical NLME model in that g(·) has no explicit form in the long-term HIV dynamic setting, which leads to additional challenges in estimating the parameters. A few methods, such as the Bayesian approach and a Newton-like method, have been proposed to address these challenges; see, for example, Huang et al. (2006) and Guedj et al. (2007). However, these procedures either ignore the missing data (below detection limit) mechanism or are not applicable in the high-dimensional setting. Here we advocate the SAEM algorithm, which takes below detection limit data into account when estimating the parameters of the longitudinal HIV dynamic model.

### 3.2 Parameter estimation

Both the random effects of the NLME-ODE model and the below detection limit (left-censored) viral load can be treated as missing data. The EM algorithm is able to deal with missing data: the expected log-likelihood with respect to the distribution of the missing components is obtained at the E-step, and the parameter estimates are updated through maximization at the M-step. In light of the high dimensionality of the random effects and censored data, the SAEM algorithm coupled with MCMC provides a convenient way of drawing samples at the E-step (Delyon et al. 1999; Kuhn and Lavielle 2005). The details of this method are described below.

#### 3.2.1 A. Expectation

From one point of view, both the individual parameters $\theta_i$ and the below detection limit data can be treated as missing data. A classical way to cope with missing data is the EM algorithm proposed by Dempster et al. (1977). Let $\theta = (\theta_1, \ldots, \theta_n)$. Denote by $Y^o$ and $Y^m$ the observations above and below the detection limit, respectively, for all subjects across the study period. Let $L(Y^o, Y^m, \theta; \mu, D, \sigma^2)$ represent the complete-data likelihood. The MLE of $(\mu, D, \sigma^2)$ is determined by the marginal likelihood of the observed data, $L(Y^o; \mu, D, \sigma^2)$, but this quantity is often intractable. As an alternative, the EM algorithm calculates the expected value of the complete log-likelihood function with respect to the joint conditional distribution of $(Y^m, \theta)$ given $Y^o$ under the current estimates of the parameters $(\mu^{(k)}, D^{(k)}, \sigma^{2(k)})$:

$Q(\mu, D, \sigma^2 \mid \mu^{(k)}, D^{(k)}, \sigma^{2(k)}) = E_{Y^m, \theta \mid Y^o, \mu^{(k)}, D^{(k)}, \sigma^{2(k)}}\left[\log L(\mu, D, \sigma^2; Y^o, Y^m, \theta)\right].$
(6)

For simplicity of notation, let $I_o = \{(i,j): y_{ij} \ge DL\}$, with DL being the detection limit, and let $y_{ij}^{o}$ be the corresponding observations; similarly, let $I_m = \{(i,j): y_{ij} < DL\}$ and let $y_{ij}^{m}$ be the corresponding missing data. Also, let $n_t$ be the total number of observations and $n_s$ be the number of subjects. In the long-term HIV treatment, it follows

(7)
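For reference, the normal assumptions (i) and (ii) of Section 3.1 imply a complete-data log-likelihood of the following form. This is a sketch derived from those two assumptions, written in the notation introduced above; the grouping of terms and the constants are choices made here.

$\log L(\mu, D, \sigma^2; Y^o, Y^m, \theta) = -\frac{n_t}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\Big\{\sum_{(i,j)\in I_o}\big[y_{ij}^{o} - g(\theta_i, t_{ij})\big]^2 + \sum_{(i,j)\in I_m}\big[y_{ij}^{m} - g(\theta_i, t_{ij})\big]^2\Big\} - \frac{n_s}{2}\log|2\pi D| - \frac{1}{2}\sum_{i=1}^{n_s}(\theta_i - \mu)^T D^{-1}(\theta_i - \mu).$

The first two terms come from the within-subject model and the last two from the between-subject model. Setting the derivatives with respect to $\mu$, a diagonal $D$, and $\sigma^2$ to zero gives a sample mean, a sample variance, and a mean squared residual, respectively.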

Of particular note is that when g(·) is a linear function of θ, the conditional distribution in equation (6) is easily shown to be normal. For our problem, g(·) is the result of numerically integrating the ODE system in (1), which is not only a nonlinear function of θ but also has no closed-form expression. Accordingly, we follow the idea of the stochastic approximation version of the EM algorithm (SAEM) (Delyon et al. 1999) and evaluate equation (6) as follows.

#### 3.2.2 B. Gibbs sampler for incomplete data

It has long been known that the Gibbs sampler is useful for simulating data from a joint posterior distribution (Gelfand et al. 1990; Wakefield 1996). In our case, at the $k$th iteration, $\theta$ and $Y^m$ can be alternately generated from the joint posterior distribution $P(\theta, Y^m \mid Y^o, \mu^{(k-1)}, D^{(k-1)}, \sigma^{2(k-1)})$, as summarized in the following two steps.

Step 1 Simulate $Y^{m(k)}$ from the conditional posterior distribution $P(Y^m \mid \theta^{(k-1)}, Y^o, \mu^{(k-1)}, D^{(k-1)}, \sigma^{2(k-1)})$, which is a normal distribution truncated at the detection limit. Each $y_{ij}^{m(k)}$ is centered at $g(\theta_i^{(k-1)}, t_{ij})$ with variance $\sigma^{2(k-1)}$ and can thus be simulated as follows (Breslaw 1994):

(a) calculate the cumulative probability of the detection limit under the untruncated normal distribution of $y_{ij}^{m(k)}$ and denote it as $P_{DL}$;

(b) draw $u$ from the uniform distribution $U(0, 1)$; and

(c) obtain a sample of $y_{ij}^{m(k)}$ as $y_{ij}^{m(k)} = g(\theta_i^{(k-1)}, t_{ij}) + \sigma^{(k-1)}\Phi^{-1}[u \times P_{DL}]$, where Φ is the standard normal cumulative distribution function.

It should be noted that this sampling algorithm requires only one draw at each iteration and is therefore efficient.
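Steps (a) to (c) amount to inverse-CDF sampling from a normal distribution truncated above at the detection limit. A minimal sketch using only the Python standard library is given below; the function name and the numeric values (a model prediction of 2.0 on the log10 scale and a standard deviation of 0.3) are hypothetical.

```python
import math
import random
from statistics import NormalDist

def sample_censored(g_value, sigma, dl, rng):
    """Draw one value below dl from N(g_value, sigma^2) truncated at dl."""
    std_normal = NormalDist()
    # Step (a): probability mass of the untruncated normal below dl.
    p_dl = std_normal.cdf((dl - g_value) / sigma)
    # Step (b): uniform draw; redraw an exact zero so inv_cdf stays defined.
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    # Step (c): invert the CDF on the interval (0, p_dl).
    return g_value + sigma * std_normal.inv_cdf(u * p_dl)

rng = random.Random(0)
dl = math.log10(25.0)  # detection limit of 25 copies/ml on the log10 scale
draws = [sample_censored(2.0, 0.3, dl, rng) for _ in range(1000)]
```

Every element of `draws` lies at or below `dl`, because `u * p_dl` never exceeds the probability mass below the detection limit. The sketch does not guard against `p_dl` underflowing to zero, which could happen if the model prediction were many standard deviations above the detection limit.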

Step 2 Simulate $\theta^{(k)}$ from the conditional posterior distribution $P(\theta \mid Y^{m(k)}, Y^o, \mu^{(k-1)}, D^{(k-1)}, \sigma^{2(k-1)})$, which has no closed-form expression but is proportional to (7) with all the parameters given at their current values. The Metropolis-Hastings (M-H) algorithm is capable of generating samples from this distribution. One choice of the proposal distribution is $q^{(k-1)} \sim N(\mu^{(k-1)}, D^{(k-1)})$. The procedure then proceeds as follows:

(a) Calculate the acceptance probability $\alpha(\varphi \mid \theta^{(k-1)})$ as

$\min\left\{1,\ \frac{P(\varphi \mid Y^{m(k)}, Y^o, \mu^{(k-1)}, D^{(k-1)}, \sigma^{2(k-1)})}{P(\theta^{(k-1)} \mid Y^{m(k)}, Y^o, \mu^{(k-1)}, D^{(k-1)}, \sigma^{2(k-1)})} \cdot \frac{q^{(k-1)}(\theta^{(k-1)} \mid \mu^{(k-1)}, D^{(k-1)})}{q^{(k-1)}(\varphi \mid \mu^{(k-1)}, D^{(k-1)})}\right\},$
(8)

where $\varphi$ is a candidate simulated from $q^{(k-1)}$. If we assume that each $\theta_i$ is independent, then $D$ is diagonal. We may thus simulate $\varphi$ for each $i$ (denoted $\varphi_i$) separately. After some rearrangement, the acceptance probability α simplifies to

$\alpha(\varphi_i \mid \theta^{(k-1)}) = \min\{1, R_i\},$
(9)

where

(10)
(b) For each $i$, draw $u$ from the uniform distribution $U(0, 1)$. If $u \le \alpha(\varphi_i \mid \theta^{(k-1)})$, then accept $\varphi_i$ as the new $\theta_i^{(k)}$; otherwise keep $\theta_i^{(k-1)}$ as $\theta_i^{(k)}$.
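The ratio $R_i$ can be worked out from the model under the proposal stated above. Because the proposal $N(\mu^{(k-1)}, D^{(k-1)})$ coincides with the assumed distribution of the random effects, the prior and proposal densities cancel in (8) and only a likelihood ratio remains. The following is a sketch of that simplification derived from assumptions (i) and (ii); the symbol $y_{ij}^{(k)}$, standing for either an observed value or the draw from Step 1, is notation assumed here.

$R_i = \exp\left\{-\frac{1}{2\sigma^{2(k-1)}}\sum_{j=1}^{n_i}\Big(\big[y_{ij}^{(k)} - g(\varphi_i, t_{ij})\big]^2 - \big[y_{ij}^{(k)} - g(\theta_i^{(k-1)}, t_{ij})\big]^2\Big)\right\}.$

A candidate that fits the subject's data better than the current value has $R_i > 1$ and is always accepted, while a worse-fitting candidate is accepted with probability $R_i$.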

Notice that in many implementations of the M-H algorithm, the choice of variance for the proposal density q is essential to the efficiency of the algorithm. Here, there is no need to choose an appropriate variance manually: since the variance D is always estimated from the last iteration, the algorithm updates the proposal variance automatically and is thus adaptive. Often, a candidate parameter φ simulated from q makes the numerical integration of (1) unstable, a phenomenon known as stiffness. To handle the stiff ODEs, we apply a Rosenbrock method, which is relatively easy to implement and provides good accuracy. We refer interested readers to Kaps and Rentrop (1979) for more details.
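The accept/reject logic of Step 2 can be sketched as follows, assuming the proposal equals the current random-effects distribution so that the acceptance ratio reduces to a likelihood ratio. The function `toy_g` is a hypothetical stand-in for the ODE-based g(·), used only so that the example runs without an ODE solver, and all names and values are illustrative.

```python
import math
import random

def sse(y, t, theta, g):
    """Sum of squared residuals for one subject."""
    return sum((y_j - g(theta, t_j)) ** 2 for y_j, t_j in zip(y, t))

def log_ratio(y, t, phi, theta, sigma2, g):
    """Log of the acceptance ratio when the proposal equals the prior."""
    return -(sse(y, t, phi, g) - sse(y, t, theta, g)) / (2.0 * sigma2)

def mh_step(y, t, theta, mu, d2, sigma2, g, rng):
    """One M-H update for a subject; d2 holds the diagonal of D."""
    phi = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, d2)]
    log_r = log_ratio(y, t, phi, theta, sigma2, g)
    # Accept with probability min(1, R); compare on the log scale when R >= 1.
    if log_r >= 0.0 or rng.random() <= math.exp(log_r):
        return phi, True
    return theta, False

def toy_g(theta, t):
    """Hypothetical linear decline; NOT the solution of system (1)."""
    return theta[0] - math.exp(theta[1]) * t

times = [0.0, 1.0, 2.0, 3.0]
truth = [5.0, -1.0]
y = [toy_g(truth, t_j) for t_j in times]
rng = random.Random(2)
theta = [4.0, -1.0]
n_accepted = 0
for _ in range(200):
    theta, accepted = mh_step(y, times, theta, [5.0, -1.0], [0.25, 0.04],
                              0.1, toy_g, rng)
    n_accepted += accepted
```

Because the proposal mean and variance are passed in as arguments, re-estimating `mu` and `d2` at each SAEM iteration changes the proposal automatically, which mirrors the adaptivity described above.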

#### 3.2.3 C. Maximization

Once $\theta$ and $Y^m$ are simulated, it is straightforward to update $(\mu, D, \sigma^2)$ by maximizing equation (6), and we obtain

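$\begin{aligned} \mu^{(k)} &= \frac{1}{n_s}\sum_i \theta_i^{(k)}, \\ \mathrm{diag}\,D^{(k)} &= \frac{1}{n_s}\sum_i \theta_i^{2(k)} - \mu^{2(k)}, \\ \sigma^{2(k)} &= \frac{1}{n_t}\Big\{\sum_{(i,j)\in I_o}\big[y_{ij}^{o} - g(\theta_i^{(k)}, t_{ij})\big]^2 + \sum_{(i,j)\in I_m}\big[y_{ij}^{m(k)} - g(\theta_i^{(k)}, t_{ij})\big]^2\Big\}. \end{aligned}$
(11)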

Note that the above estimates are functions of the minimal sufficient statistics of $(\mu, D, \sigma^2)$. Denote $s_1 = \sum_i \theta_i$, $s_2 = \sum_i \theta_i^2$, and $s_3 = \sum_{(i,j)\in I_o}[y_{ij}^{o} - g(\theta_i, t_{ij})]^2 + \sum_{(i,j)\in I_m}[y_{ij}^{m} - g(\theta_i, t_{ij})]^2$. The stochastic approximation step of SAEM updates $s_1$, $s_2$, and $s_3$ with a step-size sequence $\gamma^{(k)}$ at the $k$th iteration, $s_l^{(k)} = (1 - \gamma^{(k)})\,s_l^{(k-1)} + \gamma^{(k)}\,\tilde{s}_l^{(k)}$ for $l = 1, 2, 3$, where $\tilde{s}_l^{(k)}$ is the statistic evaluated at the data simulated in the $k$th iteration. Kuhn and Lavielle (2005) recommended using $\gamma^{(k)} = 1$ for the first $K_1$ iterations, followed by a diminishing $\gamma^{(k)} = 1/(k - K_1)$ for another $K_2$ iterations, in order to satisfy the assumptions of SAEM and to ensure the convergence of the algorithm. It is worth mentioning that variance estimates can be obtained from the inverse of the observed Fisher information matrix of (7).
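The stochastic approximation update and the closed-form M-step in (11) can be sketched as below. The function names are hypothetical, and the sufficient statistics are stored as plain lists with one entry per component of θ.

```python
def step_size(k, k1):
    """Step size: 1 for the first K1 iterations, then 1 / (k - K1)."""
    return 1.0 if k <= k1 else 1.0 / (k - k1)

def sa_update(s_old, s_new, gamma):
    """Blend the previous statistic with the one from the current simulation."""
    return [(1.0 - gamma) * a + gamma * b for a, b in zip(s_old, s_new)]

def m_step(s1, s2, s3, n_s, n_t):
    """Closed-form update (11) from the sufficient statistics."""
    mu = [a / n_s for a in s1]
    d2 = [b / n_s - m * m for b, m in zip(s2, mu)]
    sigma2 = s3 / n_t
    return mu, d2, sigma2

# Two subjects with a one-dimensional theta of 1 and 3, four observations.
mu, d2, sigma2 = m_step(s1=[4.0], s2=[10.0], s3=2.0, n_s=2, n_t=4)
blended = sa_update([2.0], [4.0], gamma=step_size(12, 10))
```

In this small example the mean is 2, the variance is 10/2 - 2² = 1, and the residual variance is 2/4 = 0.5. With a step size of 1 the update simply replaces the old statistic, so the first $K_1$ iterations behave like a stochastic EM, and the later averaging damps the simulation noise.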

## 4 Simulation

We illustrate the effectiveness of our method through an example of a long-term HIV dynamic study with below detection limit measurements. The simulation experiment is designed to resemble an actual AIDS clinical trial: 40 patients were involved in a long-term antiretroviral treatment and followed for up to 200 days. Some patients were lost to follow-up for various reasons, as in the real study. Consequently, the simulated viral load data are unbalanced longitudinal measurements. All data for the pharmacokinetic factors ($C_1$, $C_2$), the phenotype marker (baseline and failure IC50s), and adherence, as well as the baseline viral load ($V_0$), were taken from an AIDS clinical trial study.

In order to avoid the parameter identifiability problem, ρ, N, and k are fixed at 0.1, 980, and 0.0001, respectively. These values are chosen from previous studies in the literature (Ding and Wu 2000; Nowak and May 2000; Perelson and Nelson 1999). In addition, we assume the system is at steady state at the beginning of the treatment. This indicates that only three parameters in (1) need to be estimated. Denote $\mu = (\log c, \log\delta, \log\phi)^T$ and $\theta_i = (\log c_i, \log\delta_i, \log\phi_i)^T$. The diagonal covariance matrix $D$ has a vector of diagonal elements $d^2 = (d_c^2, d_\delta^2, d_\phi^2)^T$. The value of $\log\lambda$ is obtained from (2). Altogether, there are seven parameters in the model.

The true values for $\mu = (\log c, \log\delta, \log\phi)^T$ and $d^2 = (d_c^2, d_\delta^2, d_\phi^2)^T$ are set to $(1.1, -1, 3)^T$ and $(0.25, 0.04, 0.0625)^T$, respectively. These values are similar to the ones used in the simulation study of Huang et al. (2006). The true value for $\sigma^2$ is 0.1 and the detection limit is 25, as in the real study. The true values of the parameters $\theta_i$ for $i = 1, \ldots, 40$ are simulated independently from $N(\mu, D)$. Given the PK/clinical data ($C$, $IC_{50}$, $A(t)$), the true parameters ($\theta_i$), and the initial viral load ($V_0$), the viral load $V_{ij}$ is generated by numerically solving the ODE system in (1) for each individual. A measurement error term $\varepsilon_{ij}$ is then added to $\log V_{ij}$ to mimic the observed data $\log y_{ij}$ for the $i$th subject at time $t_{ij}$, where $\varepsilon_{ij} \sim N(0, \sigma^2)$. We then censor the viral load data at the detection limit of 25, which produces 23 percent of below detection limit data; these are recorded as 25, as observed in a real experiment.
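The data-generating and censoring steps can be sketched as follows. To keep the example short and free of an ODE solver, `toy_log10_viral_load` is a hypothetical biphasic decay that stands in for the numerical solution of (1); the visit times, the baseline of 10^5 copies, and the two-parameter θ are likewise assumptions made only for illustration.

```python
import math
import random

def toy_log10_viral_load(theta, t, v0=1e5):
    """Hypothetical biphasic decay standing in for g(theta, t); NOT system (1)."""
    c, delta = math.exp(theta[0]), math.exp(theta[1])
    return math.log10(v0 * (0.9 * math.exp(-c * t) + 0.1 * math.exp(-delta * t)))

def simulate_subject(mu, d2, sigma2, times, dl, rng):
    """Draw theta_i ~ N(mu, D), add measurement error, then censor at dl."""
    theta = [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, d2)]
    y_true = [toy_log10_viral_load(theta, t) + rng.gauss(0.0, math.sqrt(sigma2))
              for t in times]
    censored = [y < dl for y in y_true]
    # Censored values are recorded at the detection limit, as in the text.
    y_obs = [dl if flag else y for y, flag in zip(y_true, censored)]
    return theta, y_obs, censored

rng = random.Random(1)
dl = math.log10(25.0)
times = [0, 2, 4, 7, 10, 14, 21, 28]
subjects = [simulate_subject([1.1, -1.0], [0.25, 0.04], 0.1, times, dl, rng)
            for _ in range(40)]
n_censored = sum(sum(flags) for _, _, flags in subjects)
```

Keeping the `censored` flags next to the recorded values is what allows the same data set to be analyzed in different ways later: dropping flagged points, treating the recorded detection limit as the true value, or treating flagged points as left-censored.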

We examine the effect on the parameter estimates of taking censored data into account. First, the data below the detection limit are completely held out before fitting the model (‘case A’); second, the below detection limit data are retained, with 25 assumed to be the true value (‘case B’); third, the below detection limit data are treated as censored data (‘case C’). For each case, we fit the nonlinear mixed-effects ODE model coupled with the SAEM algorithm. In cases A and B, the data are fitted as they are, i.e., the data are considered complete, while the censoring mechanism is accounted for in case C. In this simulation study, $K_1$ is set to 10,000 to make sure the estimates converge to the neighborhood of the MLE; a decreasing step size γ is then used for another 2,000 steps to fine-tune the final estimates. The same data-generating and model-fitting processes are repeated 100 times. The performance of the proposed method in the three cases is measured with respect to the bias and the square root of the MSE (RMSE).
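The two performance measures can be computed from the replicated estimates as sketched below. The function names and the toy numbers are hypothetical; expressing the bias as a percentage of the true value is an assumption about how the percentages quoted in the results are defined.

```python
import math

def bias_and_rmse(estimates, truth):
    """Bias and root mean squared error of replicated estimates."""
    n = len(estimates)
    bias = sum(estimates) / n - truth
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    return bias, rmse

def percent_bias(estimates, truth):
    """Bias relative to the magnitude of the true value, in percent."""
    bias, _ = bias_and_rmse(estimates, truth)
    return 100.0 * bias / abs(truth)

bias, rmse = bias_and_rmse([1.0, 3.0], truth=2.0)
rel = percent_bias([0.3, 0.5], truth=0.25)
```

In the first call the two estimates are centered on the truth, so the bias is 0 while the RMSE is 1, which shows why both measures are reported: the RMSE also reflects the variability of the estimates.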

The simulation results are presented in Table 1. In all three cases, the biases for the inter-subject variance components are considerably larger than those for the fixed effects. The biases for $d_c^2$ exceed 45% in all three cases. The biases for the variance estimates are especially large (almost 500% for ‘case A’ and ‘case B’; 83% for ‘case C’). For the fixed effects, case A is superior to case B in terms of both bias and RMSE. Among the three cases, case C has the smallest bias and RMSE for all seven parameters and thus performs the best. In general, imputing censored viral load with the detection limit (25) leads to larger bias for both the fixed effects and the random effects variance estimates. Our simulation results show that simply omitting the censored observations (case A) is even better than retaining the censored values (case B) in terms of estimation accuracy. Handling below detection limit data with the proposed method gives the smallest bias in the current setting.

We also compared the fitting results with those of the Bayesian method and the Gaussian quadrature method. We observed that for cases A and B, all three methods provide similar estimates (Table 2). However, the Bayesian method does not handle below detection limit data, so it cannot deal with case C. In addition, we tried to fit case C with the Gaussian quadrature method. Unfortunately, this method did not converge in our case.

To test the robustness of the parameter estimates with regard to initial values, we carry out a sensitivity analysis. Three sets of initial values are examined. One sets the mean vector $\mu = (\log c, \log\delta, \log\phi)^T$ at the true values $(1.1, -1, 3)^T$; the other two set μ at the two extremes $(3, -0.5, 5)^T$ and $(0.2, -0.2, 0.6)^T$. In the latter two settings, the variance components $d^2 = (d_c^2, d_\delta^2, d_\phi^2)^T$ are also increased to $(5, 5, 5)^T$ and $(10, 10, 10)^T$, respectively. We then estimate the parameters using the same method as above: the first 10,000 steps are followed by an additional 2,000 steps. Figure 1 depicts the graphical convergence diagnostics of the SAEM algorithm for the three settings. As shown clearly, after the first 10,000 steps all three chains converge to the neighborhood of the true values, and the 2,000 additional steps make the estimates more precise. The different initial values give similar parameter estimates, so we conclude that the parameter estimates obtained with the proposed method are insensitive to the choice of initial values.

## 5 A real AIDS clinical study

In this section, we apply the proposed method to a real AIDS clinical trial study. This study was a Phase I/II, randomized, open-label, 24-week comparative study of the PK, tolerability, and antiretroviral effects of two regimens of indinavir (IDV) and ritonavir (RTV), plus two nucleoside analogue reverse transcriptase inhibitors (NRTIs), in HIV-1-infected subjects failing protease inhibitor (PI)-containing antiretroviral therapies. The 44 subjects were randomly assigned to the two treatment regimens, Arm A (IDV 800 mg q12h + RTV 200 mg q12h) and Arm B (IDV 400 mg q12h + RTV 400 mg q12h). Study visits occurred at pre-entry, entry (within 14 days of pre-entry), and days 7, 14, 28, 56, 84, 112, 140, and 168 of follow-up. Plasma HIV RNA testing was conducted at each study visit. The detection limit is 25 cp/ml. Clinical assessments and laboratory measurements were performed at all visit weeks with the exception of week 1. Phenotypic determination (IC50) of antiretroviral drug resistance was performed at baseline and at the time of virological failure. The PK parameters of IDV and RTV were determined using noncompartmental methods. Pill counts were used to monitor adherence at each study visit from weeks 2 to 24. Of the 44 subjects, 42 were included in this analysis; of the remaining two subjects, one was excluded because the PK parameters were not obtained and the other because the phenotype assay could not be completed. Acosta et al. (2004) give a more detailed description of this study.

The aim of the analysis here is to investigate the effect of missing data on the parameter estimation of the HIV dynamic model (1). In total, 14 percent (45 out of 328) of the observations in this real study are below the detection limit. We compare the parameter estimates based on the full data, with and without taking the censoring mechanism into account. To avoid the parameter identifiability problem, we fix c = 3.0 and N = 980; see Stafford et al. (2000). There are five fixed effects and six variance parameters (five from the random effects and one from the measurement error). We tried different initial values, which all gave similar results. Table 3 presents the parameter estimates and corresponding standard deviations for the 11 parameters.

The biological and clinical interpretation of the estimation results is quite interesting. Compared with the estimates that account for the censoring mechanism, the model that does not take censoring into account overestimates the death rate of infected cells ($T^{*}$) (logδ = -1.613 vs -1.801), underestimates the death rate of uninfected cells ($T$) (logρ = -3.798 vs -3.563), underestimates the infection rate of T cells by virus (logk = -11.87 vs -9.397), overestimates the conversion factor (logϕ = 1.395 vs 1.02), and overestimates the production rate of T cells from the body source (logλ = 5.68 vs 4.876). The comparison above consistently indicates that the degree of infection is underestimated by the model that ignores censoring, if we define a high degree of infection as a low death rate of infected cells ($T^{*}$), a high death rate of uninfected cells ($T$), a high infection rate of T cells by virus, and a low production rate of T cells from the body source. In other words, the degree of infection is actually more severe when censored data are properly handled. These results are logical and meaningful. Moreover, as suggested by (3), the conversion factor ϕ is negatively associated with drug efficacy. When censoring is considered, the drug efficacy is higher, as indicated by the lower ϕ. Equivalently, the model that ignores censoring is more likely to underestimate drug efficacy.

In addition, the estimates from the model that incorporates the censoring mechanism are more precise, in terms of smaller standard errors, except for logλ. The inter-subject variance components are smaller under the model with the censoring mechanism, especially for k and ϕ. These results indicate that most of the parameter estimates do not differ much among patients. In contrast, the T cell production rate has a larger variation ($d_\lambda^2$) among patients when censoring is taken into account. These results tell us that the dynamic parameters do not vary much from one subject to another, but each subject does present a different health condition as indexed by the self-production rate of T cells (λ). Again, these estimates are more precise than those from the model without censoring, as reflected by smaller standard errors. The variance of the measurement error ($\sigma^2$) is not comparable, since in the model with censoring the censored data are replaced by viral loads simulated from the model.

We also compared the fits of the two models, with and without taking censoring into account, for each patient. Individual parameter estimates are taken from the posterior mode of the Gibbs samples. The model fits the data very well. Figure 2 depicts the model fitting results for four patients. Clearly, the model describes not only consistently decreasing viral load trajectories (subjects 13 and 42) but also rebounds (subjects 1 and 12), which are characteristic of long-term treatment failure and a challenge to short-term models, as explained in Section 1. When no below detection limit data are present, as for subject 12, the two models behave similarly. Even when censored data occur, the two models fit the data above the detection limit in approximately the same way. However, without considering censoring, the model does not provide any information below the detection limit. After accounting for the censoring mechanism, we have a better idea of what is going on beneath the detection limit. For example, in subject 1, viral load rebound has already occurred before it is detectable (solid line below the detection limit). In subject 13, viral load keeps decreasing until it falls below the detection limit.

Based on this analysis, we conclude that careful handling of data below the detection limit is crucial for obtaining more accurate estimates and a correct interpretation of the results.

## 6 Discussion

HIV dynamic models offer an opportunity to study the mechanism of HIV infection and the development of treatments from a perspective other than laboratory experiments. However, research in this area is sparse, mainly owing to the absence of advanced statistical methods. The issue is further complicated by the lack of a closed-form solution of the model, so that the numerical solution always incurs errors. In contrast to the other methods introduced in Section 1, the SAEM algorithm is efficient, especially in the setting of nonlinear mixed-effects ODE models. Numerical evidence shows that the algorithm converges quickly, usually within a few minutes, whereas other algorithms, such as the Bayesian method, seem to be computationally expensive and slow to converge. Note that the simulation step of SAEM employs an MCMC procedure similar to that of the Bayesian method, such as Gibbs sampling and the M-H step. Nevertheless, the MCMC algorithm converges much faster when coupled with SAEM than when used for traditional Bayesian posterior sampling.
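To make the three SAEM steps (simulation, stochastic approximation, maximization) concrete, the following is a minimal Python sketch on a deliberately simplified toy problem: estimating the mean of a normal sample with known standard deviation when some values are left-censored at a detection limit. It is not the nonlinear mixed-effects ODE model of this paper. There are no random effects and no ODE solver, and the censored values are drawn exactly from a truncated normal rather than by MCMC. All function names, the step-size schedule, and the numerical settings are assumptions made for illustration.

```python
import random
from statistics import NormalDist


def draw_below_limit(mu, sigma, limit, rng):
    """Draw from N(mu, sigma^2) truncated to (-inf, limit] by inverting the CDF."""
    dist = NormalDist(mu, sigma)
    u = rng.random() * dist.cdf(limit)
    # inv_cdf requires 0 < u < 1, so clip away from the endpoints.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return min(dist.inv_cdf(u), limit)


def saem_censored_mean(observed, n_censored, limit, sigma=1.0,
                       n_iter=300, burn_in=100, seed=0):
    """Toy SAEM for the mean of a normal sample with left-censored values."""
    rng = random.Random(seed)
    n = len(observed) + n_censored
    sum_observed = sum(observed)
    mu = sum_observed / len(observed)   # naive start: mean of uncensored data
    s = mu * n                          # running approximation of sum(y)
    for k in range(1, n_iter + 1):
        # Simulation step: impute censored values given the current mu.
        imputed = sum(draw_below_limit(mu, sigma, limit, rng)
                      for _ in range(n_censored))
        # Stochastic approximation step: step size 1 during burn-in,
        # then decreasing as 1 / (k - burn_in).
        gamma = 1.0 if k <= burn_in else 1.0 / (k - burn_in)
        s += gamma * (sum_observed + imputed - s)
        # Maximization step: complete-data MLE of the mean.
        mu = s / n
    return mu


# Hypothetical data: true mean 0, detection limit -0.5 (arbitrary units).
true_mu, limit = 0.0, -0.5
data_rng = random.Random(1)
sample = [data_rng.gauss(true_mu, 1.0) for _ in range(500)]
observed = [y for y in sample if y > limit]
n_censored = len(sample) - len(observed)

naive = sum(observed) / len(observed)   # ignores the censored values entirely
estimate = saem_censored_mean(observed, n_censored, limit)
print(f"naive mean = {naive:.3f}, SAEM estimate = {estimate:.3f}")
```

In this toy setting the mean that discards censored values is biased upward, while the SAEM estimate moves back toward the value used to generate the data. The decreasing step size after the burn-in phase is what averages out the simulation noise.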

One disadvantage of the proposed method is that not all parameters can be identified, owing to the limitations of the data at hand. CD4 T cell counts could be included in the model so that the constraints on some parameters can be relaxed. However, measurements of CD4 T cells are not reliable, so using these data may induce significant bias. ODE model identifiability is an interesting topic but is beyond the scope of this article. References on the identifiability of nonlinear ODE models, including HIV dynamic models, include Jeffrey and Xia (2005), Wu et al. (2008), and Xia and Moog (2003).

In spite of this limitation, we compared the models with and without handling of data below the detection limit and found that ignoring these data can seriously distort the interpretation of the results. The biological and clinical interpretation of the results that account for the censoring mechanism is encouraging. We hope that the results of this study will stimulate more research in this area and contribute to identifying the biological mechanism of HIV infection as well as to developing clinical treatments.

## References

1. Acosta EP, Wu H, Hammer SM, Yu S, Kuritzkes DR, Walawander A, Eron JJ, Fichtenbaum CJ, Pettinelli C, Neath D, Ferguson E, Saah AJ, Gerber JG: Comparison of two indinavir/ritonavir regimens in the treatment of HIV-infected individuals. J. Acquir. Immune Defic. Syndr 9: 1358–1366. 2004

2. Besch CL: Compliance in clinical trials. AIDS 9: 1–10. 1995

3. Breslaw JA: Random sampling from a truncated multivariate normal distribution. Appl. Math. Lett 7: 1–6. 1994

4. Davidian M, Giltinan DM: Nonlinear Models for Repeated Measurement Data. Chapman & Hall, London; 1995

5. Delyon B, Lavielle M, Moulines E: Convergence of a stochastic approximation version of the EM algorithm. Ann. Stat 27: 94–128. 1999

6. Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. B 39: 1–38. 1977

7. Ding AA, Wu H: A comparison study of models and fitting procedures for biphasic viral decay rates in viral dynamic models. Biometrics 56: 16–23. 2000

8. Gelfand AE, Hills SE, Racine-Poon A, Smith AFM: Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Am. Stat. Assoc 85: 972–985. 1990

9. Guedj J, Thiebaut R, Commenges D: Maximum likelihood estimation in dynamical models of HIV. Biometrics 63: 1198–1206. 2007

10. Huang Y, Getachew D: Simultaneous Bayesian inference for longitudinal data with asymmetry, left-censoring and covariates measured with errors. J. Japanese Stat. Soc 42: 1–22. 2012

11. Huang Y, Liu D, Wu H: Hierarchical Bayesian methods for estimation of parameters in a longitudinal HIV dynamic system. Biometrics 62: 413–423. 2006

12. Hughes J: Mixed-effects models with censored data with applications to HIV RNA levels. Biometrics 55: 625–629. 1999

13. Jeffrey AM, Xia X: Identifiability of HIV/AIDS model. In Deterministic and Stochastic Models of AIDS Epidemics and HIV Infections with Intervention. Edited by: Tan WY, Wu H. Singapore: World Scientific; 2005

14. Kaps P, Rentrop P: Generalized Runge-Kutta methods of order four with stepsize control for stiff ordinary differential equations. Numer. Math 33: 55–68. 1979

15. Kuhn E, Lavielle M: Coupling a stochastic approximation version of EM with a MCMC procedure. ESAIM: P. S 8: 115–131. 2005

16. Molla A, Korneyeva M, Gao Q, Vasavanonda S, Schipper P, Mo H, Markowitz M, Chernyavskiy T, Niu P, Lyons N, Hsu A, Granneman G, Ho DD, Boucher C, Leonard J, Norbeck D, Kempf D: Ordered accumulation of mutations in HIV protease confers resistance to ritonavir. Nat. Med 2: 760–766. 1996

17. Nowak MA, May RM: Virus Dynamics: Mathematical Principles of Immunology and Virology. Oxford University Press, Oxford; 2000

18. Perelson AS, Nelson PW: Mathematical analysis of HIV-1 dynamics in vivo. SIAM Rev 41(1): 3–44. 1999

19. Perelson AS, Neumann AU, Markowitz M, Leonard JM, Ho DD: HIV-1 dynamics in vivo: virion clearance rate, infected cell life-span, and viral generation time. Science 271: 1582–1586. 1996

20. Perelson AS, Essunger P, Cao Y, Vesanen M, Hurley A, Saksela K, Markowitz M, Ho DD: Decay characteristics of HIV-1-infected compartments during combination therapy. Nature 387: 188–191. 1997

21. Stafford MA, Corey L, Cao Y, Daar ES, Ho DD, Perelson AS: Modeling plasma virus concentration during primary HIV infection. J. Theor. Biol 203: 285–301. 2000

22. Wakefield JC: The Bayesian analysis of population pharmacokinetic models. J. Am. Stat. Assoc 91: 62–75. 1996

23. Wu L: A joint model for nonlinear mixed-effects models with censoring and covariates measured with error, with application to AIDS studies. J. Am. Stat. Assoc 97: 955–964. 2002

24. Wu L: Exact and approximate inferences for nonlinear mixed-effects models with missing covariates. J. Am. Stat. Assoc 99: 700–709. 2004

25. Wu H, Ding AA: Population HIV-1 dynamics in vivo: applicable models and inferential tools for virological data from AIDS clinical trials. Biometrics 55: 410–418. 1999

26. Wu H, Ding AA, de Gruttola V: Estimation of HIV dynamic parameters. Stat. Med 17: 2463–2485. 1998

27. Wu H, Huang Y, Acosta EP, Rosenkranz S, Kuritzkes D, Eron JJ, Perelson AS, Gerber JG: Modeling long-term HIV dynamics and antiretroviral response: effects of drug potency, pharmacokinetics, adherence and drug resistance. J. Acquir. Immune Defic. Syndr 39: 272–283. 2005

28. Wu H, Zhu H, Miao H, Perelson AS: Parameter identifiability and estimation of HIV/AIDS dynamic models. Bull. Math. Biol 70: 785–799. 2008

29. Xia X, Moog CH: Identifiability of nonlinear systems with applications to HIV/AIDS models. IEEE Trans. Automat. Contr 48: 330–336. 2003

## Acknowledgments

We would like to thank the editor and two anonymous referees whose valuable suggestions led to major improvements in the overall clarity and presentation of this paper.

## Author information

### Corresponding author

Correspondence to Tao Lu.

### Competing interests

The authors declare that they have no competing interests.

### Authors’ contributions

TL initiated and carried out the study. TL drafted the manuscript. MW participated in the discussion and proofread the article. Both authors read and approved the final manuscript.
