Particle swarm based algorithms for finding locally and Bayesian D-optimal designs
Journal of Statistical Distributions and Applications volume 6, Article number: 3 (2019)
Abstract
When a model-based approach is appropriate, an optimal design can guide how to collect data judiciously for making reliable inference at minimal cost. However, finding optimal designs for a statistical model with several possibly interacting factors can be both theoretically and computationally challenging, and this issue is rarely discussed in the literature. We propose nature-inspired metaheuristic algorithms, like particle swarm optimization (PSO) and its variants, to solve such optimization problems. We demonstrate that such techniques, which are easy to implement, can find different types of optimal designs for models with several factors efficiently. To facilitate use of such algorithms, we provide computer codes to generate tailor-made optimal designs and evaluate efficiencies of competing designs. As applications, we apply PSO and find Bayesian optimal designs for Exponential models useful in HIV studies and redesign a car-refuelling study for a Logistic model with ten factors and some interacting factors.
Introduction
Statistical models are getting increasingly complex to capture the finer features of a problem. Models incorporate more factors as data becomes high-dimensional and heterogeneous. When the model assumptions are tenable, it is important to develop and implement efficient design strategies to realize the most reliable statistical inference at minimal cost.
In the optimal design literature, we typically assume that the statistical model is fully parametrized, known and defined on a user-selected design space, apart from the unknown parameters in the model. An optimal design is found under a given criterion and the optimization is usually over all designs in the design space. Frequently, the goal is to estimate one or more parameters, the response surface or a couple of meaningful functions of the model parameters. Unless the model is relatively simple, closed-form formulae for the optimal designs rarely exist. Sometimes, additional assumptions are imposed to derive the optimal designs analytically and some of the assumptions may be unrealistic. Further, the bulk of the work in the optimal design literature assumes models have a couple of explanatory factors only and when there are several of them, they are usually assumed to be additive to simplify the derivation. A practical and useful way to handle design problems with many interacting explanatory factors is to develop efficient computational tools that find various types of optimal designs for a broad class of models under realistic assumptions.
We propose a state-of-the-art class of algorithms called nature-inspired metaheuristic algorithms for solving high-dimensional design problems. We call them high-dimensional design problems because there are many variables to optimize. Our experience is that traditional design algorithms tend to have problems finding optimal designs when there are several variables to optimize. They are likely to stall at a local optimum or break down because of the huge computational burden. Several researchers have reported similar experiences with traditional algorithms for finding optimal designs. An early example is Chaloner and Larntz (1989), who found both the Wynn (1972) and Fedorov (1972) algorithms very slow when they tried to find A- and c-Bayesian optimal designs for the two-parameter logistic model. They then used the general optimization algorithm proposed by Nelder and Mead (1965) and found it to be adequate. Similarly, Broudiscou et al. (1996) claimed that traditional algorithms, such as Fedorov-Wynn types of algorithms or exchange algorithms for finding optimal designs, cannot be used to find non-standard designs, such as asymmetrical D-optimal designs. They found that the algorithms performed poorly and were difficult to handle, and so they used genetic algorithms instead. Likewise, Royle (2002) reported that the traditional exchange algorithms are not practical for finding large spatial designs when the criterion is computationally expensive to evaluate or the discretized design space is large. These may be reasons why the bulk of the optimal experimental designs reported in the literature concern a small number of factors.
Nature-inspired metaheuristic algorithms, such as particle swarm optimization (PSO) or one of its enhanced versions, such as competitive swarm optimizer (CSO), are more likely to solve optimization problems with a large number of variables to optimize. These algorithms are general purpose optimization tools and by construction, do not require any assumptions on the optimization problem. For example, these algorithms can solve optimization problems when the objective function is non-differentiable or even when the criterion cannot be written down explicitly. This article briefly describes PSO and its variant CSO, and demonstrates their capability for finding optimal designs for a broad class of models with multiple factors, including Bayesian optimal designs.
The next section reviews the optimal design setup and “Particle swarm optimization based algorithms for generating optimal designs” section presents the particle swarm optimization algorithm. In “Websites for finding optimal designs” section, we present websites where codes for finding optimal designs are available and demonstrate with a simple example. In “Optimal designs for high dimensional models” section, we apply CSO to find high-dimensional D-optimal designs for Logistic and Poisson models. In “Bayesian optimal designs for biomedical studies” section, we apply PSO to find Bayesian D-optimal designs for Exponential models useful in HIV studies. We then conclude with a discussion on future work and a cautionary remark on use of optimal designs in practice.
Background
The statistical model of interest has the form
\[ y = f(x,\theta) + \varepsilon, \quad x \in X, \]
where f(x,θ) is the mean response of the univariate response y and is assumed to be known, apart from the m×1 vector θ of model parameters. There are p possibly interacting factors in the model and so X is p-dimensional. Given a design criterion and a predetermined number n of independent observations to take for the study, the design questions are the optimal number (k) of design points to take from X, the optimal locations x_{1},…,x_{k} in X at which to observe the responses, and the optimal proportion (w_{i}) of observations to take at x_{i}, i=1,…,k. This results in an approximate design, which is implemented by rounding each nw_{i} to the nearest positive integer \(\left [nw_{i}^{*}\right ]\), subject to the constraint that \(\left [nw_{1}^{*}\right ]+\left [nw_{2}^{*}\right ]+\dots +\left [nw_{k}^{*}\right ]=n\). There is a theoretical framework for finding optimal approximate designs, including established algorithms for finding many of them and for evaluating the proximity of a design to the optimum design even when the latter is unknown; for details on calculating the efficiency lower bound, see Kiefer et al. (1985).
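The rounding step can be illustrated with a minimal Python sketch. The function name `round_design` and the largest-deficit adjustment rule are illustrative assumptions, not the procedure used in the paper; other rounding schemes exist. The sketch keeps at least one observation at every support point and forces the counts to sum to n.

```python
import math

def round_design(weights, n):
    """Turn approximate-design weights into integer replicate counts summing to n.

    Hypothetical helper for illustration: every support point keeps at least
    one observation, and counts are nudged toward their targets n * w_i.
    """
    k = len(weights)
    if n < k:
        raise ValueError("need at least one observation per support point")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Start from the floor of n * w_i, but never below one observation.
    counts = [max(1, math.floor(n * w)) for w in weights]
    # Add observations where the shortfall relative to n * w_i is largest.
    while sum(counts) < n:
        i = max(range(k), key=lambda j: n * weights[j] - counts[j])
        counts[i] += 1
    # Remove observations where the excess is largest, never going below one.
    while sum(counts) > n:
        candidates = [j for j in range(k) if counts[j] > 1]
        i = max(candidates, key=lambda j: counts[j] - n * weights[j])
        counts[i] -= 1
    return counts

counts = round_design([0.5, 0.25, 0.25], 7)
print(counts)
```

The exact design actually run then takes `counts[i]` observations at the i-th support point; its efficiency relative to the approximate design can be checked with the criterion of interest.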
Following convention, the worth of a design ξ is gauged by its Fisher information matrix. For nonlinear models, the information matrix depends on the unknown values of the parameters θ and we denote this matrix by M(ξ,θ). The design criterion is then expressed as a function of this matrix and as a first step, we typically replace the unknown θ in the matrix by its nominal value, θ_{0}. The resulting optimal design is called a locally optimal design since it depends on θ_{0}, which may come from a pilot study or from previous similar studies (Chernoff 1972).
The D-optimal design for estimating all model parameters using θ_{0} as nominal values is the design ξ_{D} that minimizes the negative of the log-determinant of the information matrix:
\[ \xi_{D}=\arg\min_{\xi\in\Xi}\left\{-\log\left|M(\xi,\theta_{0})\right|\right\}, \]
where Ξ is the set of all designs on X. The smaller the criterion value is, the better the design is. When prior information on the model parameter θ is available in the form of a density π(θ), a Bayesian D-optimal design ξ_{BayesD} minimizes the same criterion after averaging out the model parameters with respect to the prior density. It is defined by
\[ \xi_{BayesD}=\arg\min_{\xi\in\Xi}\int -\log\left|M(\xi,\theta)\right|\,\pi(\theta)\,d\theta, \tag{2} \]
and as before, the smaller the Bayesian D-optimality criterion value is, the better the design is. Clearly, when the prior density is degenerate, the resulting design becomes locally D-optimal. Both criteria are appropriate for estimating model parameters.
When the design criterion is convex in Ξ, as in the above two cases, an equivalence theorem is available to verify optimality of a design over all designs on X. Such a theorem comes from directional derivative considerations of a convex functional and is discussed in design monographs like (Fedorov 1972; Berger and Wong 2005). For example, if m is the dimension of θ and δ_{x} is the design that takes all observations at x, it can be shown that the design ξ_{D} is locally D-optimal if and only if
\[ \mathrm{tr}\left\{M(\xi_{D},\theta_{0})^{-1}M(\delta_{x},\theta_{0})\right\}-m\le 0 \quad \text{for all } x\in X, \]
with equality at the support points of ξ_{D}. The function on the left is sometimes called the sensitivity function. Different convex design criteria lead to different sensitivity functions but they all have a similar form. In practice, the optimality of a design is verified by plotting the sensitivity function against the design space (equivalence plot) and checking whether the equivalence theorem is satisfied. If it is not, the design is not D-optimal. Similarly, ξ_{BayesD} is Bayesian D-optimal with respect to the prior density π(θ) if and only if
\[ \int \mathrm{tr}\left\{M(\xi_{BayesD},\theta)^{-1}M(\delta_{x},\theta)\right\}\pi(\theta)\,d\theta-m\le 0 \quad \text{for all } x\in X, \]
with equality at the support points of ξ_{BayesD}.
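To make the equivalence-theorem check concrete, the following Python sketch evaluates the sensitivity function on a grid for a textbook case that is not taken from the paper: quadratic regression on [−1,1] with constant variance, whose D-optimal design is known to put equal weight on −1, 0 and 1. For this linear model M(δ_{x}) = f(x)f(x)^{T}, so the trace term reduces to f(x)^{T}M^{-1}f(x). The helper names (`f`, `info_matrix`, `inverse`, `sensitivity`) are assumptions made for illustration.

```python
def f(x):
    """Regression vector of the quadratic model (an assumed toy model, m = 3)."""
    return [1.0, x, x * x]

def info_matrix(points, weights):
    """M(xi) = sum_i w_i f(x_i) f(x_i)^T for a homoscedastic linear model."""
    m = len(f(points[0]))
    M = [[0.0] * m for _ in range(m)]
    for x, w in zip(points, weights):
        fx = f(x)
        for i in range(m):
            for j in range(m):
                M[i][j] += w * fx[i] * fx[j]
    return M

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("information matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sensitivity(x, Minv):
    """tr{M^{-1} M(delta_x)} - m, which equals f(x)^T M^{-1} f(x) - m here."""
    fx = f(x)
    m = len(fx)
    return sum(fx[i] * Minv[i][j] * fx[j] for i in range(m) for j in range(m)) - m

points, weights = [-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]
Minv = inverse(info_matrix(points, weights))
grid = [-1 + 0.01 * i for i in range(201)]
worst = max(sensitivity(x, Minv) for x in grid)
print("largest sensitivity value on the grid:", worst)
```

If the design is optimal, the largest value on the grid is zero up to rounding error and is attained at the support points; a clearly positive value would indicate that the candidate design is not D-optimal.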
We compare designs using relative efficiency, which is commonly defined as the ratio of the two criteria values, or a function thereof. When one of the designs being compared is the optimal design, relative efficiency becomes the design efficiency. Specifically, for D-optimality, we compare two designs ξ_{1} and ξ_{2} with nominal values θ_{0} via the m^{th} root of the ratio of the determinants of their information matrices:
\[ \left(\frac{\left|M(\xi_{1},\theta_{0})\right|}{\left|M(\xi_{2},\theta_{0})\right|}\right)^{1/m}. \]
The relative efficiency ratio compares performance of the two designs for estimating the model parameters. If the above ratio is 0.5, or 50% efficiency, this means that the design ξ_{1} needs twice as many observations for it to do as well as the design ξ_{2}. When ξ_{2} is the D-optimal design, the above ratio is simply the D-efficiency of the design ξ_{1}.
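A small worked example may help fix the interpretation of the ratio. The Python sketch below uses an assumed toy setting that is not from the paper: the straight-line model f(x) = (1, x) on [−1,1] with constant variance (m = 2), comparing an equally weighted design on ±0.5 with the equally weighted design on ±1, which is D-optimal for this model. Function names are illustrative.

```python
def info_matrix(points, weights):
    """Information matrix for the straight-line model f(x) = (1, x)."""
    M = [[0.0, 0.0], [0.0, 0.0]]
    for x, w in zip(points, weights):
        fx = (1.0, x)
        for i in range(2):
            for j in range(2):
                M[i][j] += w * fx[i] * fx[j]
    return M

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in a]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return d

def d_efficiency(M1, M2):
    """m-th root of the determinant ratio of two information matrices."""
    m = len(M1)
    return (det(M1) / det(M2)) ** (1.0 / m)

M1 = info_matrix([-0.5, 0.5], [0.5, 0.5])   # competing design
M2 = info_matrix([-1.0, 1.0], [0.5, 0.5])   # D-optimal for this toy model
eff = d_efficiency(M1, M2)
print("relative D-efficiency:", eff)
print("sample-size multiplier for design 1:", 1 / eff)
```

Here the determinants are 0.25 and 1, so the ratio is 0.5 and the first design would need twice as many observations to match the second.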
The next section describes a nature-inspired metaheuristic algorithm and one of its variants for finding D-optimal designs for the Poisson regression models, Bayesian D-optimal designs for Exponential models and D-optimal designs for high-dimensional Logistic and Poisson models.
Particle swarm optimization based algorithms for generating optimal designs
Metaheuristic algorithms are increasingly common, and a key appeal of such algorithms is that they require few or no assumptions to work well. Metaheuristic algorithms usually involve some randomization and local searches. In particular, repeated runs go through slightly different processes but frequently end up with similar results (Yang 2010).
We focus on particle swarm optimization (PSO), which is a member of the class of metaheuristic algorithms. It is now widely and routinely used in the engineering field. PSO was first proposed by Eberhart and Kennedy (1995). Motivated by swarm intelligence, they modeled candidate solutions as birds in a swarm looking for food on the ground. The swarm collectively acts and communicates: each bird updates where it believes the food is (personal best) and flies toward it and in the direction of the group best, which is where the flock believes the food lies after sharing information within the flock. The objective function value of each bird is re-evaluated at each iteration as it flies over time in search of a quality solution. Many have reported that PSO can significantly outperform genetic algorithms (GA) in terms of the number of function evaluations required (Hassan et al. 2005).
To initiate the PSO algorithm, the user first selects a swarm size where each particle in the swarm represents a randomly generated candidate design. Below is the pseudocode for PSO (Kennedy 2006):
Begin
  Initialize particle position and velocity
  While maximum iterations or minimum error criterion is not attained
  Do
    For each particle
      Evaluate objective function
      Update particle personal best
    End
    Set particle with the best objective function value as the group best
    For each particle
      Update particle velocity: \(v^{t+1}_{id}=wv^{t}_{id}+c_{1}\psi _{1}\left (p^{t}_{id}-x^{t}_{id}\right)+c_{2}\psi _{2}\left (p^{t}_{gd}-x^{t}_{id}\right)\)
      Update particle position: \(x^{t+1}_{id}=x^{t}_{id}+v^{t+1}_{id}\)
    End
End
In the pseudocode, w is the inertia weight, \(v^{t}_{id}\) and \(x^{t}_{id}\) are the velocity and position of the d^{th} dimension of the i^{th} particle at iteration step t, c_{1} and c_{2} are weight constants, ψ_{1} and ψ_{2} are random values from the uniform [0,1] distribution, p_{id} is the personal best of the d^{th} dimension of particle i (the best position particle i ever visited), and p_{gd} is the group best of the d^{th} dimension of the swarm (the best position the group ever visited). Each particle represents a candidate design with k support points, where k (≥m) is user-selected. For a model with a single factor, the dimension of each particle is therefore 2k−1 because we need to optimize the locations of the k support points and their corresponding weights; the dimension is one smaller than 2k since the weights sum to unity. More generally, with p factors the dimension is kp+k−1.
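The pseudocode translates fairly directly into a short Python sketch. This is not the authors' MATLAB code: the function name `pso`, the toy objective, the bound handling and the velocity clamp are assumptions added for illustration, while the defaults c_{1}=c_{2}=2 and an inertia weight decreasing linearly from 0.9 to 0.4 follow the settings quoted in the text. In a design problem the objective would be the design criterion and each particle would hold the support points and weights.

```python
import random

def pso(objective, bounds, n_particles=30, n_iter=300,
        c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, seed=0):
    """Minimize `objective` over a box; returns (best position, best value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    span = [hi - lo for lo, hi in bounds]
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]
    pbest_val = [float("inf")] * n_particles
    gbest, gbest_val = x[0][:], float("inf")
    for t in range(n_iter):
        # Evaluate every particle and update its personal best.
        for i in range(n_particles):
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
        # The best personal best becomes the group best.
        g = min(range(n_particles), key=pbest_val.__getitem__)
        gbest, gbest_val = pbest[g][:], pbest_val[g]
        # Linearly decreasing inertia weight.
        w = w_start - (w_start - w_end) * t / max(1, n_iter - 1)
        for i in range(n_particles):
            for d in range(dim):
                psi1, psi2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * psi1 * (pbest[i][d] - x[i][d])
                           + c2 * psi2 * (gbest[d] - x[i][d]))
                # Velocity and position clamps: additions of this sketch.
                v[i][d] = max(-span[d], min(span[d], v[i][d]))
                lo, hi = bounds[d]
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
    return gbest, gbest_val

def toy_objective(x):
    """Assumed test function with its minimum of 0 at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

search_box = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_val = pso(toy_objective, search_box)
print(best, best_val)
```

Because the algorithm is stochastic, different seeds give slightly different answers; in practice one would run it several times and keep the best result.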
For solving high-dimensional optimization problems, it is helpful to use a variant of PSO because research shows that PSO can be prone to premature convergence (Yang and Pedersen 1997). This means that particles can quickly converge to a local optimum without enough space exploration. This phenomenon decreases the quality of the solution provided by PSO. Many strategies have been proposed to alleviate such premature convergence, and one successful PSO variant that shows great potential for solving complex optimization problems is the competitive swarm optimizer (CSO) algorithm (Cheng and Jin 2015). The researchers found that PSO premature convergence has a strong connection with the particle personal best and swarm group best, which seem to have too much influence on the convergence of each particle. They proposed CSO by removing those two “black holes” (namely, personal best and group best) and recasting the updating formulas. Further, CSO pairs up particles and lets the “loser” (the particle with the inferior objective function value) learn from the “winner” (the particle with the superior objective function value). The main change compared to PSO is that the updating mechanism for the “loser” particle velocity becomes
\[ v^{t+1}_{id}=\psi_{1}v^{t}_{id}+\psi_{2}\left(x^{t}_{jd}-x^{t}_{id}\right)+\phi\psi_{3}\left(\bar{x}^{t}_{d}-x^{t}_{id}\right), \]
where \(v^{t}_{id}\) and \(x^{t}_{id}\) are the velocity and position of the d^{th} dimension of the i^{th} particle at iteration step t, \(x^{t}_{jd}\) is the position of the d^{th} dimension of the paired j^{th} particle at iteration step t, ϕ is a tuning parameter, ψ_{1}, ψ_{2}, ψ_{3} are random values from the uniform [0,1] distribution, and \(\bar {x}_{d}^{t}\) is the d^{th} dimension of the swarm center at iteration t. CSO makes the swarm more diverse without much additional computational cost and has been shown to be less likely to be trapped in a local optimum (Cheng and Jin 2015).
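The pairwise competition can be sketched in Python as follows. This is a minimal illustration written from the description above, not the implementation of Cheng and Jin (2015) or of the authors; the function name `cso`, the toy objective and the clamping of positions to the search box are assumptions. Only the loser of each random pair is moved, so the best particle in the swarm is never overwritten.

```python
import random

def cso(objective, bounds, n_particles=40, n_iter=300, phi=0.05, seed=0):
    """Minimize `objective` over a box with a competitive swarm optimizer sketch."""
    rng = random.Random(seed)
    dim = len(bounds)
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    fit = [objective(xi) for xi in x]
    for _ in range(n_iter):
        # Swarm center at the start of the iteration.
        center = [sum(xi[d] for xi in x) / n_particles for d in range(dim)]
        order = list(range(n_particles))
        rng.shuffle(order)
        # Pair particles at random; with an odd swarm the last one sits out.
        for a, b in zip(order[0::2], order[1::2]):
            win, lose = (a, b) if fit[a] <= fit[b] else (b, a)
            for d in range(dim):
                psi1, psi2, psi3 = rng.random(), rng.random(), rng.random()
                v[lose][d] = (psi1 * v[lose][d]
                              + psi2 * (x[win][d] - x[lose][d])
                              + phi * psi3 * (center[d] - x[lose][d]))
                lo, hi = bounds[d]
                # Clamping to the box is an addition of this sketch.
                x[lose][d] = max(lo, min(hi, x[lose][d] + v[lose][d]))
            fit[lose] = objective(x[lose])
    best = min(range(n_particles), key=fit.__getitem__)
    return x[best][:], fit[best]

def toy_objective(x):
    """Assumed test function with its minimum of 0 at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

search_box = [(-5.0, 5.0), (-5.0, 5.0)]
best, best_val = cso(toy_objective, search_box)
print(best, best_val)
```

Note that no personal-best or group-best memory is stored; the only tuning parameters are the swarm size and ϕ.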
We now show how PSO can find different types of optimal designs effectively for different types of generalized linear models. The examples are meant to be illustrative with some details for those new to the area; others may use our codes directly from the following websites that contain codes for finding optimal designs for more complicated situations.
Websites for finding optimal designs
There are websites with MATLAB PSO codes that we have written for finding various optimal designs for commonly used models. They include http://wkwong.bol.ucla.edu/podpack/index.html, http://www.math.ntu.edu.tw/~optdesign/ and http://www.stat.ncku.edu.tw/optdesign/. Each code is for a specific design problem for a particular model. The user inputs the required information for their design problem and the PSO code searches iteratively for the optimum.
The available codes on our website find optimal designs for different models under different criteria. Models include commonly used linear, Michaelis-Menten, mixture polynomial, logistic, compartmental, Hill’s, double-exponential, exponential, Poisson, etc. Criteria include D, D_{s}, A, G, E, minimax, etc.
The aim in many toxicity studies is to ascertain the joint toxicity effects from several toxicants on the number of organisms or cells that survive when we apply different dose combinations of the toxicants. There is limited work to address design issues for such studies and when they are available, they usually have one or two explanatory variables in the model (Russell et al. 2009; Wang et al. 2006; Qiu 2014). To fix ideas, we use a Poisson model with two toxicants to illustrate how our sites facilitate search for a locally D-optimal design for a Poisson regression model. The website has codes that are able to find designs with more interacting toxicants.
Let y_{i} be the observed number of organisms or cells that survive when we apply the i^{th} dose combination of the two interacting toxicants x_{i}=(x_{i1},x_{i2})^{T}. Let the mean response of the Poisson regression model be λ_{i}, which is the same as its variance Var(y_{i}|x_{i}). Further, we assume the mean structure in our statistical model is
\[ \log\lambda_{i}=\theta^{T}f(x_{i})=\theta_{0}+\theta_{1}x_{i1}+\theta_{2}x_{i2}+\theta_{12}x_{i1}x_{i2}, \]
where θ^{T}=(θ_{0},θ_{1},θ_{2},θ_{12}) and f^{T}(x_{i})=(1,x_{i1},x_{i2},x_{i1}x_{i2}). The Fisher information for a design with k support points ξ=(x_{1},…,x_{k}; w_{1},…,w_{k}) with Poisson rate λ_{i}, i=1,2,…,k has the form
\[ M(\xi,\theta)=F^{T}WF, \]
where F=(f(x_{1}),…,f(x_{k}))^{T} and W=diag(w_{1}λ_{1},…,w_{k}λ_{k}). We consider a synergism effect in this example, and set the nominal values for θ_{0}, θ_{1}, θ_{2} and the interaction term θ_{12} to be −0.1, −0.1, −0.1, and −0.01. The values are nonpositive because we expect the effect of each toxicant is such that fewer cells will survive when the dose of the toxicant is increased. We also set the nominal value of θ_{12} to be smaller in magnitude than the additive effects, which is usually the case in practice (Wang et al. 2006). Another restriction on the feasible design region representing doses or concentrations of the toxicants is that their values are nonnegative.
Our environment for this demonstration is an x86_64-win64 (64-bit) 3.6 GHz computer with 16 GB RAM and an Intel i7-4790 CPU running Windows 10 Enterprise. The version of MATLAB we used is R2015b. We first download the codes from “Part H1: Locally D-optimal design for Poisson regression model with M = 2” from the website. Upon typing “run” in the command window, a Graphical User Interface (GUI) pops up as shown in Fig. 1, whereupon the user inputs the nominal values of the parameters, the anticipated number of support points for the optimal design and the parameters for the PSO algorithm. The user can change the tuning parameters in PSO or change the nominal values for the parameters. We assume there are 4 support points for the optimal design (minimally supported). For our demonstration, we set the PSO tuning parameters as c_{1}=c_{2}=2 and w linearly decreasing from 0.9 to 0.4 (Shi and Eberhart 1999), and use 100 particles and 1000 iterations. When the “Run!” button is clicked, the program runs and the search begins. The design found by the algorithm is displayed in the command window, as shown in Fig. 2, with the dose levels of toxicant 1, toxicant 2 and their corresponding weights. We observe that the four support points are equally weighted at (0.3, 1), (0.368, 0.368), (1, 0.3), and (1, 1). Additionally, the output displays the D-efficiency lower bound and the criterion value of the log-determinant of the Fisher information matrix. The generated design has a D-efficiency lower bound of 1, confirming that the design is locally D-optimal. The subwindow of Fig. 1 shows the sensitivity function plot of the PSO-generated design and visually also confirms optimality of the generated design among all designs. In this experiment we observe that the criterion values become stable after about 100 iterations. This is typical of PSO, which tends to get to the proximity of the optimum quickly and afterwards exploits the locality to determine the optimum.
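The quantities behind this output can be sketched in a few lines of Python. The sketch builds M = F^{T}WF for the two-toxicant Poisson model and evaluates the sensitivity function λ(x)f(x)^{T}M^{-1}f(x) − m at the four reported support points. It is not the website's MATLAB code: the helper names are hypothetical, and the nominal values are taken to be negative on the assumption that the text's statement that they are nonpositive is correct. The sketch only checks the necessary condition of equality at the support points; confirming optimality also requires the inequality over the whole design region, whose bounds are not stated in the text.

```python
import math

# Nominal values (theta_0, theta_1, theta_2, theta_12); the negative signs are
# an assumption based on the statement that the values are nonpositive.
THETA = (-0.1, -0.1, -0.1, -0.01)

def f(x):
    """Regression vector (1, x1, x2, x1*x2) for one dose combination."""
    x1, x2 = x
    return [1.0, x1, x2, x1 * x2]

def rate(x, theta):
    """Poisson mean under the log-linear mean structure."""
    return math.exp(sum(t * fi for t, fi in zip(theta, f(x))))

def info_matrix(points, weights, theta):
    """M = F^T W F with W = diag(w_i * lambda_i)."""
    m = len(theta)
    M = [[0.0] * m for _ in range(m)]
    for x, w in zip(points, weights):
        fx, lam = f(x), rate(x, theta)
        for i in range(m):
            for j in range(m):
                M[i][j] += w * lam * fx[i] * fx[j]
    return M

def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("information matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sensitivity(x, Minv, theta):
    """lambda(x) * f(x)^T M^{-1} f(x) - m for the Poisson model."""
    fx, m = f(x), len(theta)
    quad = sum(fx[i] * Minv[i][j] * fx[j] for i in range(m) for j in range(m))
    return rate(x, theta) * quad - m

points = [(0.3, 1.0), (0.368, 0.368), (1.0, 0.3), (1.0, 1.0)]
weights = [0.25] * 4
Minv = inverse(info_matrix(points, weights, THETA))
at_support = [sensitivity(x, Minv, THETA) for x in points]
print(at_support)
```

For any equally weighted design with as many support points as parameters, the sensitivity function vanishes at the support points, so values near zero here are expected; the informative part of the check is the maximum over the rest of the design region.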
Optimal designs for high dimensional models
In practice, models are likely to have several explanatory factors. This is because a few explanatory factors may not capture the complex structure of the full data adequately. This section shows that CSO, a variant of PSO, can tackle high-dimensional optimal design problems for the Logistic and Poisson models.
Locally D-optimal designs for 5-factor Logistic and Poisson regression models
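Our models are generalized linear models and to fix ideas, consider the more popular Logistic model and a Poisson model, each with five explanatory factors and all pairwise interactions. The Logistic model is given by
\[ \log\frac{P(y=1\mid x)}{1-P(y=1\mid x)}=\theta_{0}+\sum_{i=1}^{5}\theta_{i}x_{i}+\sum_{1\le i<j\le 5}\theta_{ij}x_{i}x_{j}, \tag{8} \]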
where the outcome y is Bernoulli-distributed, and each factor x_{i} resides in the design space [−1,1]. For the Poisson model, its mean response is λ_{i}, which is the same as its variance Var(y_{i}|x_{i}). In terms of the explanatory factors, the mean structure is
\[ \log\lambda_{i}=\theta_{0}+\sum_{j=1}^{5}\theta_{j}x_{ij}+\sum_{1\le j<l\le 5}\theta_{jl}x_{ij}x_{il}. \tag{9} \]
We expect the locally D-optimal design for each of the above models to have k≥16 design points because there are 16 parameters in both models. This means that we have k−1 weights and k≥16 design points, each with five coordinates, to determine, implying the total number of variables we need to optimize in this problem is at least 16×5+15=95. If k=25, for instance, this number becomes 149 and so the problem becomes high-dimensional rapidly. In the event that the D-optimal design has k=16=m support points, we have a minimally-supported design.
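The counting argument is simple enough to state as a one-line function. The Python sketch below, with the hypothetical name `n_design_variables`, counts k support points with p coordinates each plus k−1 free weights.

```python
def n_design_variables(k, p):
    """Number of variables to optimize: k points with p coordinates, k - 1 free weights."""
    if k < 1 or p < 1:
        raise ValueError("k and p must be positive")
    return k * p + (k - 1)

for k in (16, 25):
    print(k, "support points, 5 factors:", n_design_variables(k, 5))
```

For the five-factor models this gives 95 variables when k=16 and 149 when k=25, which shows how quickly the search space grows with the number of support points.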
As always, the choice of the tuning parameters in an evolutionary algorithm deserves attention. For the hard high-dimensional problems in this paper, we used 200 particles. The values for the other parameters we used were suggested by Cheng and Jin (2015); in particular, we set ϕ=0.05 in CSO. We stop the algorithm when the change in the values of the objective function from successive iterations is smaller than 10^{−5}.
We implemented PSO and Genetic Algorithm (GA) and compared their performance with CSO for searching D-optimal designs for high-dimensional models. The choices for the tuning parameters in PSO were w=0.9 and c_{1}=c_{2}=2 (Shi and Eberhart 1998). The tuning parameter EliteCount in the Genetic Algorithm was 0.05, which is recommended by the official MATLAB implementation of the code. The swarm size of PSO and GA was also 200. Since evolutionary algorithms are stochastic and produce slightly different results for each run, we ran each algorithm five times for each model and averaged the outputs.
For simulation purposes, parameters for Logistic models were generated randomly from uniform [−1, 1] and parameters for Poisson models were generated randomly from uniform [−3, 3]. These nominal values are listed in Table 1. The design space is [−1,1]^{5}.
Table 2 displays the D-optimality criterion values, averaged over five runs, obtained by the three algorithms for the four models 8-1, 8-2, 9-1 and 9-2, along with their standard deviations in parentheses. Among the three algorithms, CSO consistently has better and more stable performance for searching D-optimal designs for the four simulated models. The equivalence theorem can be used to verify optimality of the CSO-generated designs but the high-dimensional sensitivity function plots are not necessarily easy to construct and interpret visually. A more practical way is to determine the maximum of the sensitivity function of the generated design across the design space and compute its efficiency lower bound. For our examples, the average efficiency lower bounds of the generated designs for the four models are 93%, 96%, 99% and 99%, suggesting that the generated designs are highly D-efficient. We also observe that the average runtime for each algorithm, shown at the bottom of the table, confirms that CSO not only produces the best quality solutions but also does so most efficiently.
As an example, Table 3 displays a design found by CSO for model 9-2 that is at least 99% D-efficient. The first five columns show the support points and the last column shows the weight associated with each design point.
We verify the optimality of this and other designs by checking the values of the sensitivity function over a user-selected discretized grid set in the design space. Clearly for models with several explanatory factors, the finer the grid set, the longer it takes to check optimality. It is helpful to start with some initial tests on a rough grid to determine whether there are violations of the equivalence theorem; if there are, this suggests the design is not optimal and another should be generated. For this particular example, after initial testing, we discretized the search space with a total of (2/0.2+1)^{5}=161051 grid points for this 5-factor model, i.e. a step size of 0.2 for each of the factor spaces. We then plot the multi-dimensional sensitivity function over the grid set, which is now much harder to visualize and interpret than in the case when there is only one explanatory factor. One option is to stretch the high-dimensional grid into a one-dimensional vector on the x-axis and plot the sensitivity function values along the x-axis.
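The grid construction and the stretching of the five-dimensional grid into a one-dimensional index can be sketched in Python with `itertools.product`. The function names `make_grid` and `scan` are assumptions, and `placeholder_sensitivity` is only a stand-in so that the sketch runs; in an actual check it would be replaced by the sensitivity function of the candidate design.

```python
import itertools

def make_grid(lower, upper, step, n_factors):
    """Lazily enumerate a regular grid on [lower, upper]^n_factors."""
    n_steps = round((upper - lower) / step)
    axis = [lower + i * step for i in range(n_steps + 1)]
    # The last factor varies fastest, which fixes the one-dimensional ordering.
    return itertools.product(axis, repeat=n_factors)

def scan(sensitivity, grid):
    """Evaluate the function along the flattened grid; return values and argmax."""
    values = [sensitivity(x) for x in grid]
    worst_index = max(range(len(values)), key=values.__getitem__)
    return values, worst_index

def placeholder_sensitivity(x):
    """Stand-in function (not a real sensitivity function), peaking near 0.2."""
    return -sum((xi - 0.2) ** 2 for xi in x)

values, worst_index = scan(placeholder_sensitivity, make_grid(-1.0, 1.0, 0.2, 5))
print(len(values), worst_index, values[worst_index])
```

Plotting `values` against `range(len(values))` gives the one-dimensional view described above, and `values[worst_index]` is the quantity compared against zero when checking the equivalence theorem on the grid.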
Figure 3 is an example of such a plot; it shows the graph of the sensitivity function of the design in Table 3. The plot shows many points where the sensitivity function appears to be zero. This can be explained from several perspectives: (1) the design is very close to, but not exactly, the true optimal design; (2) the plotted points were chosen systematically from the high-dimensional space, so one theoretical design point might be spatially close to more than one point in the grid; (3) although the values at some points seem to reach 0, magnifying the graph may reveal that they are not exactly 0.
A real-world application on a car refueling experiment
In this section, we describe a real-world application which tries to find a high-dimensional optimal design for a 10-factor Logistic model. Grimshaw et al. (2001) described an experiment for testing a vision-based car refueling system to determine whether a computer-controlled nozzle was able to insert itself into the gas pipe correctly or not. The experiment has four binary explanatory factors taking values −1 or 1: ring type (x_{1}, white paper or reflective), lighting (x_{2}, room lighting or 2 flood lights and room lights), sharpening (x_{3}, without or with), smoothing (x_{4}, without or with); also included are six continuous factors: lighting angle (x_{5}, 50 to 90 degrees), gas-cap angle 1 (x_{6}, 30 to 55 degrees), gas-cap angle 2 (x_{7}, 0 to 10 degrees), can distance (x_{8}, 18 to 48 inches), reflective ring thickness (x_{9}, 0.125 to 0.425 inches) and threshold step value (x_{10}, 5 to 15). Lukemire et al. (2019) employed a variant of PSO called quantum PSO to search for a locally D-optimal design for the 10-factor additive Logistic model and reported an average runtime of 140 seconds. Here we include some two- and three-way interaction terms. All these terms are summarized in Table 4. The model contains 10 factors and 16 parameters in total. Here, we started the search with a swarm in which each particle is a design with 20 support points. Finding the optimal design for this problem is a high-dimensional problem: each candidate solution has (10+1) × 20=220 dimensions, as each design point has 10 factors and one corresponding weight, and there are 20 support points to be optimized.
To find the locally D-optimal design, a set of parameter values is proposed: θ^{T} = (3.0, 0.5, 0.75, 1.25, 0.8, 0.5, 0.8, 0.4, 1.00, 2.65, 0.65, 1.1, 0.2, 0.9, 0.36, 1.07). We used 200 CSO particles to search the design space and ran the simulation 10 times in order to obtain the averaged runtime. We first tested the algorithm on the additive model used in Lukemire et al. (2019), and CSO took 24 seconds on average to find the optimal design, which is shown in Table 5; the optimal criterion value is 35.95. This is considerably shorter than the 140 seconds reported by Lukemire et al. (2019), suggesting that CSO can find the optimal design efficiently. When testing CSO on the model with interaction terms, we found a design with an efficiency lower bound of 99% in about 957 seconds. This efficiency was calculated on 1882384 grid points sampled uniformly from the design space. Table 6 displays the 17-point locally D-optimal design. Its weight distribution on these points is in the last column and the criterion value for this design is 7.2562.
Bayesian optimal designs for biomedical studies
Bayesian optimal designs incorporate prior knowledge of the model parameters at the design stage. The prior knowledge usually comes in the form of a probability density function for the parameters and is averaged out by numerical integration before an optimization scheme is applied to find the optimal design. Because the integration and optimization spaces can be very different objects, each with varying magnitude, finding a Bayesian optimal design in a high-dimensional problem can be very challenging. Here, we show that PSO is a promising tool for finding Bayesian D-optimal designs for Exponential models which are commonly used in pharmacokinetic/pharmacodynamic studies.
In HIV studies, Exponential models are frequently used to characterize viral load changes with time after administration of a potent inhibitor of HIV-1 protease in vivo (Perelson et al. 1996). Derived from a series of ordinary differential equations that describes the virus change in different compartments, the model is a good representation of longitudinal HIV dynamics. The important parameters in such a model include virus clearance rate and infected cell life span (Wu and Ding 1999). Some analytic locally optimal designs and Bayesian optimal designs are available for the Exponential models (Han and Chaloner 2003; Dette and Neugebauer 1997).
This section describes how to use prior information to study drug effect and understand the longitudinal viral dynamics. A well thought out design to draw plasma samples from patients to measure the HIV-1 RNA copies is essential for estimating the model parameters accurately. Han and Chaloner (2003) provide some simple but useful models:
and
Here Y_{j} is the viral load at time t_{j} and the sampling times are t_{j}∈[t_{min},t_{max}]⊆[0,60]. Both t_{min} and t_{max} are preselected and refer, respectively, to the minimum and maximum time where observations can be taken. The errors ε_{j} are i.i.d. N(0,σ^{2}) and the model parameters we wish to estimate are P_{0}, P_{1} and δ (all > 0). Following Han and Chaloner (2003), the prior densities for P_{0} and P_{1} are both uniform [0.5, 1.5], and that for δ is uniform [0.9, 1.1]. Additionally we use a different specification for model 12 with the usual P_{0} and P_{1} but with δ ∼ uniform [0, 0.2]. We call the former specification model 12-1 and the latter model 12-2 and compare the properties of Bayesian optimal designs under the different specifications. The priors can be quite flexible and they do not have to be independent. The above three models are simplified Exponential regression models with two to three parameters, and one can easily extend them to models with more parameters based on the disease stages. Specifically, model 12 describes the trajectory of plasma HIV RNA level under antiviral treatment (Wu and Ding 1999); models 10 and 11 are special cases when P_{0} is treated as a nuisance parameter.
Similarly, we set the PSO tuning parameters as c_{1}=c_{2}=2, w=0.9 (Shi and Eberhart 1998) and use 40 particles and 1000 iterations. We use 1000 Monte Carlo samples to compute the numerical integral in the objective function (Eq. 2). Other numerical integration techniques such as Gaussian quadrature or sparse grid can be used to confirm the accuracy of the numerical integration. Table 7 provides the PSO-found designs for different models.
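The Monte Carlo approximation of the Bayesian criterion can be sketched in Python as an average of −log|M(ξ,θ)| over draws from the prior. To keep the sketch self-contained it uses a hypothetical one-parameter decay model E(y)=exp(−δt), which is a stand-in and not one of the models above; only the prior δ ∼ uniform [0.9, 1.1] and the 1000 draws follow the text. With a single parameter the information matrix is a scalar, so no determinant routine is needed.

```python
import math
import random

def info(times, weights, delta):
    """Scalar information for the stand-in model E(y) = exp(-delta * t).

    The derivative of the mean with respect to delta is -t * exp(-delta * t).
    """
    return sum(w * (t * math.exp(-delta * t)) ** 2 for t, w in zip(times, weights))

def bayes_d_criterion(times, weights, prior_draws):
    """Monte Carlo average of -log|M(xi, delta)| over draws from the prior."""
    total = sum(-math.log(info(times, weights, d)) for d in prior_draws)
    return total / len(prior_draws)

rng = random.Random(1)
draws = [rng.uniform(0.9, 1.1) for _ in range(1000)]  # prior for delta

crit_early = bayes_d_criterion([1.0], [1.0], draws)   # all observations at t = 1
crit_late = bayes_d_criterion([3.0], [1.0], draws)    # all observations at t = 3
print(crit_early, crit_late)
```

Reusing the same prior draws for every candidate design, as done here, keeps the comparison between designs free of extra Monte Carlo noise; an optimizer such as PSO would then minimize `bayes_d_criterion` over the sampling times and weights.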
The sensitivity function plots of the four PSO-generated designs are shown in Figs. 4 and 5, and they all confirm that the designs are Bayesian optimal. The plots suggest that the design remains optimal if t=14 is used instead of t_{max}. We note that changing the prior specification changes the optimal design considerably. For example, when we compare model 122 with model 121, both the locations and the number of the support points change. The optimal design for model 122 requires 4 support points, compared with 3 support points for model 121. Moreover, the support points of the optimal design for model 122 are more dispersed and include middle and end time points, whereas those for model 121 are concentrated at earlier times. As shown in Fig. 6, the differences between the designs might be attributed to the fact that a larger δ value in Eq. 12 flattens out the response quickly, so that design points at earlier times are critical for learning about the exponential decay rate. In contrast, with a small δ value the response curve decays smoothly, and later time points are still on the decreasing part of the curve, so more information can be obtained by spreading the design points out.
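The optimality check behind such plots can be sketched numerically. For a Bayesian D-optimality criterion of the form E[log det M(ξ,θ)], the sensitivity function is d(t,ξ) = E[f(t,θ)^T M(ξ,θ)^{-1} f(t,θ)] − p, where f is the gradient of the mean function, p is the number of parameters, and a design is optimal when d(t,ξ) ≤ 0 over the design space with equality at the support points. The Python sketch below estimates d by Monte Carlo. The mean function P_{1}·exp(−δt) is a hypothetical stand-in chosen for illustration, not Eq. 10, 11 or 12, and the design used in the demo is arbitrary rather than one of the designs in Table 7; only the uniform priors are taken from the text.

```python
import numpy as np

def grad(t, p1, delta):
    """Gradient of the stand-in mean p1*exp(-delta*t) w.r.t. (p1, delta)."""
    e = np.exp(-delta * t)
    return np.array([e, -p1 * t * e])

def info_matrix(points, weights, p1, delta):
    """Normalized information matrix M = sum_i w_i f(t_i) f(t_i)^T."""
    F = np.array([grad(t, p1, delta) for t in points])  # shape (n, 2)
    return F.T @ (np.asarray(weights)[:, None] * F)

def sensitivity(t, points, weights, draws):
    """Monte Carlo estimate of E[f' M^{-1} f] - p at time t (p = 2 here)."""
    vals = []
    for p1, delta in draws:
        M = info_matrix(points, weights, p1, delta)
        f = grad(t, p1, delta)
        vals.append(f @ np.linalg.solve(M, f))
    return float(np.mean(vals)) - 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 1000 prior draws: P1 ~ U[0.5, 1.5], delta ~ U[0.9, 1.1]
    draws = np.column_stack([rng.uniform(0.5, 1.5, 1000),
                             rng.uniform(0.9, 1.1, 1000)])
    points = np.array([0.0, 1.0])   # arbitrary two-point design
    weights = np.array([0.5, 0.5])
    for t in np.linspace(0.0, 5.0, 6):
        print(t, sensitivity(t, points, weights, draws))
```

A useful sanity check on any implementation is that the weighted sum of d(t_{i},ξ) over the support points is zero for every design, because the weighted sum of f^T M^{-1} f equals the trace of the identity matrix, which is p.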
Discussion
Constructing efficient designs is critical for making the best use of the data and obtaining reliable estimates. One potential issue in constructing high-dimensional or Bayesian designs is that the computational time increases as the model becomes larger and the number of design points increases. Even for a simple linear additive model with 20 factors and 21 design points, the dimension of the optimal design problem is 20×21+20=440. To solve this constrained optimization problem, PSO has to optimize 440 variables, and consequently the computational cost is high. Fortunately, parallel computing techniques can be applied to accelerate the computations. Hung and Wang (2012) proposed a GPU-accelerated PSO (GPSO) algorithm that uses a thread pool model and implemented it on a graphics processing unit (GPU). The authors demonstrated that GPSO can significantly reduce the computational burden with satisfactory parallel efficiency. Likewise, Chen et al. (2013) proposed a discrete PSO approach, named LaPSO, to search for an optimal Latin hypercube design. The authors accelerated LaPSO using a GPU and showed that the GPU implementation can save considerable computational time for large optimization problems. We expect that the programs in this article can also be accelerated on parallel hardware such as GPUs. Parallelization will likely produce computing tools with faster response times and a better user experience.
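One way to read the count of 440 variables, assuming that each of the n support points contributes one coordinate per factor and that only n−1 of the weights are free because the weights sum to one, is:

```latex
\[
\underbrace{k \times n}_{\text{support-point coordinates}}
\;+\;
\underbrace{(n-1)}_{\text{free weights}}
\;=\; 20 \times 21 + 20 \;=\; 440,
\qquad k = 20,\; n = 21 .
\]
```

Under this reading, the number of variables grows roughly quadratically in k when the number of support points is close to the number of model parameters.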
Several challenging problems remain in our work. We do not claim that the types of algorithms proposed here can find all types of optimal designs in a regression setup. We point out a few of these problems:

we have experience with metaheuristic algorithms that work well for some nominal values of a model but not for others; similarly, the same algorithm may not work well when the design space is changed, especially when it is enlarged. These are likely scaling problems that our current work is trying to understand and address.

confirming the optimality of a generated design remains a challenge because it is difficult to appreciate interesting features in a high-dimensional plot. An alternative is to find its efficiency lower bound, which amounts to solving another high-dimensional optimization problem to find the maximal value of the sensitivity function. This means that either the same metaheuristic algorithm has to be applied a second time, or another, more appropriate metaheuristic algorithm has to be found to solve this second optimization problem.

as the number of explanatory factors in the model increases, so does the number of variables we need to optimize. For example, in the 10-factor car-refuelling example, CSO fails to find a locally D-optimal design when the model includes all two-factor interaction terms and some three-factor interaction terms. Currently, the best design found by CSO seems to have a D-efficiency of about 82%. Clearly, further enhancements of the metaheuristic algorithms may be needed. Hybridization that combines one or two more algorithms with CSO may also improve performance.
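For reference, the D-efficiency and the efficiency lower bound mentioned in the items above can be written as follows for the locally D-optimal case. The notation is an assumption made here for illustration: M(ξ) is the information matrix of a design ξ, ξ* is the D-optimal design, p is the number of model parameters, f(x) is the regression (gradient) vector, and 𝒳 is the design space. The bound follows from the concavity of log det M(ξ).

```latex
\[
\operatorname{eff}_D(\xi)
  = \left\{ \frac{\det M(\xi)}{\det M(\xi^{*})} \right\}^{1/p},
\qquad
d(x,\xi) = f(x)^{\top} M(\xi)^{-1} f(x) - p ,
\]
\[
\operatorname{eff}_D(\xi) \;\ge\; \exp\!\left( -\frac{\theta}{p} \right),
\qquad
\theta = \max_{x \in \mathcal{X}} d(x,\xi) \;\ge\; 0 .
\]
```

The bound requires only the maximum of the sensitivity function over the design space, not the optimal design itself, which is why computing it amounts to a second optimization problem. When θ = 0 the bound equals 1 and the design is optimal.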
Summary
Our main contributions in this paper are 1) the use of PSO and its variants to find optimal designs for high-dimensional and Bayesian models; and 2) the creation of online tools for practitioners to generate different types of tailor-made optimal designs for their problems. Web-based tools can be very valuable in helping practitioners make informed decisions on the study design. For instance, a successful website is the one housed in Houston at https://biostatistics.mdanderson.org/SoftwareDownload/, where an array of software is available for download to find many types of adaptive Bayesian designs for Phase I and II trials. It has recorded more than 17,000 downloads to date, indicating high demand for such tools in practice. Some of our PSO codes are stand-alone MATLAB programs that users can download and modify, when necessary, for their own problems. Our website allows practitioners to compare designs and arrive at an informed decision on the choice of the design to implement.
We have applied PSO and its variants to search for optimal designs for Bayesian and high-dimensional models. These are challenging tasks because they involve scaling problems and multiple integration over different types of parameter spaces. While such algorithms do not perform integration per se, they can be hybridized with effective integration tools to find these hard-to-find Bayesian optimal designs. Our current work includes hybridizing PSO or its variants with a sparse grid algorithm, and the results are promising.
We close with a cautionary note that an optimal design should not be used religiously but should serve as a guide or benchmark. This is because the optimal design is found under a fixed set of assumptions that may not adequately reflect reality and so may not satisfy the needs of the practitioners. Different optimal designs under various settings should be compared carefully before a design is implemented. The guiding principle is that the implemented design should not stray too far from the optimum, as measured by its efficiency relative to the optimum. PSO facilitates the search for an efficient design, the calculation of an efficiency lower bound and the comparison of competing designs. Our hope is that practitioners will become better informed about such algorithms and that their availability on websites will help practitioners implement more efficient designs.
References
Berger, M. P., Wong, W. K.: Applied Optimal Designs. John Wiley & Sons, England (2005).
Broudiscou, A., Leardi, R., Phan-Tan-Luu, R.: Genetic algorithm as a tool for selection of D-optimal design. Chemometr. Intell. Lab. Syst. 35, 105–116 (1996).
Chaloner, K., Larntz, K.: Optimal Bayesian design applied to logistic regression experiments. J. Stat. Plan. Infer. 21, 191–208 (1989).
Chen, R. B., Hsieh, D. N., Hung, Y., Wang, W.: Optimizing Latin hypercube designs by particle swarm. Stat. Comput. 23(5), 663–676 (2013).
Cheng, R., Jin, Y.: A competitive swarm optimizer for large scale optimization. IEEE Trans. Cybern. 45(2), 191–204 (2015).
Chernoff, H.: Sequential Analysis and Optimal Design. SIAM, Philadelphia (1972).
Dette, H., Neugebauer, H. M.: Bayesian D-optimal designs for exponential regression models. J. Stat. Plan. Infer. 60(2), 331–349 (1997).
Eberhart, R., Kennedy, J.: A new optimizer using particle swarm theory. In: MHS’95. Proceedings of the Sixth International Symposium on Micro Machine and Human Science, pp. 39–43. IEEE, Nagoya (1995).
Fedorov, V.V.: Theory of Optimal Experiments. Elsevier, New York (1972).
Grimshaw, S. D., Collings, B. J., Larsen, W. A., Hurt, C. R.: Eliciting factor importance in a designed experiment. Technometrics. 43(2), 133–146 (2001).
Han, C., Chaloner, K.: D- and c-optimal designs for exponential regression models used in viral dynamics and other applications. J. Stat. Plan. Infer. 115(2), 585–601 (2003).
Hassan, R., Cohanim, B., De Weck, O., Venter, G.: A comparison of particle swarm optimization and the genetic algorithm. In: 46th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference, p. 1897 (2005).
Hung, Y., Wang, W.: Accelerating parallel particle swarm optimization via GPU. Optim. Methods Softw. 27(1), 33–51 (2012).
Kennedy, J.: Handbook of nature-inspired and innovative computing. Swarm intelligence. Springer, Boston (2006).
Kiefer, J. C., Brown, L., Olkin, I., Sacks, J.: Jack Carl Kiefer Collected Papers: Design of Experiments. Springer, New York (1985).
Lukemire, J., Mandal, A., Wong, W. K.: d-QPSO: A quantum-behaved particle swarm technique for finding D-optimal designs with discrete and continuous factors and a binary response. Technometrics. 61(1), 77–87 (2019).
Nelder, J. A., Mead, R.: A simplex method for function minimization. Comput. J. 7, 308–313 (1965).
Perelson, A. S., Neumann, A. U., Markowitz, M., Leonard, J. M., Ho, D. D.: HIV-1 dynamics in vivo: virion clearance rate, infected cell lifespan, and viral generation time. Science. 271(5255), 1582 (1996).
Qiu, J.: Finding optimal experimental designs for models in biomedical studies via particle swarm optimization. PhD thesis, UCLA (2014). https://escholarship.org/uc/item/1cj4b854.
Royle, J. A.: Exchange algorithms for constructing large spatial designs. J. Stat. Plan. Infer. 100, 121–134 (2002).
Russell, K. G., Woods, D. C., Lewis, S., Eccleston, J.: D-optimal designs for Poisson regression models. Stat. Sin. 19, 721–730 (2009).
Shi, Y., Eberhart, R. C.: Parameter selection in particle swarm optimization. In: International Conference on Evolutionary Programming, pp. 591–600. Springer, Berlin (1998).
Shi, Y., Eberhart, R. C.: Empirical study of particle swarm optimization. In: Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406) Vol. 3, pp. 1945–1950. IEEE, Washington, DC (1999).
Wang, Y., Myers, R. H., Smith, E. P., Ye, K.: D-optimal designs for Poisson regression models. J. Stat. Plan. Infer. 136(8), 2831–2845 (2006).
Wu, H., Ding, A. A.: Population HIV-1 dynamics in vivo: Applicable models and inferential tools for virological data from AIDS clinical trials. Biometrics. 55(2), 410–418 (1999).
Wynn, H. P.: Results in the theory and construction of D-optimum experimental design. J. R. Stat. Soc. Ser. B. 34, 133–147 (1972).
Yang, X. S.: Nature-inspired Metaheuristic Algorithms. Luniver Press, United Kingdom (2010).
Yang, Y., Pedersen, J. O.: A comparative study on feature selection in text categorization. In: Proceedings of the Fourteenth International Conference on Machine Learning, pp. 412–420. Morgan Kaufmann Publishers Inc., San Francisco (1997).
Acknowledgement
The authors would like to thank Dr. Ray-Bing Chen and Dr. Weichung Wang for their support in maintaining the website.
Funding
The research in this publication was partially supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM107639. The funding did not influence the design of the study, the collection, analysis, and interpretation of data, or the writing of the manuscript.
Availability of data and materials
Not applicable.
Author information
Contributions
YS wrote the manuscript and provided the PSO-generated designs, ZZ applied CSO and generated optimal designs for high-dimensional models, and WKW supervised the work and edited the whole manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Yu Shi.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Keywords
 Bayesian design
 Design efficiency
 Generalized linear model
 Metaheuristic algorithms