# Meta analysis of binary data with excessive zeros in two-arm trials

## Abstract

We present a novel Bayesian approach to random effects meta analysis of binary data with excessive zeros in two-arm trials. We discuss the development of the likelihood accounting for excessive zeros, the prior, and the posterior distributions of the parameters of interest. A Dirichlet process prior is used to account for the heterogeneity among studies. A zero inflated binomial model with excessive zero parameters is used to account for excessive zeros in the treatment and control arms. We then define a modified unconditional odds ratio accounting for excessive zeros in the two arms. Bayesian inference is carried out using Markov chain Monte Carlo (MCMC) sampling techniques. We illustrate the approach using data available in the published literature on myocardial infarction and death from cardiovascular causes. Unlike commonly used frequentist approaches, the Bayesian approaches presented here use all the data, including the studies with zero events, capture heterogeneity among study effects, and produce interpretable estimates of the overall and study-level odds ratios. Results from the data analysis and the model selection also indicate that the proposed Bayesian method, while accounting for zero events, adjusts for excessive zeros and provides a better fit to the data, so that the estimates of the overall and study-level odds ratios are based on the totality of the information.

## Introduction

An arm is a standard term in clinical trials and represents a treatment group or a set of subjects. A two-arm study compares a drug with a placebo, or drug A with drug B. In some of these studies the outcome is binary, that is, each unit can take only two possible states, “0” and “1”. For example, outcomes in morbidity and mortality studies are often binary in nature.

The natural distribution for modeling these types of binary data is the binomial distribution given by

$$\begin{array}{@{}rcl@{}} f(y; p) = {{n}\choose{y}} p^{y}(1-p)^{n-y} \ \ \text{for} \ \ y=0,1, \dots,n, \ p \in (0, 1). \end{array}$$

The mean and variance for the binomial random variable are E(Y)=np and Var(Y)=np(1−p) respectively. In a two-arm trial with binary outcomes, it is typically assumed that $$Y_{T_{1}},...,Y_{T_{k}}$$ and $$Y_{C_{1}},...,Y_{C_{k}}$$ are random samples from $$Y_{T_{i}} \sim Bin\left (n_{T_{i}},P_{T_{i}}\right)$$ and $$Y_{C_{i}} \sim Bin\left (n_{C_{i}},P_{C_{i}}\right)$$ respectively, where k is the number of studies. In a random effects meta analysis of these types of data, the effect size is assumed to vary from study to study. Random effects meta analysis assumes that study effects are a random sample from an underlying relevant distribution of effects, and the combined effect estimates the mean effect of this distribution.

Recent literature offers a variety of approaches to analyzing these types of data. See Albert (1995) for various parametrizations of binomial models for discrete data within Bayesian settings. Chang et al. (2001) use a mixed effects model to investigate between and within-study variation using rate difference and logit models. Gamalo et al. (2011) propose a Bayesian procedure for testing noninferiority in two-arm studies with a binary primary endpoint that allows the incorporation of historical data on an active control via informative priors, but they do not consider excessive zeros. Carlin (1992) considers a Bayesian meta-analysis approach for two-way contingency table data, while Smith et al. (1995) discuss how a full Bayesian analysis can be used to deal with issues in meta-analysis in a natural way using the BUGS language. In this paper, we consider a Bayesian approach for binary data with excessive zeros in two-arm trials. More specifically, we model the excessive zeros using a zero inflated binomial distribution and use the Dirichlet process of Ferguson (1974) to handle the heterogeneity among studies. Various zero inflated methods are available in the literature. Hall (2000) introduced the framework for count data with many zeros using Poisson and binomial models, and likelihood ratio test based inference for zero inflated Poisson models is discussed in Huang et al. (2014). A Bayesian inference framework for zero inflated Poisson regression models is discussed in Ghosh et al. (2006). A rich class of nonparametric Bayesian priors for study effects and a Bayesian nonparametric Polya tree mixture model are developed in Branscum and Hanson (2008) and Burr and Doss (2005).

In Section 2, we describe the Bayesian model specification used in the paper, including the likelihood function and the priors. Study effects have a Dirichlet process prior distribution to capture heterogeneity among studies. We then obtain posterior summary statistics that describe key features of the model. In particular, posterior expectations are approximated through Markov chain Monte Carlo (MCMC) methods. In Section 3, the model is applied to a large dataset available in the literature, Nissen and Wolski (2007). We perform model selection using the log pseudo marginal likelihood (LPML), comparing the Binomial and zero inflated Binomial (ZIB) models. The results suggest that when the data have a high percentage of observed zeros, the ZIB model is more appropriate. Furthermore, the Dirichlet process has an advantage over the more commonly used random effects model with normally distributed random effects, whether based on the DerSimonian-Laird approach of DerSimonian and Laird (1986) or on a Bayesian approach using normal priors: its inherent clustering property groups studies with similar effects and thus provides more robust estimates. We also test the approach using simulation studies in Section 4 and study the effect of excessive zeros in the ZIB models. We conclude with a short discussion in Section 5.

## Model development

Consider two-arm trials with binary outcomes and let $$Y_{T_{i}} \overset {\text {ind}}{\sim } Bin\left (n_{T_{i}}, P_{T_{i}}\right)$$ and $$Y_{C_{i}} \overset {\text {ind}}{\sim } Bin\left (n_{C_{i}}, P_{C_{i}}\right), i=1,\ldots,k,$$ where k is the number of studies. Then the joint likelihood $$L=L\left (y_{T_{1}},\ldots,y_{T_{k}},y_{C_{1}},\ldots,y_{C_{k}}|\mu,P_{T},P_{C}\right)$$ is

$$\begin{array}{@{}rcl@{}} L={\prod}^{k}_{i=1}\left\{{~}^{n_{T_{i}}}C_{y_{T_{i}}}P^{y_{T_{i}}}_{T_{i}}(1-P_{T_{i}})^{n_{T_{i}}-y_{T_{i}}}\right\} {\prod}^{k}_{i=1}\left\{{~}^{n_{C_{i}}}C_{y_{C_{i}}}P^{y_{C_{i}}}_{C_{i}}(1-P_{C_{i}})^{n_{C_{i}}-y_{C_{i}}}\right\}. \end{array}$$
(1)

In the random effects meta-analysis formulation, we assume that $$P_{T_{i}}$$ and $$P_{C_{i}}$$ follow logistic models, and define

$$P_{T_{i}} = \frac {exp\left \{\mu + Tr + \alpha _{i} + e_{i} \right \}}{1 + exp\left \{\mu + Tr + \alpha _{i} + e_{i} \right \}}$$ and $$P_{C_{i}} = \frac {exp\left \{\mu + e_{i} \right \}}{1 + exp\left \{\mu + e_{i} \right \}}$$.

That is, $$logit(P_{T_{i}}) = \mu + Tr + \alpha _{i} +e_{i} ; logit(P_{C_{i}}) = \mu +e_{i}, i=1,\ldots,k$$. This gives,

$$\begin{array}{@{}rcl@{}} logit(P_{T_{i}})- logit(P_{C_{i}})= Tr +\alpha_{i} \ ; i=1,\ldots,k, \end{array}$$

where $$logit(p) = log\left (\frac {p}{1-p}\right)$$ is the log-odds of p, μ is the intercept, Tr is the treatment effect, and $$\alpha_{i}$$ and $$e_{i}$$ are the study effects and error terms respectively. Following Muthukumarana and Tiwari (2016), we take a Bayesian approach and assume that $$\{\alpha_{i} ; i=1,\ldots,k\}$$ is a sample from a Dirichlet process with concentration parameter ρ and baseline distribution H. We assume that the baseline distribution H is $$N\left (0,\sigma ^{2}_{H}\right)$$. More specifically, we assume that

\begin{aligned} \alpha_{i} &\sim DP\left(\rho,H\right) \\ H &\sim N\left(0,\sigma^{2}_{H}\right) \\ e_{i} &\sim N\left(0,\sigma^{2}_{e}\right) \\ f(\mu) &\propto \text{constant} \\ Tr &\sim N\left(0,\sigma^{2}_{Tr}\right) \\ \rho &\sim U\left[0.1,1000\right] \end{aligned}
(2)

where the hyperparameters $$\sigma ^{2}_{H}, \sigma ^{2}_{Tr}$$ and $$\sigma ^{2}_{e}$$ are assumed to be known. We now obtain the posterior characterizations of the parameters, which are sampled by Gibbs sampling using the algorithm of Neal (2000), as follows.

\begin{aligned} f\left(\alpha_{c}|y_{T_{j}} : c_{j} = c\right) &\propto \prod^{k}_{j : c_{j} = c} \left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{c}+e_{j}\right\}}\right)^{n_{T_{j}}} \\ & exp\left\{\frac{-1}{2\sigma^{2}_{H}} \left(\alpha_{c} - \sigma^{2}_{H} \sum_{j : c_{j} = c} y_{T_{j}}\right)^{2}\right\} \end{aligned}
(3)
\begin{aligned} f\left(e_{i}|\underline{y}\right) &\propto \left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{i}+e_{i}\right\}}\right)^{n_{T_{i}}} \left(\frac{1}{1+exp\left\{\mu+e_{i}\right\}}\right)^{n_{C_{i}}} \\ & exp\left\{\frac{-1}{2\sigma^{2}_{e}} \left(e_{i} - \sigma^{2}_{e}\left(y_{T_{i}} + y_{C_{i}}\right)\right)^{2}\right\} \end{aligned}
(4)
\begin{aligned} f\left(\mu|\underline{y}\right) &\propto \prod^{k}_{j=1} \left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}} \left(\frac{1}{1+exp\left\{\mu+e_{j}\right\}}\right)^{n_{C_{j}}} \\ & exp\left\{\left(\sum^{k}_{j=1} y_{T_{j}} + y_{C_{j}}\right) \mu \right\} \end{aligned}
(5)
\begin{aligned} f\left(Tr|\underline{y}\right) &\propto \prod^{k}_{j=1} \left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}}\\ & exp\left\{\frac{-1}{2\sigma^{2}_{Tr}} \left(Tr - \sigma^{2}_{Tr}\sum^{k}_{j=1} y_{T_{j}}\right)^{2}\right\} \end{aligned}
(6)
$$f\left(\rho|\underline{y}\right) \propto \rho^{r-1}\left(\rho + k\right)B\left(\rho + 1,k\right) I_{\left[0.1,1000\right]}(\rho)$$
(7)
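
To make the prior in (2) concrete, the following minimal Python sketch draws study effects from a Dirichlet process by sequential (Chinese restaurant) sampling and maps them to arm-level event probabilities through the logistic model. It is an illustration only, not the authors' implementation (the paper reports using R). The function names, the fixed values of μ, Tr and ρ, and the random seed are assumptions chosen for the example.

```python
import math
import random


def inv_logit(x):
    """Numerically stable inverse of logit(p) = log(p / (1 - p))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


def draw_study_effects(k, rho, sigma2_h, rng):
    """Draw alpha_1..alpha_k from DP(rho, N(0, sigma2_h)) by sequential sampling."""
    atoms, counts, labels = [], [], []
    for i in range(k):
        # Study i opens a new cluster with probability rho / (rho + i);
        # otherwise it joins an existing cluster in proportion to its size.
        if rng.random() < rho / (rho + i):
            atoms.append(rng.gauss(0.0, math.sqrt(sigma2_h)))
            counts.append(1)
            labels.append(len(atoms) - 1)
        else:
            c = rng.choices(range(len(atoms)), weights=counts)[0]
            counts[c] += 1
            labels.append(c)
    return [atoms[c] for c in labels], labels


def arm_probabilities(mu, tr, alpha, e):
    """Event probabilities implied by the logistic models for both arms."""
    p_t = [inv_logit(mu + tr + a + ei) for a, ei in zip(alpha, e)]
    p_c = [inv_logit(mu + ei) for ei in e]
    return p_t, p_c


rng = random.Random(1)          # arbitrary seed
k = 42                          # number of studies
alpha, labels = draw_study_effects(k, rho=1.0, sigma2_h=2.0, rng=rng)
e = [rng.gauss(0.0, math.sqrt(2.0)) for _ in range(k)]
mu, tr = -4.0, 0.1              # illustrative values, not estimates
p_t, p_c = arm_probabilities(mu, tr, alpha, e)
print("number of distinct study effects:", len(set(labels)))
```

Because the Dirichlet process is discrete, several studies share exactly the same value of the study effect, which is the clustering property used later in the data analysis. Smaller values of ρ give fewer clusters.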

Note that the likelihood in (1) does not account for excessive zeros in the data. For this reason, we now consider a zero inflated binomial model for the data as follows.

\begin{aligned} Y_{T_{i}} \overset{\text{ind}}{\sim} ZIB\left(p_{0},n_{T_{i}},P_{T_{i}}\right), \ \ Y_{C_{i}} \overset{\text{ind}}{\sim} ZIB\left(q_{0},n_{C_{i}},P_{C_{i}}\right), i=1,\ldots,k. \end{aligned}

That is,

\begin{aligned} Y_{T_{i}} = \left\{\begin{array}{ll} 0 & \text{with probability } p_{0} \\ Bin\left(n_{T_{i}},P_{T_{i}}\right) & \text{with probability } 1-p_{0}. \end{array}\right. \end{aligned}

Similarly,

\begin{aligned} Y_{C_{i}} = \left\{\begin{array}{ll} 0 & \text{with probability } q_{0} \\ Bin\left(n_{C_{i}},P_{C_{i}}\right) & \text{with probability } 1-q_{0}. \end{array}\right. \end{aligned}
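
As a small illustration of the zero inflated binomial definition above, the Python sketch below evaluates its probability mass function and confirms that the probability of a zero is the inflation probability plus the binomial zero probability. The function names and the numerical values of n, p and the inflation probability are assumptions made for the example.

```python
import math


def binom_pmf(y, n, p):
    """Binomial probability P(Y = y) for Y ~ Bin(n, p)."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)


def zib_pmf(y, n, p, pi0):
    """P(Y = y) when Y = 0 with probability pi0 and Y ~ Bin(n, p) otherwise."""
    base = (1.0 - pi0) * binom_pmf(y, n, p)
    return pi0 + base if y == 0 else base


n, p, pi0 = 50, 0.02, 0.3       # illustrative values only
probs = [zib_pmf(y, n, p, pi0) for y in range(n + 1)]
print("total probability:", sum(probs))
print("P(Y = 0):", probs[0])
print("pi0 + (1 - pi0)(1 - p)^n:", pi0 + (1.0 - pi0) * (1.0 - p) ** n)
```

Setting `pi0 = 0` recovers the ordinary binomial model, so the binomial likelihood in (1) is a special case of the zero inflated one.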

This modification adds two parameters to the model, and we assume that

\begin{aligned} p_{0} &\sim Beta\left(a,b\right) \\ q_{0} &\sim Beta\left(c,d\right). \end{aligned}
(8)

where the hyperparameters a, b, c and d are assumed to be known. We obtain the posterior characterizations of the parameters under the zero inflated binomial likelihood as follows.

\begin{aligned} f\left(\alpha_{c}|y_{T_{j}} : c_{j} = c\right) &\propto \prod^{k}_{j : c_{j} = c} \left[p_{0} + \left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{c}+e_{j}\right\}}\right)^{n_{T_{j}}}\right]^{u_{j}} \\ & \left[\left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{c}+e_{j}\right\}}\right)^{n_{T_{j}}}exp\left\{\alpha_{c} y_{T_{j}}\right\}\right]^{1-u_{j}} \\ & exp\left\{\frac{-1}{2\sigma^{2}_{H}} \alpha^{2}_{c}\right\} \end{aligned}
(9)
\begin{aligned} f\left(e_{i}|\underline{y}\right) &\propto \left[p_{0} + \left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{i}+e_{i}\right\}}\right)^{n_{T_{i}}}\right]^{u_{i}} \\ & \left[\left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{i}+e_{i}\right\}}\right)^{n_{T_{i}}}exp\left\{y_{T_{i}}e_{i}\right\}\right]^{1-u_{i}} \\ & \left[q_{0} + \left(1-q_{0}\right)\left(\frac{1}{1+exp\left\{\mu+e_{i}\right\}}\right)^{n_{C_{i}}}\right]^{w_{i}} \\ & \left[\left(1-q_{0}\right)\left(\frac{1}{1+exp\left\{\mu+e_{i}\right\}}\right)^{n_{C_{i}}}exp\left\{y_{C_{i}}e_{i}\right\}\right]^{1-w_{i}} \\ & exp\left\{\frac{-1}{2\sigma^{2}_{e}} e^{2}_{i}\right\} \end{aligned}
(10)
\begin{aligned} f\left(\mu|\underline{y}\right) &\propto \prod^{k}_{j=1} \left[p_{0} + \left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}}\right]^{u_{j}} \\ & \left[\left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}} exp\left\{y_{T_{j}} \mu \right\} \right]^{1-u_{j}} \\ & \left[q_{0} + \left(1-q_{0}\right)\left(\frac{1}{1+exp\left\{\mu+e_{j}\right\}}\right)^{n_{C_{j}}}\right]^{w_{j}} \\ & \left[\left(1-q_{0}\right)\left(\frac{1}{1+exp\left\{\mu+e_{j}\right\}}\right)^{n_{C_{j}}} exp\left\{y_{C_{j}} \mu \right\} \right]^{1-w_{j}} \end{aligned}
(11)
\begin{aligned} f\left(Tr|\underline{y}\right) &\propto \prod^{k}_{j=1} \left[p_{0} + \left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}}\right]^{u_{j}} \\ & \left[\left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}} exp\left\{y_{T_{j}}Tr\right\}\right]^{1-u_{j}} \\ & exp\left\{\frac{-1}{2\sigma^{2}_{Tr}} Tr^{2}\right\} \end{aligned}
(12)
$$f\left(\rho|\underline{y}\right) \propto \rho^{r-1}\left(\rho + k\right)B\left(\rho + 1,k\right) I_{\left[0.1,1000\right]}(\rho)$$
(13)
\begin{aligned} f\left(p_{0}|\underline{y}\right) &\propto \prod^{k}_{j=1} \left[p_{0} + \left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}}\right]^{u_{j}} \\ & \left[\left(1-p_{0}\right)\left(\frac{1}{1+exp\left\{\mu+Tr+\alpha_{j}+e_{j}\right\}}\right)^{n_{T_{j}}} exp\left\{y_{T_{j}}\left(\mu+Tr+\alpha_{j}+e_{j}\right)\right\}\right]^{1-u_{j}} \\ & p_{0}^{a-1}\left(1-p_{0}\right)^{b-1} \end{aligned}
(14)
\begin{aligned} f\left(q_{0}|\underline{y}\right) &\propto \prod^{k}_{j=1} \left[q_{0} + \left(1-q_{0}\right)\left(\frac{1}{1+exp\left\{\mu+e_{j}\right\}}\right)^{n_{C_{j}}}\right]^{w_{j}} \\ & \left[\left(1-q_{0}\right)\left(\frac{1}{1+exp\left\{\mu+e_{j}\right\}}\right)^{n_{C_{j}}} exp\left\{y_{C_{j}}\left(\mu+e_{j}\right)\right\}\right]^{1-w_{j}} \\ & q_{0}^{c-1}\left(1-q_{0}\right)^{d-1} \end{aligned}
(15)

where $$u_{j} = \left \{\begin {array}{ll} 1, & y_{T_{j}} = 0 \\ 0, & y_{T_{j}} > 0 \end {array}\right.$$ and $$w_{j} = \left \{\begin {array}{ll} 1, & y_{C_{j}} = 0 \\ 0, & y_{C_{j}} > 0. \end {array}\right.$$
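
The conditional in (14) has no standard form, so it has to be sampled numerically. The Python sketch below evaluates its logarithm and updates the treatment-arm inflation probability with a random-walk Metropolis step. The Metropolis step is an assumption made for illustration, since the text does not state how this conditional is sampled. The toy data, the fixed event probabilities `p_t` and the step size are also hypothetical.

```python
import math
import random


def log_post_p0(p0, y_t, n_t, p_t, a=1.0, b=1.0):
    """Log of the conditional in (14), up to a constant that does not involve p0."""
    if not 0.0 < p0 < 1.0:
        return -math.inf
    total = (a - 1.0) * math.log(p0) + (b - 1.0) * math.log(1.0 - p0)
    for y, n, p in zip(y_t, n_t, p_t):
        if y == 0:
            # u_j = 1: the zero is either a structural zero or a binomial zero.
            total += math.log(p0 + (1.0 - p0) * (1.0 - p) ** n)
        else:
            # u_j = 0: only the factor (1 - p0) depends on p0.
            total += math.log(1.0 - p0)
    return total


def metropolis_p0(p0, y_t, n_t, p_t, step, rng):
    """One random-walk Metropolis update of p0 (a swapped-in sampler, see lead-in)."""
    proposal = p0 + rng.uniform(-step, step)
    log_ratio = log_post_p0(proposal, y_t, n_t, p_t) - log_post_p0(p0, y_t, n_t, p_t)
    accept = rng.random() < math.exp(min(0.0, log_ratio))
    return proposal if accept else p0


# Hypothetical toy data: eight studies, other parameters held fixed.
y_t = [0, 0, 0, 2, 1, 0, 3, 0]
n_t = [100] * 8
p_t = [0.01] * 8

rng = random.Random(7)
p0, draws = 0.5, []
for _ in range(2000):
    p0 = metropolis_p0(p0, y_t, n_t, p_t, step=0.2, rng=rng)
    draws.append(p0)
print("posterior mean of p0 (toy data):", sum(draws[500:]) / len(draws[500:]))
```

The same function applies to the control arm after replacing the treatment-arm counts and probabilities with their control-arm counterparts, which gives the conditional in (15).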

We investigate the suitability of the zero inflated binomial distribution using the log pseudo marginal likelihood (LPML) of Gelfand et al. (1992) in Section 3.

## Data analysis

We illustrate the approach discussed in Section 2 using a published data set on the number of people experiencing myocardial infarction from the use of drugs with the active ingredient “rosiglitazone”, Nissen and Wolski (2007). Rosiglitazone is widely used in treating patients with type 2 diabetes. The data provide information on diabetes patients in 42 diabetes trials, including trials with zero events in both arms, and on possible heart conditions or deaths resulting from the use of rosiglitazone. We apply the model separately to myocardial infarction and to death from cardiovascular causes based on these 42 studies. There were 86 myocardial infarctions in the rosiglitazone group and 72 in the control group. There were 39 deaths from cardiovascular causes in the rosiglitazone group and 22 in the control group. Note that the percentages of observed zeros from the 42 studies in the treatment and control arms for myocardial infarction are 23% and 57% respectively. The corresponding percentages for cardiovascular causes are 50% and 80% respectively. We set the hyperparameters as $$a = b = c = d = 1, \sigma ^{2}_{H} = 2, \sigma ^{2}_{Tr} = 2$$ and $$\sigma ^{2}_{e} = 2$$. These values give sufficiently diffuse priors on the logit scale of the primary parameters. We implement the models developed in Section 2 using R. The results are based on an MCMC simulation with a burn-in period of 1000 iterations followed by 30,000 iterations with a thinning of 5.

We use the data from the 42 studies without stratifying them into small and large studies, as the purpose of this work is to illustrate the method and not to provide an in-depth analysis of the data using different methods or subsets of the data. The posterior box plots of study effects under the two models for myocardial infarction are given in Figs. 1 and 2. The advantages of the DP prior are its flexibility and its ability to cluster studies appropriately. The clustering is based on the values assigned to each study effect under its posterior distribution, which is approximated using MCMC. Studies that share the same study effect are considered to belong to the same group. Note that there were 5 clusters for myocardial infarction and 4 clusters for cardiovascular causes based on study effects. To compare the performance of the Binomial and ZIB models, we use the LPML, which is based on Conditional Predictive Ordinates (CPO). A detailed discussion of the CPO statistic and its applications to model selection can be found in Geisser (1993) and Gelfand and Dey (1994). The LPML is computed as $$\sum \limits _{i=1}^{k} log p(y_{i}|y_{-i})$$ where $$y_{-i}$$ denotes the observation vector y with the ith observation deleted. The model with the larger value of LPML is preferred. The estimates of μ, Tr and the LPML values are given in Table 1. For myocardial infarction, the LPML prefers the binomial model over the ZIB model, and the two models estimate the parameter Tr differently.
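
The LPML can be approximated from MCMC output without refitting the model k times. The Python sketch below uses the standard harmonic-mean estimator of each CPO, in which the inverse of the CPO for observation i is the posterior average of the inverse likelihood of that observation. It assumes that the pointwise log-likelihoods have already been evaluated at each retained draw. The function names and the toy binomial example are hypothetical and do not reproduce any value in Table 1.

```python
import math


def log_mean_exp(values):
    """Stable evaluation of log(mean(exp(v))) for a list of numbers."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def lpml(loglik):
    """LPML from loglik[m][i] = log p(y_i | theta^(m)) for draw m, observation i."""
    n_obs = len(loglik[0])
    total = 0.0
    for i in range(n_obs):
        neg = [-draw[i] for draw in loglik]
        # log CPO_i = -log( mean over draws of 1 / p(y_i | theta^(m)) )
        total += -log_mean_exp(neg)
    return total


def binom_logpmf(y, n, p):
    """Log of the binomial probability mass function."""
    return math.log(math.comb(n, y)) + y * math.log(p) + (n - y) * math.log(1.0 - p)


# Toy example: three posterior draws of a common event probability, two studies.
draws_p = [0.10, 0.20, 0.30]
y, n = [1, 0], [10, 10]
loglik = [[binom_logpmf(yi, ni, p) for yi, ni in zip(y, n)] for p in draws_p]
print("LPML (toy example):", lpml(loglik))
```

In practice the rows of `loglik` would be filled with the binomial or zero inflated binomial log-likelihood of each study at each retained draw, and the two resulting LPML values would be compared.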

We now investigate the study effects for death from cardiovascular causes. The posterior box plots of study effects under the two models are given in Figs. 3 and 4. The plots indicate that the ZIB model is capable of capturing the heterogeneity of study effects. The estimates of μ, Tr and the LPML values are given in Table 1. In this case, the LPML strongly prefers the ZIB model over the binomial model. This agrees with the fact that there is a larger proportion of excessive zeros for death from cardiovascular causes than for myocardial infarction.

A summary of the estimated odds ratios under the Binomial, zero inflated Binomial and DerSimonian-Laird random effects models is given in Figs. 5 and 6. For myocardial infarction, the DerSimonian-Laird random effects model gives an overall odds ratio of 1.29 with a 95% confidence interval of (0.9, 1.85). On the other hand, the Binomial and zero inflated Binomial models give overall odds ratios of 1.04 (0.98, 1.1) and 1.07 (0.97, 1.17) respectively, with 95% credible intervals in parentheses. The corresponding estimates and 95% intervals for cardiovascular causes are 1.2 (0.64, 2.24), 1.03 (0.97, 1.09) and 1.13 (0.93, 1.33) respectively. Our approach therefore provides overall odds ratio estimates that are slightly lower than the DerSimonian-Laird overall estimate. Note that the DerSimonian-Laird approach is based on the non-zero studies. According to Figs. 5 and 6, the zero inflated Binomial model identifies more heterogeneous study effects than the Binomial model, which in turn identifies more heterogeneous effects than the DerSimonian-Laird approach. The DerSimonian-Laird estimates of the random effects variance are zero for both outcomes. This suggests that our approach is superior to the DerSimonian-Laird random effects model when there is heterogeneity among studies, and the LPML model selection criterion can then be used to choose the best model in terms of predictive ability.
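
For readers who want to reproduce the frequentist comparison, the Python sketch below implements the usual DerSimonian-Laird moment estimator on study-level log odds ratios. It is a generic illustration, not the exact computation behind Figs. 5 and 6. The counts are hypothetical, every cell is non-zero, and the treatment of zero-event studies (exclusion or continuity correction) is deliberately left out.

```python
import math


def log_odds_ratio(y_t, n_t, y_c, n_c):
    """Log odds ratio and its large-sample variance; needs all four cells > 0."""
    a, b, c, d = y_t, n_t - y_t, y_c, n_c - y_c
    return math.log((a * d) / (b * c)), 1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d


def dersimonian_laird(effects, variances):
    """Pooled effect, its standard error, and the between-study variance tau^2."""
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are required")
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)          # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Hypothetical counts: (events, sample size) in treatment and control arms.
studies = [(5, 200, 3, 200), (3, 150, 4, 150), (8, 300, 5, 300), (2, 120, 1, 120)]
pairs = [log_odds_ratio(*s) for s in studies]
pooled, se, tau2 = dersimonian_laird([p[0] for p in pairs], [p[1] for p in pairs])
lower, upper = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print("pooled OR:", math.exp(pooled), "95% CI:", (lower, upper), "tau^2:", tau2)
```

When the heterogeneity statistic Q falls below its degrees of freedom, the estimate of the between-study variance is truncated at zero, which is the situation reported above for both outcomes.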

We now examine the effects of the zero inflation parameters $$p_{0}$$ and $$q_{0}$$ on the analysis. The graphical posterior summaries of $$p_{0}$$ and $$q_{0}$$ for myocardial infarction and cardiovascular causes are given in Figs. 7, 8, 9 and 10. In addition, the numerical posterior summaries of $$p_{0}$$ and $$q_{0}$$ are given in Table 2. The posterior distributions of $$p_{0}$$ and $$q_{0}$$ and their numerical summaries for myocardial infarction and cardiovascular causes are consistent with the percentages of zeros in the data. We also consider a Beta(0.5,0.5) prior on $$p_{0}$$ and $$q_{0}$$ in order to investigate prior sensitivity. The numerical posterior summaries of $$p_{0}$$ and $$q_{0}$$ under the Beta(0.5,0.5) prior are given in Table 3. In this case the estimates of $$p_{0}$$ and $$q_{0}$$ change in magnitude, but the estimates of the primary parameters μ and Tr remain very close, indicating that the odds ratios are not sensitive to the choice of prior settings. Inference on $$p_{0}$$ and $$q_{0}$$, however, is sensitive to the choice of priors, so these priors should be selected carefully based on application-specific a priori knowledge of the zero inflation parameters.

It is important to examine convergence assessment plots for the MCMC simulation, as this is a high dimensional problem. The trace plot, histogram and autocorrelation plot of μ under the binomial model for myocardial infarction are given in Fig. 11. The trace plot appears to stabilize immediately and hence provides no indication of lack of convergence in the Markov chain. The autocorrelation also appears to dampen quickly. Trace plots of study effects for myocardial infarction are given in Fig. 12. The trace plots of study effects for death from cardiovascular causes indicate similar behavior. Similar plots were obtained for all of the parameters under each model and provide evidence of convergence of the Markov chains.
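
The autocorrelation diagnostic described above is straightforward to compute directly. The Python sketch below evaluates the sample autocorrelation of a chain at a given lag and shows the effect of thinning by 5. The chain is a simulated first-order autoregressive series used as a stand-in for real MCMC output, so the numbers it prints say nothing about the chains in Figs. 11 and 12.

```python
import random


def autocorr(chain, lag):
    """Sample autocorrelation of a chain at the given non-negative lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain)
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Stand-in chain: AR(1) with coefficient 0.5 (not real MCMC output).
rng = random.Random(3)
chain, x = [], 0.0
for _ in range(20000):
    x = 0.5 * x + rng.gauss(0.0, 1.0)
    chain.append(x)

thinned = chain[::5]            # keep every fifth draw
print("lag-1 autocorrelation, full chain:", autocorr(chain, 1))
print("lag-1 autocorrelation, thinned chain:", autocorr(thinned, 1))
```

A lag-1 autocorrelation that drops towards zero after thinning is what the statement that the autocorrelation dampens quickly refers to.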

Note that one can also assign a simpler parametric normal prior to the study effects $$\alpha_{i}$$ in place of the DP prior. We now re-analyze the data assuming that the study effects arise from a $$N\left (0,\sigma ^{2}_{H}\right)$$ prior distribution. We remark that this is the baseline distribution of the DP prior in (2). In this case, forest plots of odds ratios for each model are given in Fig. 13. The estimates of the primary parameters of interest and the LPML values are given in Table 4. The LPML model selection criterion clearly indicates that the DP prior in (2) is superior to the conventional parametric prior.

Note that the overall decision on safety should be based on $$p_{0}$$, $$q_{0}$$ and the overall odds ratio (OR). For example, the treatment can be declared safer than the control if OR ≤ 1 and $$p_{0} > q_{0}$$. Also notice that the estimates of $$(p_{0},q_{0})$$ are independent of the odds ratio because a count cannot be in both a “true” zero arm and a “Binomial” arm. We combine the two metrics, the conditional OR and $$(p_{0},q_{0})$$, into an overall unconditional odds ratio, defined as modified odds ratio $$= OR \times (1-p_{0})/(1-q_{0})$$. Note that when $$p_{0}=q_{0}$$, the modified odds ratio is the same as the OR. If $$p_{0}>q_{0}$$, the OR is multiplied by a factor less than 1, and if $$p_{0}<q_{0}$$, it is multiplied by a factor greater than 1. This factor, $$h(p_{0},q_{0})=(1-p_{0})/(1-q_{0})$$, is the ratio of the probabilities of observing Bernoulli counts in the two arms, and can be considered as the odds for observing Bernoulli counts in the two arms. In a frequentist setup, $$h(\hat {p}_{0},\hat {q}_{0})$$ is independent of $$\hat {\mu }$$, and hence independent of the conditional odds ratio. In fact, $$\hat {p}_{0}$$ and $$\hat {q}_{0}$$ converge to $$p_{0}$$ and $$q_{0}$$ with probability 1, and hence $$h(\hat {p}_{0},\hat {q}_{0})$$ also converges to $$h(p_{0},q_{0})$$ with probability 1, as h is a continuous function (by the continuous mapping theorem). So, the estimated modified odds ratio is a consistent estimator of the unconditional odds ratio defined as $$OR \times (1-p_{0})/(1-q_{0})$$. We provide the estimates of the modified odds ratio for various models in Table 5. As the estimate of $$p_{0}$$ is less than that of $$q_{0}$$ for both examples (myocardial infarction and cardiovascular causes), the modified OR values are higher than the corresponding OR values.
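
The adjustment is a one-line computation, sketched below in Python. The function name and the input values are hypothetical and are not the estimates reported in Table 5. Applying the function to each retained MCMC draw and then summarizing is one possible way to obtain an interval, which is an assumption rather than a procedure stated in the text.

```python
def modified_odds_ratio(odds_ratio, p0, q0):
    """Unconditional odds ratio OR * (1 - p0) / (1 - q0)."""
    if not (0.0 <= p0 < 1.0 and 0.0 <= q0 < 1.0):
        raise ValueError("p0 and q0 must lie in [0, 1)")
    return odds_ratio * (1.0 - p0) / (1.0 - q0)


# Hypothetical inputs chosen only to show the direction of the adjustment.
print(modified_odds_ratio(1.10, p0=0.30, q0=0.30))   # equal inflation
print(modified_odds_ratio(1.10, p0=0.40, q0=0.20))   # more structural zeros in treatment
print(modified_odds_ratio(1.10, p0=0.20, q0=0.40))   # more structural zeros in control
```

The three calls correspond to the three cases discussed above: no change, a downward adjustment, and an upward adjustment of the conditional odds ratio.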

## Results from simulation studies

To understand the role of $$p_{0}$$ and $$q_{0}$$ in the model, different simulation studies were carried out. For this purpose, we generate random ZIB values with empirical binomial parameters. We first generate 42 pairs of independent binary (0 and 1) variables from Bernoulli($$p_{0}$$) and Bernoulli($$q_{0}$$), where $$(p_{0},q_{0})$$ is taken from the set of values {(0.1,0.1),...,(0.9,0.9)}. We then assign true zeros at the places with 1s, and at the remaining places generate binomial outcomes from $$B(\bar {n}_{T}, \hat {P}_{T_{i}})$$ and from $$B(\bar {n}_{C}, \hat {P}_{C_{i}})$$, where $$\bar {n}_{T}, \bar {n}_{C}, \hat {P}_{T_{i}}$$ and $$\hat {P}_{C_{i}}$$ are empirical estimates. The MCMC sampling scheme described in Section 2 was then carried out using R to obtain the posterior estimates of $$p_{0}$$ and $$q_{0}$$. This was done 1000 times for each pair to obtain the mean and standard error of each estimate. The results for various scenarios of excessive zeros are given in Table 6. They indicate that when the true value of $$p_{0}$$ is small and the observed proportion of zeros in the simulated data in the treatment arm (control arm) is small (large), the estimated value of $$p_{0}$$ ($$q_{0}$$) is also small (large). When the true values of $$p_{0}$$ and $$q_{0}$$ are large, the simulated data have a large proportion of zeros in both arms, which results in large estimated values of $$p_{0}$$ and $$q_{0}$$. In both situations, the estimated values of $$p_{0}$$ and $$q_{0}$$ conform with the observed percentages of zeros in the simulated data. The estimates of $$p_{0}$$ and $$q_{0}$$ remain high irrespective of the true parameter values chosen. Note that our primary interest is in the study effects $$\alpha_{i}$$ and in the treatment arm, not in the control arm, so $$q_{0}$$ need not be investigated in detail, as it can be treated as a nuisance parameter. In practice, one should have very good a priori knowledge of $$q_{0}$$, which allows an informative prior to be assigned, as it reflects the zeros in the control arm. This indicates that the use of ZIB is more appropriate when there are excessive zeros in the data.
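
The data-generating step of the simulation can be sketched in Python as follows. This covers only the generation of zero inflated binomial counts for one arm, not the subsequent MCMC fit. The common sample size and the event probabilities are hypothetical placeholders for the empirical estimates used in the paper.

```python
import random


def simulate_zib_arm(n, probs, pi0, rng):
    """One count per study: a true zero with probability pi0, else Bin(n, p_i)."""
    counts = []
    for p in probs:
        if rng.random() < pi0:
            counts.append(0)                      # structural (true) zero
        else:
            counts.append(sum(rng.random() < p for _ in range(n)))
    return counts


rng = random.Random(11)
k, n_bar = 42, 200                                # n_bar is a placeholder value
probs_t = [0.01] * k                              # placeholder event probabilities
for pi0 in (0.1, 0.5, 0.9):
    y_t = simulate_zib_arm(n_bar, probs_t, pi0, rng)
    zero_share = sum(y == 0 for y in y_t) / k
    print("p0 =", pi0, "observed share of zeros =", zero_share)
```

The observed share of zeros exceeds the inflation probability on average because the binomial component also produces zeros, which is why the two sources of zeros cannot be separated by inspection alone.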

## Discussion

Binary data arise naturally in clinical trials in the health sciences. In some cases, they arise with excessive zeros. In this paper, we have provided a random effects meta analysis approach for binary data with excessive zeros in two-arm trials. The suitability of the binomial and zero inflated binomial models was assessed with a Dirichlet process as the prior for the study effects. The approach can be used as a template for meta analysis of binary data, and a user may choose the appropriate model using the log pseudo marginal likelihood. We have shown that our approach is superior to the DerSimonian-Laird random effects model when there is heterogeneity among studies, and that the LPML model selection criterion can be used to select the best model among the Bayesian models (not including the DerSimonian-Laird model) for a given data set.

The Bayesian approaches discussed in this paper allow the zero-event studies to be incorporated in the likelihood, and we found that the point estimates of the overall odds ratio from these methods were lower than the estimates reported in the literature, Nissen and Wolski (2007). The ZIB model was used to separate the excessive zeros, that is, the studies where the events could not occur, from the (Binomially) modeled zeros where zero events occurred. Note that under ZIB, some zeros are observed with probability $$p_{0}$$ and some arise from the Binomial model, making the probability of a zero event $$p_{0}+(1-p_{0})(1-P_{T})^{n_{T}}$$ in the treatment arm. With the ZIB model, the Bayes estimates of the odds ratio were slightly higher than with the Binomial model, but they were still lower than the results from the DerSimonian-Laird random effects model and the estimates in Nissen and Wolski (2007). Note also that the DP model, being discrete with probability 1, has a clustering property whereby study effects that are alike fall in the same cluster. We also investigated the suitability of the DP prior relative to the conventional parametric normal prior on study effects. The LPML model selection indicated that the DP prior is superior to the conventional parametric normal prior. Finally, as the results from the ZIB model on the parameters $$p_{0}$$, $$q_{0}$$ and OR need to be interpreted together, a modified OR was introduced.

As a future direction of research, we would like to extend the approach discussed in this article to ordinal categorical data. For example, in some applications, the clinical trial end point could be a response variable on an ordinal scale with multiple categories such as Good/Moderate/Critical. This type of ordinal response data can be viewed as multivariate responses arising from continuous latent variables with cut-points. We assume that there is a continuous latent outcome behind these ordinal outcomes such that $$X_{i}=(X_{i1},\ldots,X_{im}) \sim \text{Normal}(\mu,\Sigma)$$, where the X's are the latent outcomes and m is the number of ordinal categories. The latent variables $$X_{ij}$$ can then be converted to the observed $$Y_{ij}$$ using a cut-point vector λ. However, the cut-points and their priors need to be chosen carefully, as there are two arms and the counts in some categories could be sparse. In this case, one can consider an objective Bayes approach following the development in Bayarri et al. (2008). Yet another extension of the proposed model is to multinomial data in which some particular cell(s) are observed excessively. This kind of data may arise from trials with patient reported outcomes.

## Availability of data and materials

Data and code can be requested by contacting the authors.

## References

1. Albert, J.: Teaching Inference about Proportions Using Bayes and Discrete Models. J. Stat. Educ. 3 (1995). https://doi.org/10.1080/10691898.1995.11910494.

2. Bayarri, M. J., Berger, J. O., Datta, G. S.: Objective Bayes testing of Poisson versus inflated Poisson models. Inst. Math. Stat. 3, 105–121 (2008).

3. Branscum, A. J., Hanson, T. E.: Bayesian nonparametric meta-analysis using Polya tree mixture models. Biometrics. 64, 825–833 (2008).

4. Burr, D., Doss, H.: A Bayesian semiparametric model for random-effects meta-analysis. J. Am. Stat. Assoc. 100, 242–251 (2005).

5. Carlin, J. B.: Meta-analysis for 2 × 2 tables: A Bayesian approach. Stat. Med. 11, 141–158 (1992).

6. Chang, B. H., Waternaux, C., Lipsitz, S.: Meta-analysis of binary data: which within study variance estimate to use? Stat. Med. 20, 1947–1956 (2001).

7. DerSimonian, R., Laird, N.: Meta-analysis in clinical trials. Control. Clin. Trials. 7, 177–188 (1986).

8. Ferguson, T. S.: Prior distributions on spaces of probability measures. Ann. Stat. 2, 615–629 (1974).

9. Gamalo, M., Wu, R., Tiwari, R.: Bayesian approach to noninferiority trials for proportions. J. Biopharm. Stat. 21, 902–919 (2011).

10. Geisser, S.: Predictive Inference: An Introduction. Chapman and Hall, London (1993).

11. Gelfand, A. E., Dey, D. K.: Bayesian Model Choice: Asymptotics and Exact Calculations. J. R. Stat. Soc. Ser. B. 56, 501–514 (1994).

12. Gelfand, A. E., Dey, D. K., Chang, H.: Model determination using predictive distributions with implementation via sampling-based methods (with discussion). In: Bernardo, J. M., Berger, J. O., Dawid, A. P., Smith, A. F. M. (eds.) Bayesian Statistics 4. Oxford University Press (1992).

13. Ghosh, S. K., Mukhopadhyay, P., Lu, J. C.: Bayesian analysis of zero-inflated regression models. J. Stat. Plan. Infer. 136(4), 1360–1375 (2006).

14. Hall, D. B.: Zero-Inflated Poisson and Binomial Regression with Random Effects: A Case Study. Biometrics. 56, 1030–1039 (2000).

15. Huang, L., Zheng, D., Zalkikar, J., Tiwari, R.: Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection. Stat. Methods Med. Res. (2014). https://doi.org/10.1177/0962280214549590.

16. Muthukumarana, S., Tiwari, R.: Meta-analysis using Dirichlet process. Stat. Methods Med. Res. 25(1), 352–365 (2016).

17. Neal, R. M.: Markov Chain Sampling Methods for Dirichlet Process Mixture Models. J. Comput. Graph. Stat. 9(2), 249–265 (2000).

18. Nissen, S. E., Wolski, K.: Effect of rosiglitazone on the risk of myocardial infarction and death from cardiovascular causes. New Eng. J. Med. 356, 2457–2471 (2007).

19. Smith, T. C., Spiegelhalter, D. J., Thomas, A.: Bayesian approaches to random-effects meta-analysis: a comparative study. Stat. Med. 14, 2685–2699 (1995).

## Acknowledgments

The authors thank the Editor-in-Chief and three anonymous reviewers whose comments helped to improve the manuscript. This article reflects the views of the authors and should not be attributed to FDA’s views or policies.

## Funding

Muthukumarana’s research has been partially supported by a Discovery grant from the Natural Sciences and Engineering Research Council of Canada. Martell’s research internship was funded by Mitacs Globalink program.

## Author information

### Contributions

All authors have contributed equally to the work and approved the final version of the paper.

### Corresponding author

Correspondence to Saman Muthukumarana.

## Ethics declarations

### Competing interests

The authors declare that they have no competing interests. 