Open Access

Multivariate zero-truncated/adjusted Charlier series distributions with applications

Journal of Statistical Distributions and Applications 2015, 2:5

https://doi.org/10.1186/s40488-015-0029-5

Received: 15 May 2015

Accepted: 20 July 2015

Published: 4 August 2015

Abstract

Although the univariate Charlier series distribution (Biom. J. 30(8):1003–1009, 1988) and bivariate Charlier series distribution (Biom. J. 37(1):105–117, 1995; J. Appl. Stat. 30(1):63–77, 2003) can be easily generalized to the multivariate version via the method of stochastic representation (SR), the multivariate zero-truncated Charlier series (ZTCS) distribution is not available to date. The first aim of this paper is to propose the multivariate ZTCS distribution by developing its important distributional properties, and providing efficient likelihood-based inference methods via a novel data augmentation in the framework of the expectation–maximization (EM) algorithm. Since the joint marginal distribution of any r-dimensional sub-vector of the multivariate ZTCS random vector of dimension m is an r-dimensional zero-deflated Charlier series (ZDCS) distribution (1≤r<m), it is the second objective of the paper to introduce a new family of multivariate zero-adjusted Charlier series (ZACS) distributions (including the multivariate ZDCS distribution as a special member) with a more flexible correlation structure by accounting for both inflation and deflation at zero. The corresponding distributional properties are explored and the associated maximum likelihood estimation method via EM algorithm is provided for analyzing correlated count data. Some simulation studies are performed and two real data sets are used to illustrate the proposed methods.

Mathematics Subject Classification: Primary 62E15; Secondary 62F10

Keywords

EM algorithm; Multivariate zero-adjusted Charlier series distributions; Multivariate zero-truncated Charlier series distribution; Univariate Charlier series distribution

Introduction

The univariate Charlier series (CS) distribution was first introduced by Ong (1988) in the consideration of the conditional distribution of a bivariate Poisson distribution. The CS distribution is a convolution of a binomial variate and a Poisson variate. Let \(X_{0} \sim \text{Binomial}(K,\pi)\), \(X_{1} \sim \text{Poisson}(\lambda)\), and let \(X_{0}\) and \(X_{1}\) be mutually independent (denoted by \(X_{0} {\bot \!\!\!\!\bot } X_{1}\)). Then a discrete non-negative random variable \(X\) is said to follow the CS distribution with parameters \(K \in {\mathbb {N}} \;\hat {=}\; \{1, 2, \ldots, \infty \}\), \(\pi \in [0,1)\) and \(\lambda \in {\mathbb {R}}_{+}\), denoted by \(X \sim \text{CS}(K,\pi;\lambda)\), if it can be stochastically represented by \(X = X_{0} + X_{1}\). Its probability mass function (pmf) is given by
$$ \Pr(X = x) = \sum_{k=0}^{\min(K, x)} {K \choose k} \pi^{k} (1-\pi)^{K-k} \cdot \frac{\lambda^{x-k} \mathrm{e}^{-\lambda}} {(x-k)!}, \qquad x=0,1, \ldots, \infty. $$
(1.1)
The mean and variance of X are given by
$$ E(X) = K\pi + \lambda \quad \text{and} \quad \text{Var}(X) = K\pi(1-\pi) + \lambda. $$
(1.2)
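As a numerical sanity check, the pmf (1.1) can be evaluated directly and its first two moments compared with (1.2). The following is a minimal sketch; the function name `cs_pmf`, the parameter values and the truncation point of the support are illustrative assumptions, not part of the original paper.

```python
from math import comb, exp, factorial

def cs_pmf(x, K, pi, lam):
    """Pr(X = x) for X ~ CS(K, pi; lam), summing the convolution in (1.1)."""
    return sum(
        comb(K, k) * pi**k * (1 - pi)**(K - k)
        * lam**(x - k) * exp(-lam) / factorial(x - k)
        for k in range(min(K, x) + 1)
    )

# Illustrative parameter values (assumed for this example only).
K, pi, lam = 3, 0.4, 1.5

# The support is infinite; 60 terms leave a negligible tail for these values.
support = range(60)
probs = [cs_pmf(x, K, pi, lam) for x in support]

total = sum(probs)
mean = sum(x * p for x, p in zip(support, probs))
var = sum((x - mean) ** 2 * p for x, p in zip(support, probs))

print(total)                          # should be very close to 1
print(mean, K * pi + lam)             # compare with E(X) in (1.2)
print(var, K * pi * (1 - pi) + lam)   # compare with Var(X) in (1.2)
```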
Let \(X_{1},\ldots,X_{n} \stackrel{\text{iid}}{\sim} \text{CS}(K,\pi;\lambda)\) and the observed data be \(Y_{\text{obs}}=\{x_{1},\ldots,x_{n}\}\), where \(x_{1},\ldots,x_{n}\) are the realizations of \(X_{1},\ldots,X_{n}\). Let \(\bar {x}\) and \(s^{2}\) be the sample mean and variance, respectively. Assuming K is known, Ong (1988) derived the moment estimates of the parameters in the univariate CS distribution as follows:
$$ \hat{\pi} = \left(\frac{\bar{x} - s^{2}}{K} \right)^{1/2} \quad \text{and} \quad \hat{\lambda} = \bar{x} - K \hat{\pi}. $$
(1.3)
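The estimates in (1.3) follow from (1.2), since \(E(X) - \text{Var}(X) = K\pi^{2}\). A minimal sketch of the computation is given below. The function name `cs_moment_estimates`, the toy sample, the use of the \(n-1\) denominator for \(s^{2}\), and the decision to raise an error when \(\bar{x} < s^{2}\) are all assumptions made for illustration; the source does not specify them.

```python
from math import sqrt
from statistics import mean, variance

def cs_moment_estimates(sample, K):
    """Moment estimates (pi_hat, lambda_hat) from (1.3) for known K."""
    xbar = mean(sample)
    s2 = variance(sample)  # n-1 denominator; an assumption of this sketch
    if xbar < s2:
        # (1.3) has no real solution when the sample is over-dispersed.
        raise ValueError("sample variance exceeds sample mean")
    pi_hat = sqrt((xbar - s2) / K)
    lam_hat = xbar - K * pi_hat
    return pi_hat, lam_hat

# Toy counts, chosen only so that the estimates are admissible.
sample = [1, 1, 2, 2, 3, 3, 4, 4]
pi_hat, lam_hat = cs_moment_estimates(sample, K=3)
print(pi_hat, lam_hat)
```

Nothing in (1.3) forces \(\hat{\pi} < 1\) or \(\hat{\lambda} > 0\), so in practice the returned values should be checked against the parameter space.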
Next, Papageorgiou and Loukas (1995) proposed a bivariate CS distribution which arises as the conditional distribution from a trivariate Poisson distribution studied by Loukas and Papageorgiou (1991) and Loukas (1993). Let \(X_{0} \sim \text{Binomial}(K,\pi)\) and \(X_{i0} \stackrel{\text{ind}}{\sim} \text{Poisson}(\lambda_{i})\), i=1,2, and define \(X_{i} = X_{0} + X_{i0}\), i=1,2. Then a discrete non-negative random vector \(\mathbf{x}=(X_{1},X_{2})^{\!\top\!}\) is said to follow a bivariate CS distribution with parameters \(K \in {\mathbb {N}}\), \(\pi \in [0,1)\) and \(\lambda _{i} \in {\mathbb {R}}_{+}\), i=1,2. We denote it by \(\mathbf{x} \sim \text{CS}(K,\pi;\lambda_{1},\lambda_{2})\). Its probability generating function, marginal means and covariance are given by
$$\begin{array}{@{}rcl@{}} G_{\textbf{x}}\left(\boldsymbol{z}\right) &=& E\left(z_{1}^{X_{1}}z_{2}^{X_{2}}\right) = \exp\{\lambda_{1}(z_{1} -1) + \lambda_{2} (z_{2}-1)\}\left[(1-\pi) + \pi z_{1} z_{2}\right]^{K}, \end{array} $$
(1.4)
$$\begin{array}{@{}rcl@{}} E(X_{i}) &=& K \pi + \lambda_{i} \quad \text{and} \quad \text{Cov}(X_{1}, X_{2}) = K \pi (1- \pi), \quad i= 1, 2. \end{array} $$
(1.5)
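Let \(\mathbf{x}_{1},\ldots,\mathbf{x}_{n} \stackrel{\text{ind}}{\sim} \text{CS}(K,\pi;\lambda_{1},\lambda_{2})\), where \(\mathbf{x}_{j}=(X_{1j},X_{2j})^{\!\top\!}\) for j=1,…,n, and let the observed data be \(Y_{\text{obs}}=\{{\boldsymbol{x}}_{1},\ldots,{\boldsymbol{x}}_{n}\}\), where \({\boldsymbol{x}}_{1},\ldots,{\boldsymbol{x}}_{n}\) are the realizations of \(\mathbf{x}_{1},\ldots,\mathbf{x}_{n}\). Let \(\bar {x}_{1}\) and \(\bar {x}_{2}\) be the sample means of \(X_{1}\) and \(X_{2}\), and let \(m_{11}\) be the sample covariance. Assuming K is known, Papageorgiou and Loukas (1995) obtained the moment estimates of the three parameters as follows: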
$$ \hat{\pi} = \frac{1}{2} \left[ 1 \pm \left(1 - \frac{4 m_{11}}{K}\right)^{1/2} \right], \quad \hat{\lambda}_{1} = \bar{x}_{1} - K \hat{\pi} \quad \text{and} \quad \hat{\lambda}_{2} = \bar{x}_{2} - K \hat{\pi}. $$
(1.6)

Papageorgiou and Loukas (1995) also discussed the method of ratio of frequencies and the method of maximum likelihood.

Although the univariate Charlier series distribution (Ong 1988) and bivariate Charlier series distribution (Karlis 2003, Papageorgiou and Loukas 1995) can be easily generalized to the multivariate version via the method of stochastic representation (SR), the multivariate zero-truncated Charlier series (ZTCS) distribution is not available to date. The first aim of this paper is to propose the multivariate ZTCS distribution by developing its important distributional properties, and providing efficient likelihood-based inference methods via a novel data augmentation in the framework of the expectation–maximization (EM) algorithm. Since the joint marginal distribution of any r-dimensional sub-vector of the multivariate ZTCS random vector of dimension m is an r-dimensional zero-deflated Charlier series (ZDCS) distribution (1≤r<m), it is the second objective of the paper to introduce a new family of multivariate zero-adjusted Charlier series (ZACS) distributions (including the multivariate ZDCS distribution as a special member) with a more flexible correlation structure by accounting for both inflation and deflation at zero. The corresponding distributional properties are explored and the associated maximum likelihood estimation method via EM algorithm is provided for analyzing correlated count data.

The rest of the paper is organized as follows. In Section 2, the multivariate ZTCS distribution is proposed and some important distributional properties are explored. In Section 3, the likelihood-based methods are developed for the multivariate ZTCS distribution. In Sections 4 and 5, we introduce the multivariate ZACS distribution, explore its distributional properties and provide associated likelihood-based methods for the case without covariates. In Section 6, some simulation studies are performed to evaluate the proposed methods. In Section 7, two real data sets are used to illustrate the proposed methods. Section 8 provides some concluding remarks.

Multivariate zero-truncated Charlier series distribution

Let \(X_{00} \sim \text{Binomial}(K,\pi)\), \(\{X_{i0}\}_{i=1}^{m} \stackrel {\text {ind}}{\sim } \text {Poisson}\,(\lambda _{i})\), \(X_{00} {\bot \!\!\!\!\bot } \{X_{10},\ldots,X_{m0}\}\) and define
$$ X_{i} = X_{00} + X_{i0}, \qquad i=1,\dots,m. $$
A discrete non-negative random vector \(\mathbf{x}=(X_{1},\ldots,X_{m})^{\!\top\!}\) is said to follow an m-dimensional CS distribution with parameters \(K \in {\mathbb {N}} = \{1, 2, \ldots, \infty \}\), \(\pi \in [0,1)\) and \(\boldsymbol {\lambda }=(\lambda _{1},\ldots,\lambda _{m})^{\!\top \!} \in {\mathbb {R}}_{+}^{m}\), denoted by \(\mathbf{x} \sim \text{CS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\) or \(\mathbf{x} \sim \text{CS}_{m}(K,\pi;\boldsymbol{\lambda})\), accordingly. The joint pmf of \(\mathbf{x}\) is
$$ \Pr(\mathbf{x} ={\boldsymbol{x}}) = \sum_{k=0}^{\min(K, {\boldsymbol{x}})} {K \choose k}\pi^{k} (1-\pi)^{K-k} \prod_{i=1}^{m} \frac{\lambda_{i}^{x_{i}-k} \mathrm{e}^{-\lambda_{i}}}{(x_{i}-k)!} \;\hat{=}\; Q_{{\boldsymbol{x}}}(K, \pi, \boldsymbol{\lambda}), $$
(2.1)

where \({\boldsymbol{x}}=(x_{1},\ldots,x_{m})^{\!\top\!}\), \(\{x_{i}\}_{i=1}^{m}\) are the corresponding realizations of \(\{X_{i}\}_{i=1}^{m}\), and \(\min (K, {\boldsymbol {x}}) \;\hat {=}\; \min(K,x_{1},\ldots,x_{m})\).

In particular, as \(K \to \infty\) while \(K\pi\) remains finite (say, \(\lambda_{0}\)), the distribution of Binomial(K,π) tends to the distribution of Poisson(\(\lambda_{0}\)), so the above m-dimensional CS distribution approaches the m-dimensional Poisson distribution \(\text{MP}(\lambda_{0},\lambda_{1},\ldots,\lambda_{m})\). Furthermore, if π=0, then \(\Pr(X_{00}=0)=1\) (i.e., \(X_{00}\) follows the degenerate distribution with all mass at zero, denoted by \(X_{00} \sim \text{Degenerate}(0)\)) and \(\lambda_{0}=0\), so the m-dimensional CS distribution becomes the product of m independent Poisson(\(\lambda_{i}\)) distributions.

Motivated by the Type II multivariate zero-truncated Poisson (ZTP) distribution developed recently by Tian et al. (2014), we propose in this paper a new multivariate zero-truncated Charlier series (ZTCS) distribution, whose limiting form reduces to the Type II multivariate ZTP distribution.

Definition 1.

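Let \(\mathbf{x} \sim \text{CS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\). A discrete non-zero random vector \(\mathbf{w}=(W_{1},\ldots,W_{m})^{\!\top\!}\) is said to have the multivariate ZTCS distribution with the parameters (K,π) and \(\boldsymbol{\lambda}=(\lambda_{1},\ldots,\lambda_{m})^{\!\top\!}\), denoted by \(\mathbf{w} \sim \text{ZTCS}_{m}(K,\pi;\boldsymbol{\lambda})\) or \(\mathbf{w} \sim \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\), if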
$$ \mathbf{x} \stackrel{\mathrm{d}}{=} U \, \mathbf{w} =\left\{ \begin{array}{ll}{\mathbf{0}}, &{\text{with probability}\ \psi},\\ {\mathbf{w}}, & {\text{with probability}\ 1-\psi,} \end{array}\right. $$
(2.2)

where \(U \sim \text{Bernoulli}(1-\psi)\) with \(\psi =(1-\pi)^{K} \mathrm {e}^{-\lambda _{+}}, \lambda _{+} = \sum _{i=1}^{m} \lambda _{i} \;\hat {=}\; \|\boldsymbol {\lambda }\|_{_{1}}\), and \(U {\bot \!\!\!\!\bot } \mathbf{w}\).

Let \(\mathbf{w} \sim \text{ZTCS}_{m}(K,\pi;\boldsymbol{\lambda})\). Then we have \(\Pr(\mathbf{w}=\mathbf{0})=0\) and
$$ \mathbf{w} \stackrel{\mathrm{d}}{=} \mathbf{x} | (\mathbf{x} \ne \mathbf{0}), $$
(2.3)

where \(\mathbf{x}\) is specified in Definition 1. The SR (2.3) can be used to generate the ZTCS random vector \(\mathbf{w}\) via the generation of the random vector \(\mathbf{x}\) from the multivariate CS distribution, while the SR (2.2) is useful in deriving important distributional properties in the following subsections and in developing an EM algorithm in Section 3.1. Moreover, besides coming from the missing zero vector, the correlation between any two components of \(\mathbf{w}\) may come from the common random variable \(X_{00} \sim \text{Binomial}(K,\pi)\).
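The SR (2.3) suggests a simple rejection sampler: draw \(\mathbf{x}\) from the multivariate CS distribution and keep the first draw that is not the zero vector. The sketch below uses only the Python standard library. The helper names (`rpois`, `rbinom`, `rcs`, `rztcs`), the parameter values and the seed are assumptions for illustration, and the Poisson generator is Knuth's multiplication method, which is adequate only for small \(\lambda_{i}\).

```python
import random
from math import exp

def rpois(lam, rng):
    """Poisson(lam) draw via Knuth's multiplication method (small lam only)."""
    limit, k, p = exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def rbinom(K, pi, rng):
    """Binomial(K, pi) draw as a sum of K Bernoulli(pi) indicators."""
    return sum(rng.random() < pi for _ in range(K))

def rcs(K, pi, lams, rng):
    """One draw of x = (X_00 + X_10, ..., X_00 + X_m0) from CS_m(K, pi; lams)."""
    x00 = rbinom(K, pi, rng)
    return [x00 + rpois(lam, rng) for lam in lams]

def rztcs(K, pi, lams, rng):
    """One draw of w = x | (x != 0), as in (2.3): reject zero vectors."""
    while True:
        x = rcs(K, pi, lams, rng)
        if any(x):
            return x

# Illustrative parameter values and seed (assumed for this example only).
K, pi, lams = 2, 0.3, (0.5, 0.8)
rng = random.Random(2015)
draws = [rztcs(K, pi, lams, rng) for _ in range(20000)]
mean_w1 = sum(w[0] for w in draws) / len(draws)
print(mean_w1)
```

The acceptance probability of each proposal is \(1-\psi\), so the sampler becomes slow when \(\psi=(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}\) is close to one.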

2.1 Joint probability mass function and mixed moments

From the SR (2.2), the joint pmf of \(\mathbf{w} \sim \text{ZTCS}_{m}(K,\pi;\boldsymbol{\lambda})\) is
$$\begin{array}{@{}rcl@{}} f({\boldsymbol{w}}; K, \pi, \boldsymbol{\lambda}) &=& \Pr(\mathbf{w}={\boldsymbol{w}}) \mathop{=}\limits^{(2.2)} \frac{\Pr(\mathbf{x} ={\boldsymbol{w}})} {\Pr(U=1)} \\ [2mm] &=& \frac{1}{1- (1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \sum_{k=0}^{\min(K, {\boldsymbol{w}})} {K \choose k}\pi^{k}(1-\pi)^{K-k}\prod_{i=1}^{m} \frac{\lambda_{i}^{w_{i}-k} \mathrm{e}^{-\lambda_{i}}}{(w_{i}-k)!}, \qquad \end{array} $$
(2.4)
where \(\|{\boldsymbol {w}}\|_{_{1}} \ne 0\). From (2.2), it is easy to show that
$$ \left\{ \begin{array}{lll} E(\mathbf{w}) &=& \frac{\boldsymbol{\lambda} + K \pi \cdot \textbf{1}\!\!\!\mathbf{1}}{1-\psi}, \\ [1mm] E(\mathbf{w}\mathbf{w}^{\!\top\!}) &=& \frac{\text{diag}(\boldsymbol{\lambda}) + \boldsymbol{\lambda} \boldsymbol{\lambda}^{\!\top\!} + K \pi (\boldsymbol{\lambda} \mathbf{1}\!\!\!\textbf{1}^{\!\top\!} + \textbf{1}\!\!\!\mathbf{1}\boldsymbol{\lambda}^{\!\top\!}) + K\pi(1- \pi + K\pi) \cdot \mathbf{1}\!\!\!\mathbf{1} \textbf{1}\!\!\!\textbf{1}^{\!\top\!} }{1-\psi}, \\ [1mm] \text{Var}(\mathbf{w}) &=& \frac{1}{1-\psi} \bigg\{\text{diag}(\boldsymbol{\lambda}) + K\pi(1 - \pi) \cdot \mathbf{1}\!\!\!\mathbf{1} \mathbf{1}\!\!\!\mathbf{1}^{\!\top\!} \\ [2mm] & & - \; \frac{\psi}{1-\psi}\left[\boldsymbol{\lambda} \boldsymbol{\lambda}^{\!\top\!} + K \pi \left(\boldsymbol{\lambda} \mathbf{1}\!\!\!\mathbf{1}^{\!\top\!} + \mathbf{1}\!\!\!\mathbf{1}\boldsymbol{\lambda}^{\!\top\!}\right) + K^{2}\pi^{2} \mathbf{1}\!\!\!\mathbf{1}\mathbf{1}\!\!\!\mathbf{1}^{\!\top\!} \, \right] \bigg\}, \end{array} \right. $$
(2.5)
where \(\mathbf{1} = \mathbf{1}_{m} = (1,\ldots,1)^{\!\top\!}\). Thus we have
$$\begin{array}{@{}rcl@{}} & &\!\!\!\!\!\!\!\!\!\!\! \text{Corr}(W_{i}, W_{j}) = \\ [2mm] & & \frac{K\pi(1-\pi) - \frac{\psi}{1-\psi}(\lambda_{i}+K\pi)(\lambda_{j}+K\pi)}{\sqrt{\left[\lambda_{i} + K\pi(1-\pi) - \frac{\psi}{1-\psi}(\lambda_{i} + K\pi)^{2} \right] \left[\lambda_{j} + K\pi(1-\pi) - \frac{\psi}{1-\psi}(\lambda_{j}+K\pi)^{2} \right]}}, \qquad \end{array} $$
(2.6)
for \(i \ne j\). In particular, when π=0, (2.6) becomes
$$\text{Corr}(W_{i}, W_{j}) = - \sqrt{\frac{\lambda_{i} \lambda_{j}}{(\mathrm{e}^{\lambda_{+}} - 1 -\lambda_{i}) (\mathrm{e}^{\lambda_{+}} - 1 -\lambda_{j})}}, \qquad i \ne j. $$
Setting \(\lambda_{i}=\lambda_{j}=\lambda\) in (2.6), we obtain
$$\text{Corr}(W_{i}, W_{j}) = \frac{K\pi(1-\pi) - \frac{\psi}{1-\psi}(\lambda+K\pi)^{2}}{\lambda + K\pi(1-\pi) - \frac{\psi}{1-\psi}(\lambda + K\pi)^{2} }, \qquad i \ne j. $$
For any \(r_{1},\ldots,r_{m} \ge 0\), the mixed moments of \(\mathbf{w}\) are given by
$$ E\left(\prod_{i=1}^{m} W_{i}^{r_{i}}\right) = (1-\psi)^{-1}E\left(\prod_{i=1}^{m} X_{i}^{r_{i}}\right) = (1-\psi)^{-1}E\left[\prod_{i=1}^{m} (X_{00} + X_{i0})^{r_{i}}\right]. $$
(2.7)
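The joint pmf (2.4) and the closed-form moments can be checked against each other numerically. The sketch below enumerates the pmf on a truncated grid for m=2 and compares the resulting mean and correlation with (2.5) and (2.6). The function names, parameter values and grid size are illustrative assumptions.

```python
from itertools import product
from math import comb, exp, factorial, sqrt

def cs_joint_pmf(x, K, pi, lams):
    """Q_x(K, pi, lambda): joint pmf of the m-dimensional CS distribution (2.1)."""
    total = 0.0
    for k in range(min(K, *x) + 1):
        term = comb(K, k) * pi**k * (1 - pi)**(K - k)
        for xi, lam in zip(x, lams):
            term *= lam**(xi - k) * exp(-lam) / factorial(xi - k)
        total += term
    return total

def ztcs_pmf(w, K, pi, lams):
    """Joint pmf (2.4) of the ZTCS distribution; zero at the origin."""
    if not any(w):
        return 0.0
    psi = (1 - pi)**K * exp(-sum(lams))
    return cs_joint_pmf(w, K, pi, lams) / (1 - psi)

def ztcs_corr(i, j, K, pi, lams):
    """Corr(W_i, W_j) for i != j, as given in (2.6)."""
    psi = (1 - pi)**K * exp(-sum(lams))
    r = psi / (1 - psi)
    a, b = lams[i] + K * pi, lams[j] + K * pi
    c = K * pi * (1 - pi)
    num = c - r * a * b
    den = sqrt((lams[i] + c - r * a**2) * (lams[j] + c - r * b**2))
    return num / den

# Illustrative parameter values (assumed for this example only).
K, pi, lams = 2, 0.3, (0.5, 0.8)
psi = (1 - pi)**K * exp(-sum(lams))

# Truncated enumeration of the support; 40 points per axis is ample here.
grid = list(product(range(40), repeat=2))
p = [ztcs_pmf(w, K, pi, lams) for w in grid]
total = sum(p)
m1 = sum(w[0] * q for w, q in zip(grid, p))
m2 = sum(w[1] * q for w, q in zip(grid, p))
v1 = sum((w[0] - m1)**2 * q for w, q in zip(grid, p))
v2 = sum((w[1] - m2)**2 * q for w, q in zip(grid, p))
cov = sum((w[0] - m1) * (w[1] - m2) * q for w, q in zip(grid, p))
corr_enum = cov / sqrt(v1 * v2)

print(total)                                  # close to 1
print(m1, (lams[0] + K * pi) / (1 - psi))     # E(W_1) from (2.5)
print(corr_enum, ztcs_corr(0, 1, K, pi, lams))
```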

2.2 Moment generating function

Using the identity of E(ξ)=E[ E(ξ|U)], the moment generating function (mgf) of x is
$$\begin{array}{@{}rcl@{}} M_{\mathbf{x}}(\boldsymbol{t}) &=& E[\!\exp(\boldsymbol{t}^{\!\top\!} \mathbf{x})] = E[\!\exp(U \cdot \boldsymbol{ t}^{\!\top\!} \mathbf{w})] = E\left\{ E[\!\exp(U \boldsymbol{t}^{\!\top\!} \mathbf{w}) | U] \right\} \\ [2mm] &=& E\left[M_{\mathbf{w}}(U \boldsymbol{t})\right] = \psi M_{\mathbf{w}}(\mathbf{0}) + (1-\psi) M_{\mathbf{w}}(\boldsymbol{t}) = \psi + (1-\psi)M_{\mathbf{w}}(\boldsymbol{t}). \end{array} $$
Thus the mgf of \(\mathbf{w} \sim \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\) is given by
$$\begin{array}{@{}rcl@{}} M_{\mathbf{w}}(\boldsymbol{t}) &=& \frac{M_{\mathbf{x}}(\boldsymbol{t}) - \psi}{1-\psi} \\ &=& \frac{M_{X_{00}}(t_{+})\prod_{i=1}^{m} M_{X_{i0}}(t_{i}) - \psi}{1-\psi} \\ &=& \frac{\left(\pi \mathrm{e}^{t_{+}} + 1 -\pi\right)^{K} \exp\left(\sum_{i=1}^{m} \lambda_{i} \mathrm{e}^{t_{i}}- \lambda_{+}\right) - (1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}}, \end{array} $$

where \(t_{+} = \sum _{i=1}^{m} t_{i}\).

2.3 Marginal distributions

2.3.1 Marginal distribution for each random component

Let \(\mathbf{w}=(W_{1},\ldots,W_{m})^{\!\top\!} \sim \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\). We first derive the marginal distribution of \(W_{i}\) with realization \(w_{i}\) for i=1,…,m. If \(w_{i}>0\), then
$$\begin{array}{@{}rcl@{}} \Pr(W_{i} = w_{i}) &=& \sum_{w_{1}=0}^{\infty} \cdots \sum_{w_{i-1}=0}^{\infty} \sum_{w_{i+1}=0}^{\infty} \cdots \sum_{w_{m}=0}^{\infty} \Pr(\mathbf{w} = {\boldsymbol{w}}) \\ &=&\frac{1}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \sum_{w_{1}=0}^{\infty} \cdots \sum_{w_{i-1}=0}^{\infty} \sum_{w_{i+1}=0}^{\infty} \cdots \sum_{w_{m}=0}^{\infty} \\ & & \times \sum_{k=0}^{\min\{K,w_{i}\}}{K \choose k} \pi^{k} (1 - \pi)^{K-k} \prod_{j=1}^{m} \frac{\lambda_{j}^{w_{j}-k}\mathrm{e}^{-\lambda_{j}}}{(w_{j}-k)!} \cdot I(w_{j}-k \ge 0) \\ &=&\frac{1}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \sum_{k=0}^{\min\{K,w_{i}\}}{K \choose k} \pi^{k} (1 - \pi)^{K-k} \frac{\lambda_{i}^{w_{i}-k} \mathrm{e}^{-\lambda_{i}}}{(w_{i}-k)!} \\ & & \times \left[\sum_{w_{1}=0}^{\infty} \cdots \sum_{w_{i-1}=0}^{\infty} \sum_{w_{i+1}=0}^{\infty} \cdots \sum_{w_{m}=0}^{\infty} \prod_{j=1, j \neq i}^{m} \frac{\lambda_{j}^{w_{j}-k}\mathrm{e}^{-\lambda_{j}}}{(w_{j}-k)!} \cdot I(w_{j}-k \ge 0) \right] \\ &=& \frac{1}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \sum_{k=0}^{\min\{K,w_{i}\}}{K \choose k}\pi^{k}(1-\pi)^{K-k}\frac{\lambda_{i}^{w_{i}-k} \mathrm{e}^{-\lambda_{i}}}{(w_{i}-k)!} \end{array} $$
(2.8)
$$\begin{array}{@{}rcl@{}} &=& \frac{1-\varphi_{i}}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{i}}} \sum_{k=0}^{\min\{K,w_{i}\}} {K \choose k} \pi^{k}(1-\pi)^{K-k} \frac{\lambda_{i}^{w_{i}-k}\mathrm{e}^{-\lambda_{i}} }{(w_{i}-k)!}, \end{array} $$
(2.9)
where
$$ 1-\varphi_{i} = \frac{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{i}}}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}}. $$
(2.10)
Hence,
$$\begin{array}{@{}rcl@{}} \Pr(W_{i} = 0) &=& 1 - \sum_{w_{i}=1}^{\infty}\Pr(W_{i} = w_{i}) \\ [2mm] & \stackrel{(2.8)}{=} & 1 - \frac{1}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \sum_{w_{i}=1}^{\infty} \sum_{k=0}^{\min\{K,w_{i}\}}{K \choose k} \pi^{k}(1-\pi)^{K-k} \frac{\lambda_{i}^{w_{i}-k} \mathrm{e}^{-\lambda_{i}} }{(w_{i}-k)!} \\ [2mm] &=& 1 - \frac{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{i}}}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \stackrel{(2.10)}{=} \varphi_{i} \\ [2mm] &=& \frac{(1-\pi)^{K}(\mathrm{e}^{-\lambda_{i}} - \mathrm{e}^{-\lambda_{+}})}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \in \left(0, (1-\pi)^{K} \mathrm{e}^{-\lambda_{i}}\right) \subset (0, 1). \end{array} $$
(2.11)
By combining (2.11) with (2.9) and noting that a ZDCS distribution is a special case of a ZACS distribution (4.2), we obtain
$$ W_{i} \sim \text{ZDCS}(\varphi_{i}; K, \pi, \lambda_{i}). $$
(2.12)

2.3.2 Marginal distribution for an arbitrary random sub-vector

Second, we consider the marginal distribution of an arbitrary random sub-vector. This requires the multivariate zero-adjusted Charlier series distribution, which is defined in Definition 2 in Section 4. We now consider the marginal distributions of \(\mathbf{w}^{(1)}\) and \(\mathbf{w}^{(2)}\), where
$$\mathbf{w}^{(1)} = \left(\!\!\begin{array}{c} W_{1} \\ \! \vdots \\ W_{r} \end{array} \!\!\right), \quad \mathbf{w}^{(2)} = \left(\!\!\begin{array}{c} W_{r+1} \\ \!\! \vdots \\ W_{m} \end{array} \!\!\right) \quad \text{and} \quad \mathbf{w} = \left(\!\!\begin{array}{c} \mathbf{w}^{(1)} \\ [2mm] \mathbf{w}^{(2)} \end{array} \!\!\right). $$
Using the multivariate zero-adjusted Charlier series distribution introduced in Section 4, it can be shown that
$$ \mathbf{w}^{(1)} \sim \text{ZDCS}(\varphi^{(1)}; K,\pi, \lambda_{1}, \ldots, \lambda_{r}) \quad \text{and} \quad \mathbf{w}^{(2)} \sim \text{ZDCS}(\varphi^{(2)}; K,\pi, \lambda_{r+1}, \ldots, \lambda_{m}), $$
(2.13)
where
$$ \varphi^{(i)} = \frac{(1 - \pi)^{K}\left(\mathrm{e}^{-\lambda_{+}^{(i)}} - \mathrm{e}^{ - \lambda_{+}}\right)}{1-(1 - \pi)^{K}\mathrm{e}^{- \lambda_{+}}} \in \Big(0, \; (1 - \pi)^{K}\mathrm{e}^{-\lambda_{+}^{(i)}} \Big) \subset (0, 1), \quad i=1, 2, $$
(2.14)

\(\lambda _{+}^{(1)} = \sum _{i=1}^{r} \lambda _{i}\) and \(\lambda _{+}^{(2)} = \sum _{i=r+1}^{m} \lambda _{i}\).

In fact, for any positive integers \(i_{1},\ldots,i_{r}\) satisfying \(1 \le i_{1} < \cdots < i_{r} \le m\), we have
$$ \left(\!\!\begin{array}{c} W_{i_{1}} \\ \!\! \vdots \\ W_{i_{r}} \end{array} \!\!\right) \sim \text{ZDCS}(\varphi^{*}; K, \pi,\lambda_{i_{1}}, \ldots, \lambda_{i_{r}}), $$
(2.15)
where
$$ \varphi^{*} = \frac{(1 - \pi)^{K}\left[\mathrm{e}^{-(\lambda_{i_{1}}+\cdots+\lambda_{i_{r}})} - \mathrm{e}^{- \lambda_{+}}\right]}{1-(1 - \pi)^{K}\mathrm{e}^{- \lambda_{+}}} \in \Big(0, \; (1 - \pi)^{K}\mathrm{e}^{-(\lambda_{i_{1}}+\cdots+\lambda_{i_{r}})} \Big) \subset (0, 1). $$
(2.16)

2.4 Conditional distributions

2.4.1 Conditional distribution of w (1)|w (2)

From (2.4), (2.13) and (4.4), the conditional distribution of w (1)|w (2) is given by
$$\begin{array}{@{}rcl@{}} & & \Pr(\mathbf{w}^{(1)} = {\boldsymbol{w}}^{(1)} | \mathbf{w}^{(2)} = {\boldsymbol{w}}^{(2)}) = \frac{f({\boldsymbol{w}}; K, \pi, \boldsymbol{\lambda}) }{\Pr(\mathbf{w}^{(2)} = {\boldsymbol{w}}^{(2)})} \\ [2mm] &= & \frac{ \frac{1}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \cdot Q_{{\boldsymbol{w}}}(K, \pi, \boldsymbol{ \lambda})} {\varphi^{(2)} I({\boldsymbol{w}}^{(2)}=\mathbf{0}) + \left[ \frac{1-\varphi^{(2)}}{1-(1-\pi)^{K} \mathrm{e}^{-\lambda_{+}^{(2)}}} \cdot Q_{{\boldsymbol{w}}^{(2)}}(K, \pi, \boldsymbol{\lambda}^{(2)})\right] I({\boldsymbol{w}}^{(2)} \ne \mathbf{0})}, \end{array} $$
(2.17)
where \({\boldsymbol{w}}^{(2)}=(w_{r+1},\ldots,w_{m})^{\!\top\!}\), \(\boldsymbol{\lambda}^{(2)}=(\lambda_{r+1},\ldots,\lambda_{m})^{\!\top\!}\) and
$$Q_{{\boldsymbol{w}}}(K, \pi, \boldsymbol{\lambda}) = \sum_{k=0}^{\min\{K,{\boldsymbol{w}}\}} {K \choose k} \pi^{k} (1-\pi)^{K-k} \prod_{j=1}^{m} \frac{\lambda_{j}^{w_{j}-k} \mathrm{e}^{-\lambda_{j}}}{(w_{j}-k)!}, $$
$$Q_{{\boldsymbol{w}}^{(2)}}\left(K, \pi, \boldsymbol{\lambda}^{(2)}\right) = \sum_{l=0}^{\min\{K,{\boldsymbol{w}}^{(2)}\}} {K \choose l} \pi^{l} (1- \pi)^{K-l} \prod_{p=r+1}^{m} \frac{\lambda_{p}^{w_{p}-l} \mathrm{e}^{-\lambda_{p}}}{(w_{p}-l)!}. $$
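We first consider Case I: \({\boldsymbol{w}}^{(2)} \ne \mathbf{0}\). Under Case I, it is possible that \({\boldsymbol{w}}^{(1)} = \mathbf{0}\) or \({\boldsymbol{w}}^{(1)} \ne \mathbf{0}\). From (2.17), it is easy to obtain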
$$ \Pr\left(\mathbf{w}^{(1)}={\boldsymbol{w}}^{(1)} | \mathbf{w}^{(2)}={\boldsymbol{w}}^{(2)}\right) = \frac{ \mathrm{e}^{-\lambda_{+}^{(1)}} \sum\limits_{k=0}^{\min\{K,{\boldsymbol{w}}\}}{K \choose k}\pi^{k}(1-\pi)^{K-k}\prod\limits_{j=1}^{m}\frac{\lambda_{j}^{w_{j}-k}}{(w_{j}-k)!}}{ \sum\limits_{l=0}^{\min\{K,{\boldsymbol{w}}^{(2)}\}}{K \choose l}\pi^{l}(1-\pi)^{K-l}\prod\limits_{p=r+1}^{m}\frac{\lambda_{p}^{w_{p}-l}}{(w_{p}-l)!}}. $$
(2.18)
Case II: \({\boldsymbol{w}}^{(2)} = \mathbf{0}\). Under Case II, it is obvious that \({\boldsymbol{w}}^{(1)} \ne \mathbf{0}\) and that the shared binomial variable equals zero. Thus we have
$$\Pr\left(\mathbf{w}^{(1)} = {\boldsymbol{w}}^{(1)} | \mathbf{w}^{(2)} = \mathbf{0} \right) = \frac{1}{1-\mathrm{e}^{-\lambda_{+}^{(1)}}}\prod_{i=1}^{r} \frac{\lambda_{i}^{w_{i}}\mathrm{e}^{-\lambda_{i}}}{w_{i}!}. $$
This implies
$$ \mathbf{w}^{(1)} | \left(\mathbf{w}^{(2)} = \mathbf{0}\right) \sim \text{ZTP}^{\text{(I)}}(\lambda_{1}, \ldots, \lambda_{r}). $$
(2.19)

2.4.2 Conditional distribution of \(X_{0}^{*}|(\textbf {w},U)\)

The stochastic representation (2.2) can be rewritten as
$$(X_{1}, \ldots, X_{m})^{\!\top\!} = (X_{0}^{*} + X_{1}^{*}, \ldots, X_{0}^{*} + X_{m}^{*})^{\!\top\!} \stackrel{\mathrm{d}}{=} U\mathbf{w}, $$
where \(X_{0}^{*} \sim \text {Binomial}(K, \pi)\) and \(\{X_{i}^{*}\}_{i=1}^{m} \stackrel {\text {ind}}{\sim } \text {Poisson}(\lambda _{i})\). To obtain the conditional distribution of \(X_{0}^{*} | (\mathbf {w}, U)\), we consider two cases: U=1 and U=0. When U=1, the conditional distribution of \(X_{0}^{*} | (\mathbf {w}, U)\) is given by
$$\begin{array}{@{}rcl@{}} \Pr(X_{0}^{*}=l|\mathbf{w}={\boldsymbol{w}},U=1) &=& \frac{\Pr(X_{0}^{*}=l,X_{1}^{*}=w_{1}-l,\ldots,X_{m}^{*}=w_{m}-l)} {\Pr(X_{1} =w_{1}, \ldots, X_{m} = w_{m})} \\ [4mm] &=& \frac{ {K \choose l} \pi^{l} (1-\pi)^{K - l} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{w_{i}-l}} {(w_{i} - l)!}}{ \sum\limits_{k=0}^{\min(K, {\boldsymbol{w}})} {K \choose k} \pi^{k} (1-\pi)^{K-k} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{w_{i}-k}}{(w_{i}-k)!}} \\ [2mm] &\;\hat{=}\; & q_{l}({\boldsymbol{w}}, K,\pi, \boldsymbol{\lambda}), \end{array} $$
(2.20)
for \(l=0,1,\ldots,\min(K,{\boldsymbol{w}})\), which implies
$$ X_{0}^{*}|(\mathbf{w}={\boldsymbol{w}}, U=1) \sim \text{Finite}(l, q_{l}({\boldsymbol{w}}, K, \pi, \boldsymbol{\lambda}); \ l=0,1, \ldots, \min(K,{\boldsymbol{w}})). $$
(2.21)
When U=0, we obtain \(\Pr (X_{0}^{*}=0|\mathbf {w}={\boldsymbol {w}},U=0)=1\), i.e.,
$$ X_{0}^{*} | (\mathbf{w}={\boldsymbol{w}}, U=0) \sim \text{Degenerate}(0). $$
(2.22)
Hence, for any l, we have
$$ \Pr(X_{0}^{*}=l|\mathbf{w}={\boldsymbol{w}},U=0)=I(l=0). $$
(2.23)
Thus, we have the conditional distribution of \(X_{0}^{*}|(\mathbf {w},U)\), which is given by the following:
$$\begin{array}{@{}rcl@{}} X_{0}^{*} | (\mathbf{w}, U) \sim \left\{ \begin{array}{ll} \text{Finite}(l, q_{l}({\boldsymbol{w}}, K,\pi, \boldsymbol{\lambda}); \ l=0,1, \ldots, \min(K,{\boldsymbol{w}})), & \text{if}~~ U = 1, \\ \text{Degenerate}(0), & \text{if}~~ U = 0, \end{array} \right. \end{array} $$
(2.24)

where \(q_{l}({\boldsymbol{w}},K,\pi,\boldsymbol{\lambda})\) is defined by (2.20).

2.4.3 Conditional distribution of \(X_{0}^{*}|\textbf {w}\)

By using (2.24), the conditional distribution of \(X_{0}^{*} |\mathbf {w} \) is
$$\begin{array}{@{}rcl@{}} \Pr(X_{0}^{*}=l | \mathbf{w}={\boldsymbol{w}}) &=& \sum_{u=0}^{1} \Pr(X_{0}^{*} = l, U=u|\mathbf{w}={\boldsymbol{w}}) \\ [2mm] &=& \sum_{u=0}^{1} \Pr(U=u | \mathbf{w}={\boldsymbol{w}}) \cdot \Pr(X_{0}^{*} = l |\mathbf{w}={\boldsymbol{w}}, U=u) \\ [2mm] &=& \Pr(U=0) \cdot \Pr(X_{0}^{*} = l |\mathbf{w}={\boldsymbol{w}}, U=0) \\ [2mm] & & + \; \Pr(U=1) \cdot \Pr(X_{0}^{*} = l |\mathbf{w}={\boldsymbol{w}}, U=1) \\ [2mm] &\stackrel{(2.24)}{=} & \mathrm{e}^{- \lambda_{+}}(1 - \pi)^{K} \cdot I(l=0) + [\!1-\mathrm{e}^{- \lambda_{+}}(1 - \pi)^{K}] \cdot\, q_{l}({\boldsymbol{w}}, K, \pi, \boldsymbol{\lambda}) \\ [2mm] &\;\hat{=}\; & p_{l}({\boldsymbol{w}}, K,\pi, \boldsymbol{\lambda}), \end{array} $$
(2.25)
for \(l=0,1,\ldots,\min(K,{\boldsymbol{w}})\), where \(q_{l}({\boldsymbol{w}},K,\pi,\boldsymbol{\lambda})\) is defined by (2.20). Thus,
$$ X_{0}^{*} |(\mathbf{w}={\boldsymbol{w}}) \sim \text{Finite}(l, p_{l}({\boldsymbol{w}}, K,\pi, \boldsymbol{\lambda}); \ l=0,1, \ldots, \min(K,{\boldsymbol{w}})). $$
(2.26)
In particular, when \(\min(K,{\boldsymbol{w}})=0\), we have \(X_{0}^{*} | (\mathbf {w}={\boldsymbol {w}}) \sim \text {Degenerate}(0)\). In general, the conditional expectation of \(X_{0}^{*} |\mathbf {w}\) is given by
$$\begin{array}{@{}rcl@{}} E(X_{0}^{*}| \mathbf{w}={\boldsymbol{w}}) &=& \left[1-\mathrm{e}^{- \lambda_{+}}(1 - \pi)^{K}\right] \\ [2mm] & & \times \; \frac{ \sum\limits_{k=1}^{\min(K,{\boldsymbol{w}})} k {K \choose k} \pi^{k} (1 - \pi)^{K-k} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{w_{i}-k}} {(w_{i} - k)!}}{ \sum\limits_{k=0}^{\min(K, {\boldsymbol{w}})} {K \choose k} \pi^{k} (1 - \pi)^{K - k} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{w_{i}-k}}{(w_{i}-k)!}} \cdot I(\min({\boldsymbol{w}}) \ge 1). \qquad \end{array} $$
(2.27)
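The conditional expectation (2.27) is the quantity that the E-step of Section 3.1 evaluates for each observation. A minimal sketch of its computation follows; the function name `cond_mean_x0` and the parameter values are illustrative assumptions. The factor \(\mathrm{e}^{-\lambda_{+}}\) is omitted from both sums because it cancels in the ratio, as in (2.27).

```python
from math import comb, exp, factorial

def cond_mean_x0(w, K, pi, lams):
    """E(X_0^* | w = w) as given in (2.27)."""
    upper = min(K, *w)
    if upper == 0:
        # Indicator I(min(w) >= 1) in (2.27): no shared count is possible.
        return 0.0
    psi = (1 - pi)**K * exp(-sum(lams))
    num = den = 0.0
    for k in range(upper + 1):
        term = comb(K, k) * pi**k * (1 - pi)**(K - k)
        for wi, lam in zip(w, lams):
            term *= lam**(wi - k) / factorial(wi - k)
        num += k * term
        den += term
    return (1 - psi) * num / den

# Illustrative parameter values (assumed for this example only).
K, pi, lams = 1, 0.3, (0.5, 0.8)
print(cond_mean_x0((0, 3), K, pi, lams))  # a zero component forces the value 0
print(cond_mean_x0((1, 1), K, pi, lams))
```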

Likelihood-based methods for the multivariate ZTCS distribution

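Suppose that \(\mathbf{w}_{j} \stackrel{\text{ind}}{\sim} \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\), where \(\mathbf{w}_{j}=(W_{1j},\ldots,W_{mj})^{\!\top\!}\) for j=1,…,n. Let \({\boldsymbol{w}}_{j}=(w_{1j},\ldots,w_{mj})^{\!\top\!}\) denote the realization of the random vector \(\mathbf{w}_{j}\), and \(Y_{\text{obs}}= \{{\boldsymbol {w}}_{j}\}_{j=1}^{n}\) be the observed data. We consider K as a known positive integer. Then, the observed-data likelihood function for \((\pi,\boldsymbol{\lambda})\) is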
$$L(\pi,{\boldsymbol{\lambda}}|Y_{\text{obs}}) = \prod_{j=1}^{n} \frac{\mathrm{e}^{-\lambda_{+}}}{1 - (1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \sum_{k=0}^{\min (K,{\boldsymbol{w}}_{j})} {K \choose k}\pi^{k}(1-\pi)^{K-k} \prod_{i=1}^{m} \frac{\lambda_{i}^{w_{ij}-k}}{(w_{ij}-k)!}, $$
so that the log-likelihood function is
$$\begin{array}{@{}rcl@{}} \ell(\pi, \boldsymbol{\lambda}|Y_{\text{obs}}) &=& - n \log\left[\mathrm{e}^{\lambda_{+}} -(1-\pi)^{K}\right] \\ [2mm] & & + \; \sum_{j=1}^{n} \log \left[ \sum_{k=0}^{\min (K,{\boldsymbol{w}}_{j})} {K \choose k} \pi^{k} (1-\pi)^{K-k} \prod_{i=1}^{m} \frac{\lambda_{i}^{w_{ij}-k}}{(w_{ij}-k)!} \right]. \end{array} $$
(3.1)

3.1 MLEs via the EM algorithm

The SR (2.2) can motivate a novel EM algorithm, where some latent variables are independent of the observed variables. For each \({\boldsymbol{w}}_{j}=(w_{1j},\ldots,w_{mj})^{\!\top\!}\), we introduce latent variables \(U_{j} \stackrel{\text{iid}}{\sim} \text{Bernoulli}(1-\psi)\) with \(\psi = (1-\pi)^{K}\mathrm {e}^{- \lambda _{+}}, X_{0j}^{*} \stackrel {\text {iid}}{\sim } \text {Binomial}(K,\pi), X_{\textit {ij}}^{*} \stackrel {\text {iid}}{\sim } \text {Poisson}(\lambda _{i})\) for i=1,…,m, and \(X_{0j}^{*} {\bot \!\!\!\!\bot } X_{\textit {ij}}^{*}\), such that
$$\left(x_{0j}^{*}+ x_{1j}^{*}, \ldots, x_{0j}^{*}+ x_{mj}^{*}\right)^{\!\top\!} = u_{j} {\boldsymbol{w}}_{j}, $$
where \(u_{j}\) and \(x_{\textit {ij}}^{*}\) denote the realizations of \(U_{j}\) and \(X_{\textit {ij}}^{*}\), respectively. We denote the latent/missing data by \(Y_{\text {mis}}=\left \{u_{j}, x_{0j}^{*}, x_{1j}^{*}, \ldots, x_{\textit {mj}}^{*}\right \}_{j=1}^{n}\), so that the complete data are
$$\begin{array}{@{}rcl@{}} Y_{\text{com}} &=& Y_{\text{obs}} \cup Y_{\text{mis}} = \left\{{\boldsymbol{w}}_{j}, u_{j}, x_{0j}^{*}, x_{1j}^{*}, \ldots, x_{mj}^{*}\right\}_{j=1}^{n} \\ [2mm] &=& \left\{x_{0j}^{*}, x_{1j}^{*}, \ldots, x_{mj}^{*}\right\}_{j=1}^{n} = \left\{x_{0j}^{*}, u_{j}, {\boldsymbol{w}}_{j}\right\}_{j=1}^{n}, \end{array} $$
where \(x_{\textit {ij}}^{*} = u_{j} w_{\textit {ij}} - x_{0j}^{*}\) for j=1,…,n and i=1,…,m. Thus, the complete-data likelihood function is given by
$$\begin{array}{@{}rcl@{}} L(\pi, {\boldsymbol{\lambda}}|Y_{\text{com}}) &=& \prod_{j=1}^{n} \left[ {K \choose x_{0j}^{*}} \pi^{x_{0j}^{*}} (1 - \pi)^{K - x_{0j}^{*}} \prod_{i=1}^{m} \frac{\lambda_{i}^{x_{ij}^{*} }\mathrm{e}^{-\lambda_{i}}}{x_{ij}^{*}!} \right] \\ &=& \prod_{j=1}^{n} \left[ {K \choose x_{0j}^{*}} \pi^{x_{0j}^{*}} (1 - \pi)^{K - x_{0j}^{*}} \prod_{i=1}^{m} \frac{\lambda_{i}^{u_{j} w_{ij}-x_{0j}^{*}} \mathrm{e}^{-\lambda_{i}}}{\left(u_{j} w_{ij}-x_{0j}^{*}\right)!} \right] \\ &\propto& \pi^{n\bar{x}_{0}^{*}}(1 - \pi)^{nK - n\bar{x}_{0}^{*}}\prod_{i=1}^{m} \lambda_{i}^{\sum_{j=1}^{n} u_{j} w_{ij} - n\bar{x}_{0}^{*}} \mathrm{e}^{- n\lambda_{i}}, \end{array} $$
(3.2)
where \(\bar {x}_{0}^{*} = (1/n)\sum _{j=1}^{n} x_{0j}^{*}\). The complete-data log-likelihood function is
$$\ell(\pi, \boldsymbol{\lambda}|Y_{\text{com}}) = n\bar{x}_{0}^{*}\log\pi + \left(nK - n \bar{x}_{0}^{*}\right)\log(1 - \pi) + \sum_{i=1}^{m} \left[\!\left(\sum_{j=1}^{n} u_{j} w_{ij} - n\bar{x}_{0}^{*} \right) \log\lambda_{i} - n\lambda_{i} \right]. $$
The M-step is to calculate the complete-data maximum likelihood estimates (MLEs):
$$ \hat{\pi} = \frac{\bar{x}_{0}^{*}}{K} \quad \text{and} \quad \hat{\lambda}_{i} = \frac{\sum_{j=1}^{n} u_{j} w_{ij} }{n} - K \hat{\pi}, \qquad i=1,\ldots,m, $$
(3.3)
and the E-step is to replace \(\{u_{j}\}_{j=1}^{n}\) and \(\left \{ x_{0j}^{*} \right \}_{j=1}^{n}\) in (3.3) by their conditional expectations:
$$\begin{array}{@{}rcl@{}} E(U_{j}|Y_{\text{obs}},\pi, \boldsymbol{\lambda}) &=& E(U_{j}) = 1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}, \quad \text{and} \quad \end{array} $$
(3.4)
$$\begin{array}{@{}rcl@{}} E(X_{0j}^{*}|Y_{\text{obs}},\pi,\boldsymbol{\lambda}) &\stackrel{(2.27)}{=} & \frac{ [\!1-(1-\pi)^{K}\mathrm{e}^{- \lambda_{+}}] \!\! \sum\limits_{k_{j}=1}^{\min(K,{\boldsymbol{w}}_{j})} \!\! k_{j} {K \choose k_{j}} \pi^{k_{j}} (1 - \pi)^{K - k_{j}} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{w_{ij}-k_{j}}} {(w_{ij}-k_{j})!}} {\sum\limits_{k_{j}=0}^{\min(K, {\boldsymbol{w}}_{j})} {K \choose k_{j}} \pi^{k_{j}} (1 - \pi)^{K-k_{j}} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{w_{ij}-k_{j}}} {(w_{ij}-k_{j})!}} \\ [2mm] & & \times \; I(\min({\boldsymbol{w}}_{j}) \ge 1), \end{array} $$
(3.5)

respectively. An important feature of this EM algorithm is that the latent variables \(\{u_{j}\}_{j=1}^{n}\) are independent of the observed variables \(\{\mathbf {w}_{j}\}_{j=1}^{n}\).

Also note that here we assume that K is a known positive integer. In practice, K can be selected from a finite candidate set \(K \in \{1,2,\ldots,N\}\), say N=100. For a given K, we first use the EM iteration (3.3)–(3.5) to find the MLEs of π and \(\boldsymbol{\lambda}\), denoted by \(\hat {\pi }\) and \(\hat {\boldsymbol {\lambda }}\). Then, we calculate \(\ell (\hat {\pi }, \hat {\boldsymbol {\lambda }} | Y_{\text {obs}})\) and choose the K that maximizes \(\ell (\hat {\pi }, \hat {\boldsymbol {\lambda }} | Y_{\text {obs}})\).
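The iteration (3.3)–(3.5), together with the search over K, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than a reference implementation: the function names, the starting values (π=0.5 and \(\lambda_{i}\) equal to the sample means), the stopping rule, the toy data set and the small candidate range for K are all choices made here, not taken from the paper.

```python
from math import comb, exp, factorial, log

def k_terms(w, K, pi, lams):
    """Summands of the k-sum in (3.1) for one observation w."""
    out = []
    for k in range(min(K, *w) + 1):
        t = comb(K, k) * pi**k * (1 - pi)**(K - k)
        for wi, lam in zip(w, lams):
            t *= lam**(wi - k) / factorial(wi - k)
        out.append(t)
    return out

def loglik(data, K, pi, lams):
    """Observed-data log-likelihood (3.1)."""
    ll = -len(data) * log(exp(sum(lams)) - (1 - pi)**K)
    return ll + sum(log(sum(k_terms(w, K, pi, lams))) for w in data)

def em_ztcs(data, K, pi=0.5, tol=1e-10, max_iter=5000):
    """Iterate the E-step (3.4)-(3.5) and M-step (3.3) for fixed K."""
    n, m = len(data), len(data[0])
    wbar = [sum(w[i] for w in data) / n for i in range(m)]
    lams = list(wbar)  # starting values: an assumption of this sketch
    for _ in range(max_iter):
        psi = (1 - pi)**K * exp(-sum(lams))
        eu = 1 - psi  # E(U_j | Y_obs), eq. (3.4)
        ex0 = 0.0
        for w in data:
            t = k_terms(w, K, pi, lams)
            # E(X_0j^* | Y_obs), eq. (3.5); equals 0 when min(w) = 0
            ex0 += eu * sum(k * tk for k, tk in enumerate(t)) / sum(t)
        new_pi = (ex0 / n) / K                              # (3.3)
        new_lams = [eu * wb - K * new_pi for wb in wbar]    # (3.3)
        delta = max(abs(new_pi - pi),
                    *(abs(a - b) for a, b in zip(new_lams, lams)))
        pi, lams = new_pi, new_lams
        if delta < tol:
            break
    return pi, lams

# Toy bivariate data without zero vectors (made up for illustration).
data = [(1, 0), (0, 1), (1, 1), (2, 1), (1, 2),
        (0, 2), (3, 2), (1, 1), (2, 2), (0, 1)]

# Search over a small candidate set of K and keep the best log-likelihood.
results = {}
for K in range(1, 6):
    pi_hat, lam_hat = em_ztcs(data, K)
    results[K] = (pi_hat, lam_hat, loglik(data, K, pi_hat, lam_hat))
best_K = max(results, key=lambda K: results[K][2])
print(best_K, results[best_K])
```

The sketch does not guard against degenerate cases such as a component with all \(w_{ij}=0\), which would give a zero starting value for \(\lambda_{i}\).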

3.2 Bootstrap confidence intervals

When other approaches are not available, the bootstrap method is a useful tool to find confidence intervals (CIs) for an arbitrary function of (π,λ), say, 𝜗=h(π,λ). Let \((\hat {\pi }, \hat {\boldsymbol {\lambda }})\) be the MLEs of (π,λ) calculated by the EM algorithm (3.3)–(3.5); then \(\hat {\vartheta } = h(\hat {\pi },\hat {\boldsymbol {\lambda }})\) is the MLE of 𝜗. Based on \((\hat {\pi },\hat {\boldsymbol {\lambda }})\), we can generate \(\mathbf {w}_{j}^{*} \stackrel {\text {iid}}{\sim } \text {ZTCS}(K,\hat {\pi },\hat {\lambda }_{1}, \ldots, \hat {\lambda }_{m})\) via the SR (2.3) for j=1,…,n. Having obtained \(Y_{\text {obs}}^{*} = \left \{{\boldsymbol {w}}_{1}^{*}, \ldots, {\boldsymbol {w}}_{n}^{*}\right \}\), we can calculate the bootstrap replication \((\hat {\pi }^{*},\hat {\boldsymbol {\lambda }}^{*})\) and get \(\hat {\vartheta }^{*} = h(\hat {\pi }^{*},\hat {\boldsymbol {\lambda }}^{*})\). Independently repeating this process G times, we obtain G bootstrap replications \(\left \{\hat {\vartheta }_{g}^{*}\right \}_{g=1}^{G}\). Consequently, the standard error, \(\text {se}(\hat {\vartheta })\), of \(\hat {\vartheta }\) can be estimated by the sample standard deviation of the G replications, i.e.,
$$ \widehat{\text{se}}(\hat{\vartheta}) = \left\{ \frac{1}{G-1}\sum_{g=1}^{G} \left[\hat{\vartheta}_{g}^{*} - (\hat{\vartheta}_{1}^{*} + \cdots + \hat{\vartheta}_{G}^{*})/G\right]^{2} \right\}^{1/2}. $$
(3.6)
If \(\left \{\hat {\vartheta }_{g}^{*}\right \}_{g=1}^{G}\) is approximately normally distributed, the first (1−α)100 % bootstrap CI for 𝜗 is
$$ \left[ \hat{\vartheta} - z_{\alpha/2} \cdot \widehat{\text{se}}(\hat{\vartheta}),\; \hat{\vartheta} + z_{\alpha/2} \cdot \widehat{\text{se}}(\hat{\vartheta}) \right]. $$
(3.7)
Alternatively, if \(\left \{\hat {\vartheta }_{g}^{*}\right \}_{g=1}^{G}\) is non-normally distributed, the second (1−α)100 % bootstrap CI of 𝜗 can be obtained as
$$ [\hat{\vartheta}_{_{\mathrm{L}}},\; \hat{\vartheta}_{_{\mathrm{U}}}], $$
(3.8)

where \(\hat {\vartheta }_{_{\mathrm {L}}}\) and \(\hat {\vartheta }_{_{\mathrm {U}}}\) are the 100(α/2) and 100(1−α/2) percentiles of \(\left \{\hat {\vartheta }_{g}^{*}\right \}_{g=1}^{G}\), respectively.
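The three quantities (3.6)–(3.8) are simple to compute once the G bootstrap replications are available. The following is a minimal sketch, not part of the original analysis: the function name `bootstrap_cis` and the nearest-rank percentile convention are our own assumptions (other percentile conventions give slightly different end points for small G).

```python
import math
from statistics import NormalDist, stdev


def bootstrap_cis(theta_hat, replications, alpha=0.05):
    """Return (se, normal_ci, percentile_ci) in the spirit of (3.6)-(3.8).

    theta_hat    : the MLE of the quantity of interest
    replications : the G bootstrap replications of that quantity
    """
    g = len(replications)
    # (3.6): sample standard deviation with divisor G - 1
    se = stdev(replications)
    # (3.7): normal-based interval, z is the upper alpha/2 normal quantile
    z = NormalDist().inv_cdf(1 - alpha / 2)
    normal_ci = (theta_hat - z * se, theta_hat + z * se)
    # (3.8): percentile interval (nearest-rank convention, an assumption)
    ordered = sorted(replications)
    lower = ordered[math.floor(alpha / 2 * g)]
    upper = ordered[math.ceil((1 - alpha / 2) * g) - 1]
    return se, normal_ci, (lower, upper)
```

For instance, `bootstrap_cis(theta_hat, reps)` with `reps` holding G=1000 replications returns the estimated standard error together with both 95 % intervals.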

Multivariate zero-adjusted Charlier series distribution

To introduce the multivariate zero-adjusted Charlier series (ZACS) distribution, we first define the univariate ZACS distribution. A non-negative discrete random variable Y is said to have a ZACS distribution with parameters φ ∈ [0,1) and λ>0, denoted by Y ∼ ZACS(φ,K,π,λ), if
$$ Y \stackrel{\mathrm{d}}{=} Z' W, $$
(4.1)
where Z′ ∼ Bernoulli(1−φ), W ∼ ZTCS(K,π,λ), and Z′ ⊥⊥ W (i.e., Z′ and W are independent). It is clear that the pmf of Y is given by
$$ \Pr(Y=y) = \varphi I(y=0) + \left[\frac{1-\varphi} {1-(1-\pi)^{K} \mathrm{e}^{- \lambda}} \cdot Q_{y}(K, \pi, \lambda) \right] I(y \ne 0), $$
(4.2)
where
$$ Q_{y}(K, \pi, \lambda) = \sum_{k=0}^{\min(K, \, y)} {K \choose k} \pi^{k} (1 - \pi)^{K-k} \frac{\lambda^{y-k} \mathrm{e}^{-\lambda} }{(y-k)!}. $$
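The pmf (4.2) can be evaluated directly. The sketch below is only illustrative: the function names `q_y` and `zacs_pmf` are ours, and the plain factorial arithmetic is intended for moderate values of y only.

```python
from math import comb, exp, factorial


def q_y(y, K, pi, lam):
    """Q_y(K, pi, lam): pmf at y of Binomial(K, pi) + Poisson(lam)."""
    return sum(
        comb(K, k) * pi**k * (1 - pi) ** (K - k)
        * lam ** (y - k) * exp(-lam) / factorial(y - k)
        for k in range(min(K, y) + 1)
    )


def zacs_pmf(y, phi, K, pi, lam):
    """Univariate ZACS pmf (4.2): mass phi at zero, rescaled Q_y elsewhere."""
    if y == 0:
        return phi
    zero_mass = (1 - pi) ** K * exp(-lam)  # Q_0, the untruncated mass at zero
    return (1 - phi) / (1 - zero_mass) * q_y(y, K, pi, lam)
```

Summing `zacs_pmf(y, ...)` over y=0,1,2,… should return 1 up to truncation error, which gives a quick sanity check of the normalizing constant.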
Motivated by (4.1), naturally, we have the following multivariate generalization.

Definition 2.

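A discrete random vector \(\textbf{y}=(Y_{1},\ldots,Y_{m})^{\!\top\!}\) is said to have the multivariate ZACS distribution with parameters φ ∈ [0,1), K>0, π ∈ [0,1) and \(\boldsymbol{\lambda}=(\lambda_{1},\ldots,\lambda_{m})^{\!\top\!} \in {\mathbb {R}}_{+}^{m}\), denoted by \(\textbf{y} \sim \text{ZACS}_{m}(\varphi; K,\pi,\boldsymbol{\lambda})\) or \(\textbf{y} \sim \text{ZACS}(\varphi; K,\pi,\lambda_{1},\ldots,\lambda_{m})\), if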
$$ \mathbf{y} \stackrel{\mathrm{d}}{=} Z' \mathbf{w} = \left\{ \begin{array}{ll} {\mathbf{0}},&{\text{with probability}\, \varphi},\\ {\mathbf{w}},&{\text{with probability}\, 1-\varphi,} \end{array}\right. $$
(4.3)

where Z′ ∼ Bernoulli(1−φ), \(\textbf{w} \sim \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\), and Z′ ⊥⊥ \(\textbf{w}\). The random vector \(\textbf{w}\) is called the base vector of \(\textbf{y}\).

It is easy to show that the joint pmf of \(\textbf{y} \sim \text{ZACS}(\varphi; K,\pi,\lambda_{1},\ldots,\lambda_{m})\) is
$$ \Pr(\textbf{y}= {\boldsymbol{y}}) = \varphi I({\boldsymbol{y}}=\textbf{0}) + \left[ \frac{1-\varphi}{1-(1-\pi)^{K} \mathrm{e}^{- \lambda_{+}}} \cdot Q_{{\boldsymbol{y}}}(K, \pi, \boldsymbol{ \lambda}) \right] I({\boldsymbol{y}} \ne \textbf{0}), $$
(4.4)
where
$$Q_{{\boldsymbol{y}}}(K, \pi, {\boldsymbol{\lambda}}) = \sum_{k=0}^{\min(K,{\boldsymbol{y}})} {K \choose k} \pi^{k} (1 - \pi)^{K-k} \prod_{i=1}^{m} \frac{\lambda_{i}^{y_{i}-k} \mathrm{e}^{-\lambda_{i}}}{(y_{i}-k)!}. $$
We consider several special cases of (4.3) or (4.4):
  (i) If φ=0, then \(\textbf{y} \stackrel{\mathrm{d}}{=} \textbf{w} \sim \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{m})\), i.e., the multivariate ZTCS distribution is a special member of the family of the multivariate ZACS distributions. Thus, we can see that studying the multivariate ZTCS distribution is a basis for studying the multivariate ZACS distribution;

  (ii) If \(\varphi \in (0, \, (1-\pi)^{K}\mathrm {e}^{{- \lambda }_{+}})\), then \(\textbf{y}\) follows the multivariate zero-deflated Charlier series (ZDCS) distribution with parameters (φ,K,π,λ), denoted by \(\textbf{y} \sim \text{ZDCS}_{m}(\varphi; K,\pi,\boldsymbol{\lambda})\) or \(\textbf{y} \sim \text{ZDCS}(\varphi; K,\pi,\lambda_{1},\ldots,\lambda_{m})\);

  (iii) If \(\varphi = (1-\pi)^{K}\mathrm {e}^{{- \lambda }_{+}}\), then \(\textbf{y} \sim \text{CS}_{m}(K,\pi;\boldsymbol{\lambda})\);

  (iv) If \(\varphi \in ((1-\pi)^{K}\mathrm {e}^{{- \lambda }_{+}}, \, 1)\), then \(\textbf{y}\) follows the multivariate zero-inflated Charlier series (ZICS) distribution with parameters (φ,K,π,λ), denoted by \(\textbf{y} \sim \text{ZICS}_{m}(\varphi; K,\pi,\boldsymbol{\lambda})\) or \(\textbf{y} \sim \text{ZICS}(\varphi; K,\pi,\lambda_{1},\ldots,\lambda_{m})\).
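The joint pmf (4.4) and the four cases (i)–(iv) can be checked numerically. The following sketch is illustrative only: the function names are ours, and we read the upper summation limit min(K, y) as min(K, y_1, …, y_m), which is an assumption consistent with the factorials (y_i − k)! being defined.

```python
from math import comb, exp, factorial, isclose, prod


def q_vec(y, K, pi, lam):
    """Q_y(K, pi, lambda): joint pmf of the untruncated Charlier series vector."""
    k_max = min(K, min(y))
    return sum(
        comb(K, k) * pi**k * (1 - pi) ** (K - k)
        * prod(l ** (y_i - k) * exp(-l) / factorial(y_i - k)
               for y_i, l in zip(y, lam))
        for k in range(k_max + 1)
    )


def zacs_pmf_vec(y, phi, K, pi, lam):
    """Multivariate ZACS pmf (4.4)."""
    if all(y_i == 0 for y_i in y):
        return phi
    zero_mass = (1 - pi) ** K * exp(-sum(lam))
    return (1 - phi) / (1 - zero_mass) * q_vec(y, K, pi, lam)


def zero_adjustment(phi, K, pi, lam):
    """Label the special case (i)-(iv) that a given phi falls into."""
    zero_mass = (1 - pi) ** K * exp(-sum(lam))
    if phi == 0:
        return "ZTCS"
    if isclose(phi, zero_mass):
        return "CS"
    return "ZDCS" if phi < zero_mass else "ZICS"
```

In case (iii), `zacs_pmf_vec` coincides with `q_vec` at every non-zero point, which is exactly the statement that the zero adjustment vanishes.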

     

4.1 Mixed moments and moment generating function

From (4.1) and (2.2), we immediately have
$$ \left\{ \begin{array}{lll} E(\textbf{y}) &=& \frac{1-\varphi}{1-\psi}(\boldsymbol{\lambda} + K \pi \cdot \textbf{1}\!\!\!\textbf{1}), \\ [4mm] E(\textbf{y}\textbf{y}^{\!\top\!}) &=& \frac{1-\varphi}{1-\psi}\left[\text{diag}(\boldsymbol{\lambda}) + \boldsymbol{\lambda} \boldsymbol{\lambda}^{\!\top\!} + K \pi (\boldsymbol{\lambda} \textbf{1}\!\!\!\textbf{1}^{\!\top\!} + \textbf{1}\!\!\!\textbf{1}\boldsymbol{\lambda}^{\!\top\!}) + K\pi(1- \pi + K\pi) \cdot \textbf{1}\!\!\!\textbf{1} \textbf{1}\!\!\!\textbf{1}^{\!\top\!} \, \right], \\ [4mm] \text{Var}(\textbf{y}) &=& \frac{1-\varphi}{1-\psi} \Bigg\{ \text{diag}(\boldsymbol{\lambda}) + K\pi(1 - \pi) \cdot \textbf{1}\!\!\!\textbf{1} \textbf{1}\!\!\!\textbf{1}^{\!\top\!} \\ [4mm] & & - \; \frac{\psi - \varphi}{1-\psi}\left[\boldsymbol{\lambda} \boldsymbol{\lambda}^{\!\top\!} + K \pi (\boldsymbol{\lambda} \textbf{1}\!\!\!\textbf{1}^{\!\top\!} + \textbf{1}\!\!\!\textbf{1}\boldsymbol{ \lambda}^{\!\top\!}) + K^{2}\pi^{2} \textbf{1}\!\!\!\textbf{1}\textbf{1}\!\!\!\textbf{1}^{\!\top\!} \, \right] \Bigg\}. \end{array} \right. $$
(4.5)
Thus, we have
$${\fontsize{9}{6} \begin{aligned} \text{Corr}(Y_{i}, Y_{j}) = \frac{K\pi(1-\pi) - (\lambda_{i}+K\pi)(\lambda_{j}+K\pi)(\psi-\varphi)/(1-\psi)}{\sqrt{\left[\lambda_{i} + K\pi(1-\pi) - \frac{\psi-\varphi}{1-\psi}(\lambda_{i} + K\pi)^{2} \right] \left[\lambda_{j} + K\pi(1-\pi) - \frac{\psi-\varphi}{1-\psi}(\lambda_{j}+K\pi)^{2} \right]}}\,, \end{aligned}} $$
for i ≠ j. In particular, if π=0, we obtain
$$\text{Corr}(Y_{i}, Y_{j}) = \frac{\lambda_{i}\lambda_{j}(\varphi-\psi)/(1-\psi) }{ \sqrt{\left[ \lambda_{i} - {\lambda_{i}^{2}} (\psi-\varphi)/(1-\psi) \right] \left[ \lambda_{j} - {\lambda_{j}^{2}} (\psi-\varphi)/(1-\psi) \right] }}. $$
Furthermore, if λ i =λ j =λ, then
$$\text{Corr}(Y_{i}, Y_{j}) = \frac{\lambda(\varphi-\psi)/(1-\psi) }{[\!1 - \lambda (\psi-\varphi)/(1-\psi) ]}. $$
Clearly, Corr(Y i ,Y j ) can be either positive or negative, depending on the values of φ, K, π and λ.
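The sign change of the correlation is easy to see numerically. In the sketch below, the function name `zacs_corr` is ours, and we take ψ = (1−π)^K e^{−λ₊}; this value is inferred by comparing the normalizing constant in (4.4) with the factor (1−φ)/(1−ψ) in (4.5) and should be treated as an assumption.

```python
from math import exp, sqrt


def zacs_corr(phi, K, pi, lam_i, lam_j, lam_plus):
    """Corr(Y_i, Y_j), i != j, from the general expression following (4.5).

    lam_plus is the sum of all m Poisson parameters.
    """
    psi = (1 - pi) ** K * exp(-lam_plus)  # assumed definition of psi
    c = (psi - phi) / (1 - psi)
    b = K * pi * (1 - pi)
    mu_i, mu_j = lam_i + K * pi, lam_j + K * pi
    numerator = b - c * mu_i * mu_j
    denominator = sqrt((lam_i + b - c * mu_i**2) * (lam_j + b - c * mu_j**2))
    return numerator / denominator


# With pi = 0 and lam_i = lam_j = 1 (m = 2), the sign follows that of phi - psi.
print(zacs_corr(0.0, 1, 0.0, 1.0, 1.0, 2.0))  # phi < psi: negative
print(zacs_corr(0.5, 1, 0.0, 1.0, 1.0, 2.0))  # phi > psi: positive
```

With π=0 the common binomial component disappears, so any dependence between Y i and Y j comes solely from the zero adjustment, which is why the sign is governed by φ−ψ in that case.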
For any r 1,…,r m ≥0, the mixed moments of y are given by
$$ E\left(\prod_{i=1}^{m} Y_{i}^{r_{i}} \right) = (1-\varphi)E\left(\prod_{i=1}^{m} W_{i}^{r_{i}} \right) = \frac{1-\varphi}{1-\psi}E\left(\prod_{i=1}^{m} X_{i}^{r_{i}}\right). $$
(4.6)
By using the formula E(ξ)=E[E(ξ|Z′)], the mgf of \(\textbf{y}\) is
$$\begin{array}{@{}rcl@{}} M_{\textbf{y}}(\boldsymbol{t}) &=& E[\exp(\boldsymbol{t}^{\!\top\!} \textbf{y})] = E[\exp(Z' \cdot \boldsymbol{ t}^{\!\top\!} \textbf{w})] = E\Big\{ E[\exp(Z' \boldsymbol{t}^{\!\top\!} \textbf{w}) | Z'] \Big\} \\ [2mm] &=& E[M_{\textbf{w}}(Z' \boldsymbol{t})] = \varphi M_{\textbf{w}}(\textbf{0}) + (1-\varphi) M_{\textbf{w}}(\boldsymbol{t}) = \varphi + (1-\varphi)M_{\textbf{w}}(\boldsymbol{t}) \\ [2mm] &=& \varphi + \frac{1-\varphi}{1-\psi}\left[(\pi \mathrm{e}^{t_{+}} + 1 -\pi)^{K} \exp \left(\sum_{i=1}^{m} \lambda_{i} \mathrm{e}^{t_{i}}- \lambda_{+} \right) - (1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}\right], \qquad \end{array} $$
(4.7)

where \(t_{+} = \sum _{i=1}^{m} t_{i}\).
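As a consistency check, which is our own addition rather than part of the original derivation, differentiating (4.7) with respect to \(t_{i}\) and setting \(\boldsymbol{t}=\textbf{0}\) recovers the i-th component of \(E(\textbf{y})\) in (4.5):

$$ \left. \frac{\partial M_{\textbf{y}}(\boldsymbol{t})}{\partial t_{i}} \right|_{\boldsymbol{t}=\textbf{0}} = \frac{1-\varphi}{1-\psi} \left. \left[ K\pi \mathrm{e}^{t_{+}} (\pi \mathrm{e}^{t_{+}} + 1 - \pi)^{K-1} + \lambda_{i} \mathrm{e}^{t_{i}} (\pi \mathrm{e}^{t_{+}} + 1 - \pi)^{K} \right] \exp\left( \sum_{l=1}^{m} \lambda_{l} \mathrm{e}^{t_{l}} - \lambda_{+} \right) \right|_{\boldsymbol{t}=\textbf{0}} = \frac{1-\varphi}{1-\psi} (\lambda_{i} + K\pi). $$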

4.2 Marginal distributions

Now we consider the marginal distributions of \(\textbf{y}^{(1)}\) and \(\textbf{y}^{(2)}\), where
$$\textbf{y}^{(1)} = \left(\!\!\begin{array}{c} Y_{1} \\ \! \vdots \\ Y_{r} \end{array} \!\!\right), \quad \textbf{y}^{(2)} = \left(\!\!\begin{array}{c} Y_{r+1} \\ \!\! \vdots \\ Y_{m} \end{array} \!\!\right) \quad \text{and} \quad \textbf{y} = \left(\!\!\begin{array}{c} \textbf{y}^{(1)} \\ [2mm] \textbf{y}^{(2)} \end{array} \!\!\right). $$
Based on (4.1) and (2.13), we have
$$\textbf{y}^{(k)} \stackrel{\mathrm{d}}{=} Z'\textbf{w}^{(k)} \stackrel{\mathrm{d}}{=} Z' Z^{(k)}\boldsymbol{ \xi}^{(k)}, \quad k=1,2, $$
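where Z′ ∼ Bernoulli(1−φ), \(Z^{(k)} \sim \text{Bernoulli}(1-\varphi^{(k)})\), \(\varphi^{(k)}\) is given by (2.14), \(\boldsymbol{\xi}^{(1)} \sim \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{r})\) and \(\boldsymbol{\xi}^{(2)} \sim \text{ZTCS}(K,\pi;\lambda_{r+1},\ldots,\lambda_{m})\). Note that \(Z'Z^{(k)}\) ⊥⊥ \(\boldsymbol{\xi}^{(k)}\) and \(Z'Z^{(k)} \sim \text{Bernoulli}((1-\varphi)(1-\varphi^{(k)}))\). According to the SR (4.3), we can obtain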
$$ \textbf{y}^{(1)} \sim \text{ZACS}(\nu^{(1)}; K, \pi, \lambda_{1}, \ldots, \lambda_{r}) \quad \text{and} \quad \textbf{y}^{(2)} \sim \text{ZACS}(\nu^{(2)}; K, \pi, \lambda_{r+1}, \ldots, \lambda_{m}), $$
(4.8)
where
$$ \nu^{(k)} = 1- (1-\varphi)(1-\varphi^{(k)}) = 1 - (1 - \varphi)\frac{1-(1 - \pi)^{K}\mathrm{e}^{-\lambda_{+}^{(k)}} }{1-(1-\pi)^{K}\mathrm{e}^{- \lambda_{+}}} \in (0, 1), \quad k=1, 2, $$
(4.9)

\(\lambda _{+}^{(1)} = \sum _{i=1}^{r} \lambda _{i}\) and \(\lambda _{+}^{(2)} = \sum _{i=r+1}^{m} \lambda _{i}\).

In fact, for any positive integers \(i_{1},\ldots,i_{r}\) satisfying \(1 \le i_{1} < \cdots < i_{r} \le m\), we have
$$ \left(\!\!\begin{array}{c} Y_{i_{1}} \\ \!\! \vdots \\ Y_{i_{r}} \end{array} \!\!\right) \sim \text{ZACS}(\nu^{*}; K,\pi, \lambda_{i_{1}}, \ldots, \lambda_{i_{r}}), $$
(4.10)
where \(\varphi^{*}\) is given by (2.16) and
$$ \nu^{*} =1- (1-\varphi)(1-\varphi^{*}) = 1 - (1-\varphi)\frac{1-(1-\pi)^{K}\mathrm{e}^{-(\lambda_{i_{1}}+\cdots+\lambda_{i_{r}})} }{1-(1-\pi)^{K}\mathrm{e}^{- \lambda_{+}}} \in (0, 1). $$
(4.11)
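The zero-adjustment parameter of a marginal, (4.9) or (4.11), depends only on φ, K, π and the two sums of Poisson parameters. A minimal sketch follows; the function name `marginal_zero_weight` and the zero-based index convention are our own choices.

```python
from math import exp


def marginal_zero_weight(phi, K, pi, lam, idx):
    """nu* in (4.11) for the sub-vector with (zero-based) component indices idx."""
    q = (1 - pi) ** K
    lam_plus = sum(lam)                  # sum over all m components
    lam_sub = sum(lam[i] for i in idx)   # sum over the selected components
    return 1 - (1 - phi) * (1 - q * exp(-lam_sub)) / (1 - q * exp(-lam_plus))


# Marginal of the first component of a bivariate ZTCS vector (phi = 0):
print(marginal_zero_weight(0.0, 2, 0.3, [1.0, 1.5], [0]))
```

When `idx` contains every component, the function returns φ itself, as it should, since the "marginal" is then the full vector.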

4.3 Conditional distributions

4.3.1 Conditional distribution of \(\textbf{y}^{(1)}|\textbf{y}^{(2)}\)

From (4.4) and (4.8), the conditional distribution of \(\textbf{y}^{(1)}|\textbf{y}^{(2)}\) is given by
$$\begin{array}{@{}rcl@{}} \Pr(\textbf{y}^{(1)} = {\boldsymbol{y}}^{(1)} | \textbf{y}^{(2)} = {\boldsymbol{y}}^{(2)}) &= & \frac{\Pr(\textbf{y} = {\boldsymbol{y}}) }{\Pr(\textbf{y}^{(2)} = {\boldsymbol{y}}^{(2)})} \\ [2mm] &= & \frac{\varphi I({\boldsymbol{y}}=\textbf{0}) + R({\boldsymbol{y}}, K, \pi, \boldsymbol{ \lambda}, \varphi) I({\boldsymbol{y}} \ne \textbf{0})}{\nu^{(2)} I({\boldsymbol{y}}^{(2)}=\textbf{0}) + S({\boldsymbol{y}}^{(2)}, K, \pi, \boldsymbol{\lambda}, \nu^{(2)}) I({\boldsymbol{y}}^{(2)} \ne \textbf{0})}, \qquad \end{array} $$
(4.12)
where
$$\begin{array}{@{}rcl@{}} R({\boldsymbol{y}}, K, \pi, \boldsymbol{\lambda}, \varphi) &=& \frac{1-\varphi}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \sum_{k=0}^{\min(K,{\boldsymbol{y}})} {K \choose k} \pi^{k} (1-\pi)^{K-k} \prod_{i=1}^{m} \frac{\lambda_{i}^{y_{i}-k} \mathrm{e}^{-\lambda_{i}}}{(y_{i}-k)!} \;\text{and} \quad \\ [2mm] S\left({\boldsymbol{y}}^{(2)}, K, \pi, \boldsymbol{\lambda}, \nu^{(2)}\right) &=& \frac{1-\nu^{(2)}} {1- (1-\pi)^{K}\mathrm{e}^{-\lambda_{+}^{(2)}}} \sum_{k=0}^{\min(K,{\boldsymbol{y}}^{(2)})} {K \choose k} \pi^{k} (1-\pi)^{K-k} \prod_{i=r+1}^{m} \frac{\lambda_{i}^{y_{i}-k} \mathrm{e}^{-\lambda_{i}}}{(y_{i}-k)!}. \end{array} $$
We first consider Case I: \({\boldsymbol{y}}^{(2)} \ne \textbf{0}\). Under Case I, it is clear that \({\boldsymbol{y}} \ne \textbf{0}\). From (4.12), it is easy to obtain
$$\Pr\left(\textbf{y}^{(1)} = {\boldsymbol{y}}^{(1)} | \textbf{y}^{(2)} = {\boldsymbol{y}}^{(2)}\right) = \frac{ \mathrm{e}^{-\lambda_{+}^{(1)}} \sum\limits_{k=0}^{\min\{K,{\boldsymbol{y}}\}}{K \choose k}\pi^{k}(1-\pi)^{K-k}\prod\limits_{j=1}^{m}\frac{\lambda_{j}^{y_{j}-k}}{(y_{j}-k)!}}{ \sum\limits_{l=0}^{\min\{K,{\boldsymbol{y}}^{(2)}\}}{K \choose l}\pi^{l}(1-\pi)^{K-l}\prod\limits_{p=r+1}^{m}\frac{\lambda_{p}^{y_{p}-l}}{(y_{p}-l)!}}. $$
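Case II: \({\boldsymbol{y}}^{(2)} = \textbf{0}\). Under Case II, it is possible that \({\boldsymbol{y}}^{(1)} = \textbf{0}\) or \({\boldsymbol{y}}^{(1)} \ne \textbf{0}\). When \({\boldsymbol{y}}^{(1)} = \textbf{0}\), from (4.12), we obtain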
$$\Pr(\textbf{y}^{(1)} = \textbf{0} | \textbf{y}^{(2)} = \textbf{0}) = \frac{\varphi}{\nu^{(2)}}. $$
When \({\boldsymbol{y}}^{(1)} \ne \textbf{0}\), from (4.12), we have
$$ \Pr(\textbf{y}^{(1)} = {\boldsymbol{y}}^{(1)} | \textbf{y}^{(2)} = \textbf{0}) = \frac{(1-\varphi)(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}^{(2)}}}{\nu^{(2)}(1-(1-\pi)^{K}\mathrm{e}^{- \lambda_{+}})} \prod_{i=1}^{r} \frac{\lambda_{i}^{y_{i}}\mathrm{e}^{-\lambda_{i}}}{y_{i}!}. $$

4.3.2 Conditional distribution of Z′|\(\textbf{y}\)

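Since Z′ ∼ Bernoulli(1−φ), Z′ only takes the value 0 or 1. Note that \(\textbf{y}=\textbf{0}\) is equivalent to Z′=0. Thus, \(\Pr(Z'=0|\textbf{y}=\textbf{0}) = \Pr(Z'=0)/\Pr(\textbf{y}=\textbf{0}) = 1\). When \({\boldsymbol{y}} \ne \textbf{0}\), we have \(\Pr(Z'=1|\textbf{y}={\boldsymbol{y}}) = \Pr(Z'=1, \textbf{w}={\boldsymbol{y}})/\Pr(\textbf{y}={\boldsymbol{y}}) = 1\). Therefore,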
$$ Z'|(\textbf{y}={\boldsymbol{y}}) \sim \left\{ \begin{array}{ll} \text{Degenerate}(0), & \text{if}\ {\boldsymbol{y}}=\textbf{0}, \\ [4mm] \text{Degenerate}(1), & \text{if}\ {\boldsymbol{y}} \ne \textbf{0}, \end{array} \right. $$
(4.13)

i.e., \(Z'|(\textbf{y}={\boldsymbol{y}}) \sim \text{Degenerate}(I({\boldsymbol{y}} \ne \textbf{0}))\).

4.3.3 Conditional distribution of \(\textbf{w}|(\textbf{y}={\boldsymbol{y}} \ne \textbf{0})\)

If \({\boldsymbol{y}} \ne \textbf{0}\), we have
$$\Pr(\textbf{w}={\boldsymbol{w}} | \textbf{y} = {\boldsymbol{y}}) = \frac{\Pr(\textbf{w}={\boldsymbol{w}}, \textbf{y} = {\boldsymbol{y}})}{\Pr(\textbf{y} = {\boldsymbol{y}})} = \frac{\Pr(\textbf{w}={\boldsymbol{y}}, Z'=1)}{\Pr(\textbf{y} = {\boldsymbol{y}})} = I({\boldsymbol{w}}={\boldsymbol{y}}). $$
Thus, given \(\textbf{y}={\boldsymbol{y}} \ne \textbf{0}\), we have
$$ \textbf{w}|(\textbf{y} = {\boldsymbol{y}} \ne \textbf{0}) \sim \text{Degenerate}({\boldsymbol{y}}). $$
(4.14)

Likelihood-based methods for multivariate ZACS distribution without covariates

Suppose that \(\textbf{y}_{j} \stackrel{\text{iid}}{\sim} \text{ZACS}(\varphi; K,\pi,\lambda_{1},\ldots,\lambda_{m})\), where \(\textbf{y}_{j}=(Y_{1j},\ldots,Y_{mj})^{\!\top\!}\) for j=1,…,n. Let \({\boldsymbol{y}}_{j}=(y_{1j},\ldots,y_{mj})^{\!\top\!}\) denote the realization of the random vector \(\textbf{y}_{j}\), and let \(Y_{\text {obs}} = \{{\boldsymbol {y}}_{j}\}_{j=1}^{n}\) be the observed data. Furthermore, let \({\mathbb {J}}=\{ j|{\boldsymbol {y}}_{j}=\textbf {0}, j=1,\dots,n \}\) be the index set of the all-zero observations and let \(m_{0} = \sum _{j=1}^{n} I({\boldsymbol {y}}_{j}=\textbf {0})\) denote the number of elements in \({\mathbb {J}}\). We assume that K is a known positive integer. Therefore, the observed-data likelihood function is proportional to
$${\fontsize{9.3}{6} \begin{aligned} &L(\varphi,\pi,\boldsymbol{\lambda}|Y_{\text{obs}}) \\ &\propto \varphi^{m_{0}}(1-\varphi)^{n-m_{0}} \left\{\prod_{j \notin {\mathbb{J}}} \frac{\mathrm{e}^{- \lambda_{+}}}{1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}} \left[ \sum_{k_{j}=0}^{\min(K,{\boldsymbol{y}}_{j})} {K \choose k_{j}} \pi^{k_{j}} (1-\pi)^{K-k_{j}} \prod_{i=1}^{m} \frac{\lambda_{i}^{y_{ij}-k_{j}}}{(y_{ij}-k_{j})!} \right] \right\}. \end{aligned}} $$
Thus, we can write the log-likelihood function into two parts:
$$ \ell(\varphi,\pi,\boldsymbol{\lambda}|Y_{\text{obs}}) = \ell_{1}(\varphi|Y_{\text{obs}}) + \ell_{2}(\pi,\boldsymbol{\lambda}|Y_{\text{obs}}), $$
(5.1)
where
$$\begin{array}{@{}rcl@{}} \ell_{1}(\varphi |Y_{\text{obs}}) &=& m_{0}\log\varphi+(n-m_{0})\log(1-\varphi) \quad \text{and} \quad \\ [2mm] \ell_{2}(\pi, \boldsymbol{\lambda}|Y_{\text{obs}}) &=& -(n-m_{0})\left\{\lambda_{+} + \log[ 1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}]\right\} \\ [2mm] & & +\; \sum_{j \notin {\mathbb{J}}} \log\left[ \sum_{k_{j}=0}^{\min(K,{\boldsymbol{y}}_{j})} {K \choose k_{j}} \pi^{k_{j}} (1-\pi)^{K-k_{j}} \prod_{i=1}^{m} \frac{\lambda_{i}^{y_{ij}-k_{j}}}{(y_{ij}-k_{j})!} \right]. \end{array} $$
In other words, the parameter φ and the parameter vector (π,λ) can be estimated separately. Obviously, the MLE of φ has an explicit solution
$$ \hat{\varphi} = \frac{m_{0}}{n}, $$
(5.2)

but the closed-form MLEs of (π,λ) are not yet available.
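The decomposition (5.1) translates directly into code: φ is estimated by the proportion of zero vectors, while ℓ₂ involves only the non-zero observations. The sketch below is illustrative; the function names are ours, and min(K, y_j) is read as min(K, y_1j, …, y_mj), which is an assumption.

```python
from math import comb, exp, factorial, log


def phi_hat(data):
    """Closed-form MLE (5.2): proportion of all-zero observation vectors."""
    m0 = sum(1 for y in data if not any(y))
    return m0 / len(data)


def _xlogy(x, y):
    # Convention 0 * log(0) = 0, needed when m0 = 0 or m0 = n.
    return 0.0 if x == 0 else x * log(y)


def ell1(data, phi):
    """l_1(phi | Y_obs) in (5.1)."""
    m0 = sum(1 for y in data if not any(y))
    return _xlogy(m0, phi) + _xlogy(len(data) - m0, 1 - phi)


def ell2(data, K, pi, lam):
    """l_2(pi, lambda | Y_obs) in (5.1); only non-zero vectors contribute."""
    nonzero = [y for y in data if any(y)]
    lam_plus = sum(lam)
    total = -len(nonzero) * (lam_plus + log(1 - (1 - pi) ** K * exp(-lam_plus)))
    for y in nonzero:
        inner = 0.0
        for k in range(min(K, min(y)) + 1):
            term = comb(K, k) * pi**k * (1 - pi) ** (K - k)
            for y_i, lam_i in zip(y, lam):
                term *= lam_i ** (y_i - k) / factorial(y_i - k)
            inner += term
        total += log(inner)
    return total
```

Because `ell2` does not involve φ, any numerical maximizer of ℓ₂ over (π, λ) can be combined with `phi_hat` to obtain the full MLE.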

5.1 MLEs via the EM algorithm and bootstrap CIs

The objective of this section is to find the MLEs of (π,λ) based on (5.1). For the log-likelihood function (3.1), the corresponding EM iteration for finding the MLEs of (π,λ) is defined by (3.3)–(3.5). By comparing (3.1) with (5.1), if we replace \((\sum _{j=1}^{n} w_{\textit {ij}})\) in (3.1) with \((\sum _{j \notin {\mathbb {J}}}y_{\textit {ij}})\), we promptly obtain the MLEs of (π,λ) by using the EM algorithm. The M-step is to calculate the complete-data MLEs:
$$ \hat{\pi} = \frac{\sum_{j \notin {\mathbb{J}}} x_{0j}^{*}}{(n - m_{0})K} \quad \text{and} \quad \hat{\lambda}_{i} = \frac{\sum_{j \notin {\mathbb{J}}} u_{j} y_{ij}}{(n-m_{0})} - K \hat{\pi}, \qquad i=1,\ldots,m, $$
(5.3)
and the E-step is to replace \(\{u_{j}\}_{j \notin {\mathbb {J}}}\) and \(\{ x_{0j}^{*} \}_{j \notin {\mathbb {J}}}\) in (5.3) by their conditional expectations:
$$\begin{array}{@{}rcl@{}} E(U_{j}|Y_{\text{obs}}, \pi, \boldsymbol{\lambda}) & = & E(U_{j}) = 1-(1-\pi)^{K}\mathrm{e}^{- \lambda_{+}}, \quad \text{and} \quad \end{array} $$
(5.4)
$$\begin{array}{@{}rcl@{}} E(X_{0j}^{*}|Y_{\text{obs}}, \pi,\boldsymbol{\lambda}) & = & \frac{ [1-(1-\pi)^{K}\mathrm{e}^{-\lambda_{+}}] \sum\limits_{k_{j}=1}^{\min(K,{\boldsymbol{y}}_{j})} {K \choose k_{j}} \pi^{k_{j}} (1-\pi)^{K-k_{j}} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{y_{ij}-k_{j}}} {(y_{ij}-k_{j})!}} {\sum\limits_{k_{j}=0}^{\min(K,{\boldsymbol{y}}_{j})} {K \choose k_{j}} \pi^{k_{j}} (1-\pi)^{K-k_{j}} \prod\limits_{i=1}^{m} \frac{\lambda_{i}^{y_{ij}-k_{j}}} {(y_{ij}-k_{j})!}} \\ [2mm] & & \times \; I(\min({\boldsymbol{y}}_{j}) \ge 1), \end{array} $$
(5.5)

respectively.

The procedure of constructing bootstrap CIs for an arbitrary function of (φ,π,λ), say 𝜗=h(φ,π,λ), is very similar to that presented in Section 3.2.

Simulation studies

To evaluate the performance of the proposed methods in Section 3, we investigate the accuracy of MLEs and confidence interval estimators of the parameters in the multivariate ZTCS distribution. We consider two cases for the dimension with m=2 and m=3.

6.1 Experiment 1: m=2

When m=2, the parameters (K,π;λ 1,λ 2) are set to be (5, 0.5; 3, 5). We generate \(\{\textbf {w}_{j}\}_{j=1}^{n} \stackrel {\text {iid}}{\sim } \text {ZTCS}(K, \pi ; \lambda _{1}, \ldots, \lambda _{m})\) with n=200. Based on this simulated data set, for different K values we first calculate the MLEs of π and (λ 1,λ 2) by using the EM algorithm (3.3)–(3.5) and then calculate the estimated log-likelihood. We choose K=5 that maximizes the log-likelihood among all K values. These results are reported in Table 1.
Table 1

Finding the value of K by maximizing the log-likelihood function for m=2

| K | \(\hat {\pi }\) | \(\hat {\lambda }_{1}\) | \(\hat {\lambda }_{2}\) | Log-likelihood |
|---|---|---|---|---|
| 3 | 0.66742 | 3.4739056 | 5.4464959 | −884.6629 |
| 4 | 0.58617 | 3.1314781 | 5.1040629 | −883.4288 |
| 5 | 0.51145 | 2.9188539 | 4.8914319 | −883.0151 |
| 6 | 0.44308 | 2.8176227 | 4.7901937 | −883.1228 |

For this fixed value of K=5, we first calculate the MLEs of (π,λ 1,λ 2) by using the EM algorithm (3.3)–(3.5), the bootstrap standard deviations (stds) of these MLEs, the corresponding mean square errors (MSEs) and two 95 % bootstrap confidence intervals (CIs) of these parameters with G=1000 by the bootstrap method presented in Section 3.2. Then, we independently repeat the above process 1000 times. The resulting average MLE, std, MSE and two coverage probabilities (CPs) based on the normal-based and non-normal-based bootstrap samples, respectively, are displayed in Table 2.
Table 2

The average MLE, std, MSE and two CPs of (π,λ 1,λ 2) for m=2 and K=5

| Parameter | True value | Average MLE | Average std B | Average MSE | CP (normal-based) | CP (non-normal-based) |
|---|---|---|---|---|---|---|
| π | 0.5 | 0.511457 | 0.06417 | 0.004209 | 0.930 | 0.932 |
| λ 1 | 3 | 2.918853 | 0.33051 | 0.114730 | 0.927 | 0.932 |
| λ 2 | 5 | 4.891431 | 0.35926 | 0.139568 | 0.921 | 0.928 |

std B: The sample standard deviation for the bootstrap samples

CP (normal-based): Normal-based bootstrap CP

CP (non-normal-based): Non-normal-based bootstrap CP

From Table 2, we can see that the average MSE of \(\hat {\pi }\) is very small while the average MSEs of \((\hat {\lambda }_{1}, \hat {\lambda }_{2})\) are reasonably small. The two bootstrap coverage probabilities are close to but less than 0.95.

6.2 Experiment 2: m=3

When m=3, the parameters (K,π;λ 1,λ 2,λ 3) are set to be (4, 0.3; 2, 4, 6). We generate \(\{\textbf {w}_{j}\}_{j=1}^{n} \stackrel {\text {iid}}{\sim } \text {ZTCS}(K, \pi ; \lambda _{1}, \ldots, \lambda _{m})\) with n=200. Based on this simulated data set, for different K values we first calculate the MLEs of π and (λ 1,λ 2,λ 3) by using the EM algorithm (3.3)–(3.5) and then calculate the estimated log-likelihood. We choose K=4 that maximizes the log-likelihood among all K values. These results are reported in Table 3.
Table 3

Finding the value of K by maximizing the log-likelihood function for m=3

| K | \(\hat {\pi }\) | \(\hat {\lambda }_{1}\) | \(\hat {\lambda }_{2}\) | \(\hat {\lambda }_{3}\) | Log-likelihood |
|---|---|---|---|---|---|
| 3 | 0.4037377 | 1.9887834 | 4.2987810 | 5.9287793 | −650.5100604 |
| 4 | 0.3203847 | 1.9184570 | 4.2284540 | 5.8584519 | −649.8799727 |
| 5 | 0.2583191 | 1.9084000 | 4.2183967 | 5.8483944 | −650.1021029 |
| 6 | 0.2137641 | 1.9174109 | 4.2274075 | 5.8574051 | −650.1044541 |

For this fixed value of K=4, we first calculate the MLEs of (π,λ 1,λ 2,λ 3) by using the EM algorithm (3.3)–(3.5), the bootstrap stds of these MLEs, the corresponding MSEs and two 95 % bootstrap CIs of these parameters with G=1000 by the bootstrap method presented in Section 3.2. Then, we independently repeat the above process 1000 times. The resulting average MLE, std, MSE and two CPs based on the normal-based and non-normal-based bootstrap samples, respectively, are displayed in Table 4.
Table 4

The average MLE, std, MSE and two CPs of (π,λ 1,λ 2,λ 3) for m=3 and K=4

| Parameter | True value | Average MLE | Average std B | Average MSE | CP (normal-based) | CP (non-normal-based) |
|---|---|---|---|---|---|---|
| π | 0.3 | 0.320384 | 0.0541052 | 0.00295 | 0.937 | 0.939 |
| λ 1 | 2 | 1.918457 | 0.222460 | 0.05689 | 0.925 | 0.932 |
| λ 2 | 4 | 4.228454 | 0.242082 | 0.06476 | 0.921 | 0.925 |
| λ 3 | 6 | 5.858451 | 0.255456 | 0.09402 | 0.954 | 0.948 |

std B: The sample standard deviation for the bootstrap samples

CP (normal-based): Normal-based bootstrap CP

CP (non-normal-based): Non-normal-based bootstrap CP

From Table 4, we can see that the average MSEs of \(\hat {\pi }\) and \((\hat {\lambda }_{1}, \hat {\lambda }_{2}, \hat {\lambda }_{3})\) are very small. The two bootstrap coverage probabilities are close to 0.95.

Two real examples

7.1 Students’ absenteeism data

In this section, we use the data set on the number of absences of 113 students from a lecture course in two successive semesters reported by Karlis (2003) to illustrate the proposed statistical methods for the multivariate ZTCS distribution. Let W 1 denote the number of absences in the first semester and W 2 denote the number of absences in the second semester. The data are displayed in Table 5 below.
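Table 5

Cross-tabulation of the students’ absenteeism data (Karlis 2003)

| W 1 \ W 2 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 15 | 10 | 4 | 4 | 2 | 0 | 0 | 0 | 0 | 35 |
| 1 | 6 | 11 | 9 | 4 | 2 | 0 | 0 | 0 | 0 | 32 |
| 2 | 5 | 7 | 6 | 5 | 0 | 0 | 0 | 0 | 0 | 23 |
| 3 | 1 | 3 | 2 | 4 | 3 | 1 | 0 | 0 | 0 | 14 |
| 4 | 1 | 0 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 4 |
| 5 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 2 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Total | 29 | 31 | 23 | 17 | 8 | 2 | 3 | 0 | 0 | 113 |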

For the purpose of illustration, we artificially remove the (0,0) cell counts from Table 5 and the updated data are shown in Table 6.
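Table 6

The number of absences of 113 students from a course in two successive semesters without the (0, 0) cell counts (Karlis 2003)

| W 1 \ W 2 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 |  | 10 | 4 | 4 | 2 | 0 | 0 | 0 | 0 | 20 |
| 1 | 6 | 11 | 9 | 4 | 2 | 0 | 0 | 0 | 0 | 32 |
| 2 | 5 | 7 | 6 | 5 | 0 | 0 | 0 | 0 | 0 | 23 |
| 3 | 1 | 3 | 2 | 4 | 3 | 1 | 0 | 0 | 0 | 14 |
| 4 | 1 | 0 | 2 | 0 | 1 | 0 | 0 | 0 | 0 | 4 |
| 5 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 2 |
| 6 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| 9 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Total | 14 | 31 | 23 | 17 | 8 | 2 | 3 | 0 | 0 | 98 |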

Let \(\textbf{w}_{j}=(W_{1j},W_{2j})^{\!\top\!} \stackrel{\text{iid}}{\sim} \text{ZTCS}(K,\pi;\lambda_{1},\lambda_{2})\) for j=1,…,n with n=98. Let \({\boldsymbol{w}}_{j}=(w_{1j},w_{2j})^{\!\top\!}\) denote the realization of the random vector \(\textbf{w}_{j}\), and let \(Y_{\text {obs}} = \{{\boldsymbol {w}}_{j}\}_{j=1}^{n}\) be the observed data. The parameter K of the binomial distribution is treated as unknown and is also estimated. Based on the data in Table 6, for different K values we first calculate the MLEs of π and (λ 1,λ 2) by using the EM algorithm (3.3)–(3.5) and then calculate the estimated values of the log-likelihood function. These results are reported in Table 7.
Table 7

Finding the value of K by maximizing the log-likelihood function for fitting the data of Table 6 by the multivariate ZTCS distribution

| K | \(\hat {\pi }\) | \(\hat {\lambda }_{1}\) | \(\hat {\lambda }_{2}\) | Log-likelihood |
|---|---|---|---|---|
| 2 | 0.1220034 | 1.394314 | 1.600330 | −328.8195 |
| 3 | 0.1024329 | 1.326026 | 1.531414 | −328.1680 |
| 4 | 0.0869704 | 1.281892 | 1.486834 | −327.7241 |
| 5 | 0.0748624 | 1.252965 | 1.457593 | −327.4137 |
| 8 | 0.0517887 | 1.208834 | 1.412942 | −326.8897 |
| 9 | 0.0468272 | 1.200906 | 1.404915 | −326.7862 |
| 10 | 0.0427041 | 1.194675 | 1.398604 | −326.7022 |
| 14 | 0.0314939 | 1.179167 | 1.382890 | −326.4824 |
| 15 | 0.0295422 | 1.176680 | 1.380369 | −326.4453 |
| 20 | 0.0225358 | 1.168154 | 1.371725 | −326.3146 |
| 30 | 0.0152633 | 1.160049 | 1.363504 | −326.1831 |
| 50 | 0.0092682 | 1.153806 | 1.357169 | −326.0775 |
| 75 | 0.0062141 | 1.150805 | 1.354123 | −326.0247 |
| 100 | 0.0046735 | 1.149330 | 1.352626 | −325.9982 |
| 150 | 0.0031242 | 1.147871 | 1.351145 | −325.9718 |
| 250 | 0.0018786 | 1.146717 | 1.349972 | −325.9507 |
| 350 | 0.0013431 | 1.146225 | 1.349473 | −325.9416 |

We should choose the K that maximizes the log-likelihood among all K values. From Table 7, we observe that the log-likelihood increases monotonically with K. On the other hand, K must be larger than or equal to max(W 1,W 2). From Table 6, we have max(W 1,W 2)=9. To illustrate how to obtain the confidence intervals of the parameters, it seems reasonable to choose K=10. With G=6000 bootstrap replications, we calculate the bootstrap average MLEs, the bootstrap stds of \((\hat {\pi }, \hat {\lambda }_{1}, \hat {\lambda }_{2})\) and two 95 % bootstrap CIs of (π,λ 1,λ 2). These results are listed in Table 8.
Table 8

MLEs and confidence intervals of parameters for the students’ absenteeism data

| Parameter | MLE B | std B | 95 % bootstrap CI (normal-based) | 95 % bootstrap CI (non-normal-based) |
|---|---|---|---|---|
| π | 0.043961 | 0.017672 | [0.009325, 0.078598] | [0.007857, 0.078343] |
| λ 1 | 1.175550 | 0.210425 | [0.763118, 1.587982] | [0.785992, 1.606149] |
| λ 2 | 1.377584 | 0.218819 | [0.948699, 1.806468] | [0.965662, 1.824147] |

MLE B: The average MLE for the bootstrap samples

std B: The sample standard deviation for the bootstrap samples

CI (normal-based): Normal-based bootstrap CI

CI (non-normal-based): Non-normal-based bootstrap CI

7.2 Road accident data of Athens

The numbers of accidents on 24 roads of Athens for the period 1987–1991 were reported and analyzed by Karlis (2003) with a multivariate Poisson distribution. Since only accidents that caused injuries are included, as shown in Table 9, we fit the data set by the multivariate ZTCS model.
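Table 9

Accident data of 24 roads in Athens for the period 1987–1991 (Karlis 2003)

| Road | j | 1987 | 1988 | 1989 | 1990 | 1991 | Length (km), t j |
|---|---|---|---|---|---|---|---|
| Akadimias | 1 | 11 | 33 | 25 | 23 | 6 | 1.2 |
| Alexandras | 2 | 41 | 63 | 91 | 77 | 29 | 2.6 |
| Amfitheas | 3 | 5 | 35 | 44 | 21 | 13 | 2.4 |
| Aharnon | 4 | 44 | 79 | 91 | 88 | 33 | 5.5 |
| Vas. Olgas | 5 | 5 | 3 | 4 | 4 | 0 | 0.5 |
| Vas. Konstantinou | 6 | 8 | 15 | 26 | 13 | 7 | 1.3 |
| Vas. Sofias | 7 | 34 | 63 | 81 | 67 | 23 | 2.6 |
| Vouliagmenis | 8 | 17 | 16 | 24 | 24 | 4 | 2.1 |
| G’ Septemvriou | 9 | 16 | 24 | 30 | 30 | 13 | 1.7 |
| Galatsioy | 10 | 13 | 13 | 15 | 17 | 9 | 1.1 |
| Iera Odos | 11 | 7 | 15 | 20 | 19 | 8 | 2.7 |
| Kalirois | 12 | 15 | 24 | 39 | 32 | 7 | 2.6 |
| Katehaki | 13 | 2 | 3 | 27 | 24 | 7 | 1.4 |
| Kifisias | 14 | 22 | 23 | 38 | 22 | 11 | 1.4 |
| Kifisou | 15 | 38 | 48 | 60 | 53 | 24 | 7.9 |
| Leof. Kavalas | 16 | 4 | 6 | 12 | 9 | 3 | 2.0 |
| Lenorman | 17 | 19 | 30 | 37 | 48 | 22 | 2.0 |
| Leof. Athinon | 18 | 15 | 11 | 16 | 21 | 28 | 6.1 |
| Mesogeion | 19 | 20 | 30 | 33 | 28 | 9 | 1.5 |
| P. Ralli | 20 | 13 | 14 | 13 | 17 | 9 | 2.6 |
| Panepistimiou | 21 | 24 | 58 | 40 | 36 | 5 | 1.1 |
| Patision | 22 | 80 | 108 | 114 | 113 | 86 | 4.1 |
| Peiraios | 23 | 86 | 89 | 109 | 90 | 49 | 8.0 |
| Sigrou | 24 | 60 | 61 | 87 | 86 | 29 | 4.8 |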

Let \(\textbf{w}_{j}=(W_{1j},\ldots,W_{5j})^{\!\top\!} \stackrel{\text{iid}}{\sim} \text{ZTCS}(K,\pi;\lambda_{1},\ldots,\lambda_{5})\), where \(W_{1j},\ldots,W_{5j}\) denote the average numbers of accidents reported in the j-th road per kilometer from 1987 to 1991, respectively, for j=1,…,n (n=24). For example, when j=1, we have t j =t 1=1.2 and
$$ (W_{11}, \ldots, W_{51})^{\!\top\!} = (11, 33, 25, 23, 6)^{\!\top\!}/1.2. $$
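The per-kilometer rates for all 24 roads can be obtained in the same way. The snippet below is a small illustration using the counts and lengths of Table 9; the variable names are ours.

```python
import math

# (road, yearly counts 1987-1991, length in km), transcribed from Table 9
ROADS = [
    ("Akadimias", (11, 33, 25, 23, 6), 1.2),
    ("Alexandras", (41, 63, 91, 77, 29), 2.6),
    ("Amfitheas", (5, 35, 44, 21, 13), 2.4),
    ("Aharnon", (44, 79, 91, 88, 33), 5.5),
    ("Vas. Olgas", (5, 3, 4, 4, 0), 0.5),
    ("Vas. Konstantinou", (8, 15, 26, 13, 7), 1.3),
    ("Vas. Sofias", (34, 63, 81, 67, 23), 2.6),
    ("Vouliagmenis", (17, 16, 24, 24, 4), 2.1),
    ("G' Septemvriou", (16, 24, 30, 30, 13), 1.7),
    ("Galatsioy", (13, 13, 15, 17, 9), 1.1),
    ("Iera Odos", (7, 15, 20, 19, 8), 2.7),
    ("Kalirois", (15, 24, 39, 32, 7), 2.6),
    ("Katehaki", (2, 3, 27, 24, 7), 1.4),
    ("Kifisias", (22, 23, 38, 22, 11), 1.4),
    ("Kifisou", (38, 48, 60, 53, 24), 7.9),
    ("Leof. Kavalas", (4, 6, 12, 9, 3), 2.0),
    ("Lenorman", (19, 30, 37, 48, 22), 2.0),
    ("Leof. Athinon", (15, 11, 16, 21, 28), 6.1),
    ("Mesogeion", (20, 30, 33, 28, 9), 1.5),
    ("P. Ralli", (13, 14, 13, 17, 9), 2.6),
    ("Panepistimiou", (24, 58, 40, 36, 5), 1.1),
    ("Patision", (80, 108, 114, 113, 86), 4.1),
    ("Peiraios", (86, 89, 109, 90, 49), 8.0),
    ("Sigrou", (60, 61, 87, 86, 29), 4.8),
]

# W_ij = (count of road j in year i) / t_j
rates = [[count / length for count in counts] for _, counts, length in ROADS]

max_rate = max(max(row) for row in rates)
smallest_admissible_k = math.ceil(max_rate)  # smallest integer K >= max W_ij
print(round(max_rate, 1), smallest_admissible_k)
```

The largest rate comes from Panepistimiou in 1988 (58 accidents over 1.1 km).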
The unknown parameter K is assumed to be a positive integer. Based on the data in Table 9, for different K values we first calculate the MLEs of π and \(\boldsymbol{\lambda}=(\lambda_{1},\ldots,\lambda_{5})^{\!\top\!}\) by using the EM algorithm (3.3)–(3.5) and then calculate the estimated values of the log-likelihood function. These results are reported in Table 10.
Table 10

Finding the value of K by maximizing the log-likelihood function for fitting the data of Table 9 by the multivariate ZTCS distribution

| K | \(\hat {\pi }\) | \(\hat {\lambda }_{1}\) | \(\hat {\lambda }_{2}\) | \(\hat {\lambda }_{3}\) | \(\hat {\lambda }_{4}\) | \(\hat {\lambda }_{5}\) | Log-likelihood |
|---|---|---|---|---|---|---|---|
| 2 | 0.396 | 8.405 | 13.520 | 16.667 | 14.513 | 5.262 | −498.992 |
| 3 | 0.383 | 8.046 | 13.161 | 16.308 | 14.154 | 4.903 | −492.749 |
| 4 | 0.372 | 7.707 | 12.822 | 15.969 | 13.815 | 4.564 | −487.185 |
| 5 | 0.349 | 7.452 | 12.567 | 15.714 | 13.559 | 4.309 | −482.820 |
| 8 | 0.292 | 6.858 | 11.973 | 15.120 | 12.966 | 3.715 | −473.529 |
| 9 | 0.276 | 6.710 | 11.825 | 14.972 | 12.818 | 3.567 | −471.336 |
| 10 | 0.261 | 6.585 | 11.700 | 14.847 | 12.693 | 3.442 | −469.490 |
| 14 | 0.209 | 6.271 | 11.387 | 14.533 | 12.379 | 3.128 | −464.604 |
| 15 | 0.198 | 6.226 | 11.341 | 14.488 | 12.333 | 3.083 | −463.810 |
| 20 | 0.155 | 6.092 | 11.207 | 14.354 | 12.200 | 2.949 | −461.170 |
| 30 | 0.106 | 5.997 | 11.112 | 14.259 | 12.105 | 2.854 | −458.806 |
| 50 | 0.065 | 5.944 | 11.059 | 14.206 | 12.051 | 2.800 | −457.125 |
| 51 | 0.064 | 5.942 | 11.057 | 14.204 | 12.050 | 2.799 | −457.078 |
| 52 | 0.062 | 5.941 | 11.056 | 14.203 | 12.049 | 2.798 | −457.033 |
| 53 | 0.061 | 5.940 | 11.055 | 14.202 | 12.047 | 2.797 | −456.990 |
| 54 | 0.060 | 5.939 | 11.054 | 14.201 | 12.046 | 2.795 | −456.949 |
| 75 | 0.043 | 5.923 | 11.038 | 14.185 | 12.030 | 2.779 | −456.350 |
| 100 | 0.032 | 5.913 | 11.028 | 14.175 | 12.021 | 2.770 | −455.978 |
| 150 | 0.021 | 5.905 | 11.020 | 14.167 | 12.012 | 2.762 | −455.617 |
| 250 | 0.013 | 5.898 | 11.014 | 14.160 | 12.006 | 2.755 | −455.334 |
| 350 | 0.009 | 5.896 | 11.011 | 14.158 | 12.003 | 2.753 | −455.215 |

We should choose the K that maximizes the log-likelihood among all K values. From Table 10, we observe that the log-likelihood increases monotonically with K. On the other hand, K must be larger than or equal to max{W ij : 1≤i≤5, 1≤j≤24}. From Table 9, we have max{W ij : 1≤i≤5, 1≤j≤24}=52.7. To illustrate how to obtain the confidence intervals of the parameters, it seems reasonable to choose K=53. With G=6000 bootstrap replications, we calculate the bootstrap average MLEs, the bootstrap stds of \((\hat {\pi }, \hat {\lambda }_{1}, \ldots, \hat {\lambda }_{5})\) and two 95 % bootstrap CIs of (π,λ 1,…,λ 5). These results are reported in Table 11.
Table 11

MLEs and confidence intervals of parameters for the road accident data of Athens

| Parameter | MLE B | std B | 95 % bootstrap CI (normal-based) | 95 % bootstrap CI (non-normal-based) |
|---|---|---|---|---|
| π | 0.0663 | 0.017 | [0.0327, 0.1000] | [0.0356, 0.1027] |
| λ 1 | 5.8610 | 1.217 | [3.4750, 8.2470] | [3.8330, 8.4930] |
| λ 2 | 10.926 | 2.347 | [6.3260, 15.526] | [7.1450, 16.313] |
| λ 3 | 14.118 | 1.824 | [10.543, 17.694] | [10.844, 18.022] |
| λ 4 | 11.954 | 1.624 | [8.7700, 15.138] | [9.2400, 15.478] |
| λ 5 | 2.7533 | 0.604 | [1.5676, 3.9390] | [1.7988, 4.0889] |

MLE B: The average MLE for the bootstrap samples

std B: The sample standard deviation for the bootstrap samples

CI (normal-based): Normal-based bootstrap CI

CI (non-normal-based): Non-normal-based bootstrap CI

Based on the data in Table 9, we calculate the sample correlation coefficient matrix, which is given by
$${\fontsize{9}{6}\begin{aligned} \textbf{R} = \left(\begin{array}{ccccc} 1.0000 & 0.8038 & 0.7643 & 0.8089 & 0.5746 \\ [2pt] 0.8038 & 1.0000 & 0.8326 & 0.8297 & 0.4084 \\ [2pt] 0.7643 & 0.8326 & 1.0000 & 0.9058 & 0.5768 \\ [2pt] 0.8089 & 0.8297 & 0.9058 & 1.0000 & 0.6557 \\ [2pt] 0.5746 & 0.4084 & 0.5768 & 0.6557 & 1.0000 \end{array} \right), \end{aligned}} $$
while the population correlation coefficient matrix ρ, based on (2.6) is estimated to be
$$ \boldsymbol{\hat{\rho}} = \left(\begin{array}{ccccc} 1.0000 & 0.2703 & 0.2444 & 0.2612 & 0.4199 \\ [2mm] 0.2703 & 1.0000 & 0.1951 & 0.2085 & 0.3352 \\ [2mm] 0.2444 & 0.1951 & 1.0000 & 0.1886 & 0.3031 \\ [2mm] 0.2612 & 0.2085 & 0.1886 & 1.0000 & 0.3240 \\ [2mm] 0.4199 & 0.3352 & 0.3031 & 0.324 & 1.0000 \end{array} \right). $$
It can be easily seen that \(\boldsymbol {\hat {\rho }}\) is very close to R.

Concluding remarks

In this paper, we first proposed the multivariate ZTCS distribution and studied its distributional properties. Since the joint marginal distribution of any r-dimensional sub-vector of an m-dimensional multivariate ZTCS random vector is an r-dimensional zero-deflated Charlier series (ZDCS) distribution (1≤r<m), we then proposed the multivariate ZACS distribution. It is noted that the multivariate ZTCS distribution is a special case of the multivariate ZACS distribution. The EM algorithm is used to obtain the MLEs of the parameters in the multivariate ZACS distribution. The multivariate ZTCS distribution can be used when other distributions, such as the multivariate zero-truncated Poisson distribution, do not fit a real data set well. Meanwhile, the multivariate ZACS distribution, as a more general form, can be used in a much wider range of settings. It can be a good substitute for the Type II multivariate ZTP distribution (Tian et al. 2014).

Endnote

1 A discrete random variable X is said to have the general finite distribution, denoted by \(X \sim \text{Finite}(x_{l}, p_{l};\; l=1,\ldots,n)\), if \(\Pr(X=x_{l})=p_{l} \in [0,1]\) and \(\sum _{l=1}^{n} p_{l}=1\).

Declarations

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

(1)
Department of Statistics and Actuarial Science, The University of Hong Kong

References

  1. Karlis, D: An EM algorithm for multivariate Poisson distribution and related models. J. Appl. Stat. 30(1), 63–77 (2003).
  2. Loukas, S: Some methods of estimation for a trivariate Poisson distribution. Zastoswania Math. 21, 503–510 (1993).
  3. Loukas, S, Papageorgiou, H: On a trivariate Poisson distribution. Appl. Math. 36, 432–439 (1991).
  4. Ong, SH: A discrete Charlier series distribution. Biom. J. 30(8), 1003–1009 (1988).
  5. Papageorgiou, H, Loukas, S: A bivariate discrete Charlier Series distribution. Biom. J. 37(1), 105–117 (1995).
  6. Tian, GL, Liu, Y, Tan, MT: Type II multivariate zero-truncated/adjusted Poisson models and their applications. Technical Report at Department of Statistics and Actuarial Science. The University of Hong Kong (2014).

Copyright

© Ding et al. 2015