A flexible univariate moving average time-series model for dispersed count data
Journal of Statistical Distributions and Applications volume 8, Article number: 1 (2021)
Abstract
Al-Osh and Alzaid (1988) consider a Poisson moving average (PMA) model to describe the relation among integer-valued time series data; this model, however, is constrained by the underlying equidispersion assumption for count data (i.e., that the variance and mean are equal). This work instead introduces a flexible integer-valued moving average model for count data that contain over- or underdispersion via the Conway-Maxwell-Poisson (CMP) distribution and related distributions. This first-order sum-of-Conway-Maxwell-Poissons moving average (SCMPMA(1)) model offers a generalizable construct that includes the PMA (among others) as a special case. We highlight the SCMPMA model properties and illustrate its flexibility via simulated data examples.
Introduction
Integer-valued thinning-based models have been proposed to model time series data represented as counts. Al-Osh and Alzaid (1988) introduce a generally defined integer-valued moving average (INMA) process as an analog to the moving average (MA) model for continuous data, which assumes an underlying Gaussian distribution. This INMA process instead utilizes a thinning operator that maintains an integer-valued range of possible outcomes. To form such a model, they consider the “survivals” of independent and identically distributed (iid) non-negative integer-valued random innovations to maintain and ensure discrete data outcomes (Weiss 2021). Al-Osh and Alzaid (1988) particularly consider a first-order Poisson moving average (PMA(1)), i.e. a stationary sequence {U_{t}} of the form U_{t}=γ∘ε_{t−1}+ε_{t}, where {ε_{t}} is a sequence of iid Poisson(η) random variables and \(\gamma \circ \epsilon = \sum _{i=1}^{\epsilon } B_{i}\) for a sequence of iid Bernoulli(γ) random variables {B_{i}} independent of {ε_{t}}. By design, the PMA(1) is an INMA whose maximum stay time in the sequence is two time units. Consequently, consecutive members of {U_{t}} are dependent, while the components ε_{t} and (γ∘ε_{t−1}) of U_{t} are independent.
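The PMA(1) construction above is straightforward to check by direct simulation. The following minimal sketch (an illustration using only the Python standard library, with hypothetical helper names, not the authors' code) builds U_t = γ∘ε_{t−1} + ε_t via Bernoulli thinning and compares the empirical mean, variance, and lag-one correlation against the theoretical values η(1+γ) and γ/(1+γ):

```python
import math
import random

def poisson_draw(lam, rng):
    """Poisson sampler (Knuth's method; adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def thin(count, gamma, rng):
    """Binomial thinning gamma o count: sum of `count` iid Bernoulli(gamma) draws."""
    return sum(1 for _ in range(count) if rng.random() < gamma)

def simulate_pma1(eta, gamma, n, seed=1):
    """Simulate U_t = gamma o eps_{t-1} + eps_t with eps_t ~ Poisson(eta)."""
    rng = random.Random(seed)
    eps = [poisson_draw(eta, rng) for _ in range(n + 1)]
    return [thin(eps[t - 1], gamma, rng) + eps[t] for t in range(1, n + 1)]

u = simulate_pma1(eta=1.0, gamma=0.5, n=20000)
mean = sum(x for x in u) / len(u)
var = sum((x - mean) ** 2 for x in u) / len(u)
cov1 = sum((u[t] - mean) * (u[t + 1] - mean) for t in range(len(u) - 1)) / (len(u) - 1)
# theory: E(U) = Var(U) = eta * (1 + gamma) = 1.5; lag-one corr = gamma / (1 + gamma) = 1/3
print(round(mean, 2), round(var, 2), round(cov1 / var, 2))
```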
Given the PMA(1) structure, the mean and variance are E(U_{t}) = Var(U_{t}) = (1+γ)η, and the covariance of consecutive variables is Cov(U_{t−1},U_{t})=γη; this implies that the lag-one correlation is \(\rho _{1} = \text {Corr}(U_{t-1}, U_{t}) = \frac {\gamma }{1+\gamma }\).
Meanwhile, the probability generating function (pgf) of U_{t} is \(\Phi _{U_{t}} (u) = e^{-\eta (1+\gamma)(1-u)}\), the joint pgf of {U_{1},…,U_{r}} is \(\Phi _{r}(u_{1}, \ldots, u_{r}) = \exp \left (-\eta \left [r + \gamma - (1-\gamma)\sum _{i=1}^{r}u_{i} - \gamma (u_{1} + u_{r}) - \gamma \sum _{i=1}^{r-1}u_{i} u_{i+1}\right ]\right)\) (which implies that time reversibility holds for the PMA), and the pgf of \(T_{U,r}= \sum _{i=1}^{r}U_{i}\) is \(\Phi _{T_{U,r}}(u) = \exp \left (-\eta \left [r + \gamma - ((1-\gamma) r + 2\gamma) u - \gamma (r-1) u^{2}\right ]\right)\).
Al-Osh and Alzaid (1988) note that T_{U,r} does not have a Poisson distribution, which is in contrast to the standard MA(1) process. The conditional mean and variance of U_{t+1} given U_{t}=u are both linear in U_{t}, namely \(E(U_{t+1} \mid U_{t}=u) = \eta + \frac {\gamma }{1+\gamma }u\) and \(\text {Var}(U_{t+1} \mid U_{t}=u) = \eta + \frac {\gamma }{(1+\gamma)^{2}}u\).
The PMA is a natural choice for modeling an integer-valued process, in part because of its tractability (Al-Osh and Alzaid 1988). This model, however, is limited by its constraining equidispersion property, i.e. the assumption that the mean and variance of the underlying process are equal. Real data do not generally conform to this construct (Hilbe 2014; Weiss 2018); they usually display overdispersion relative to the Poisson model (i.e. where the variance is greater than the mean), yet integer-valued data that express underdispersion relative to the Poisson (i.e. where the variance is less than the mean) are surfacing with greater frequency. Accordingly, it would be fruitful to instead consider a flexible time series model that can accommodate data over- and/or underdispersion.
Alzaid and Al-Osh (1993) introduce a first-order generalized Poisson moving average (GPMA(1)) process as an alternative to the PMA. The associated model has the form \(W_{t} = Q^{*}_{t}\left (\epsilon ^{*}_{t-1}\right) + \epsilon ^{*}_{t},\)
where \(\left \{\epsilon ^{*}_{t}\right \}\) is a sequence of iid generalized Poisson GP(μ^{∗},θ) random variables, and \(\{Q^{*}_{t}(\cdot)\}\) is a sequence of quasi-binomial QB(p^{∗},θ/μ^{∗},·) random operators independent of \(\{\epsilon ^{*}_{t}\}\). As with the PMA, W_{t+r} and W_{t} are independent for r>1. The marginal distribution of W_{t} is GP((1+p^{∗})μ^{∗},θ). Recognizing the relationship between moving average and autoregressive models, Alzaid and Al-Osh (1993) equate terms in this GPMA(1) model to their first-order generalized Poisson autoregressive (GPAR(1)) counterpart, \(W_{t} = Q_{t}(W_{t-1}) + \epsilon _{t},\)
where {ε_{t}} is a sequence of iid GP(qμ,θ) random variables where q=1−p, and {Q_{t}(·)} is a sequence of QB(p,θ/μ,·) random operators, independent of {ε_{t}}; i.e., they let μ=(1+p^{∗})μ^{∗} and \(p = \frac {p^{*}}{1+p^{*}}\). The bivariate pgf of W_{t+1} and W_{t} can thus be represented as
where A_{θ}(s) is the inverse function that satisfies A_{θ}(se^{−θ(s−1)})=s; see Alzaid and Al-Osh (1993). This substitution in Eq. (7) to obtain Eq. (8) further illustrates the relationship between the GPMA(1) and GPAR(1) models, namely that they share the same joint pgf. Eq. (8) and the related GPAR work of Alzaid and Al-Osh (1993) therefore show that
The joint pgf of \((W_{t}, W_{t-1}, \dots, W_{t-r+1})\) is given by
From the joint pgf, we see that the GPMA(1) is also time-reversible, because it has the same dynamics if time is reversed. Further, the pgf associated with the total counts occurring during time lag r (i.e. \(T_{W,r}=\sum _{i=1}^{r}W_{t-r+i}\)) is \(\Phi _{T_{W,r}}(u)=\exp \left [\mu q r(A_{\theta }(u)-1) + \mu p(r-1)(A_{\theta }(u^{2})-1)\right ]\). Alzaid and Al-Osh (1993) note that this result extends the analogous PMA result to the broader GPMA(1) model. Finally, the GPMA autocorrelation function is
where p=p^{∗}(1+p^{∗})^{−1}; by definition, ρ_{W}(r)∈[0,0.5] (Alzaid and Al-Osh 1993).
Even though the GPMA can be considered to model over- or underdispersed count time series, it may not be a viable option for count data that express extreme underdispersion; see, e.g. Famoye (1993). This work instead introduces another alternative for modeling integer-valued time series data. The manuscript proceeds as follows. We first provide background regarding the probability distributions that motivate the development of our flexible INMA model. Then, we introduce the SCMPMA(1) model and discuss its statistical properties. The subsequent section illustrates the model flexibility through simulated and real data examples. Finally, the manuscript concludes with discussion.
Motivating distributions
While the above constructs show increased ability and improvement towards modeling integer-valued time series data with various forms of dispersion, each of the models suffers from respective limitations. In order to develop and describe our SCMPMA(1), we first introduce its underlying motivating distributions: the CMP distribution and its generalized sum-of-CMPs distribution (sCMP), as well as the Conway-Maxwell-Binomial (CMB) along with a generalized CMB (gCMB) distribution.
The Conway-Maxwell-Poisson distribution and its generalization
The Conway-Maxwell-Poisson (CMP) distribution (introduced by Conway and Maxwell (1962), and revived by Shmueli et al. (2005)) is a viable count distribution that generalizes the Poisson distribution in light of potential data dispersion. The CMP probability mass function (pmf) takes the form \(P(X = x) = \frac {\lambda ^{x}}{(x!)^{\nu } \zeta (\lambda, \nu)}, \; x = 0, 1, 2, \ldots,\)
for a random variable X, where λ=E(X^{ν})≥0, ν≥0 is the associated dispersion parameter, and \(\zeta (\lambda, \nu) = \sum _{s=0}^{\infty } \frac {\lambda ^{s}}{(s!)^{\nu }}\) is the normalizing constant. The CMP distribution includes three well-known distributions as special cases, namely the Poisson (ν=1), geometric (ν=0, λ<1), and Bernoulli \(\left (\nu \rightarrow \infty \text { with success probability } \frac {\lambda }{1+\lambda } \right)\) distributions.
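As a quick numerical check, the CMP pmf can be computed by truncating the infinite series for ζ(λ,ν). The sketch below (an illustration, not the authors' code) works on the log scale to avoid overflow and verifies the Poisson and geometric special cases:

```python
import math

def cmp_pmf(x, lam, nu, terms=200):
    """CMP(lam, nu) pmf via a truncated normalizing constant zeta(lam, nu)."""
    # log of the s-th series term: s*log(lam) - nu*log(s!)
    log_wt = lambda s: s * math.log(lam) - nu * math.lgamma(s + 1)
    zeta = sum(math.exp(log_wt(s)) for s in range(terms))
    return math.exp(log_wt(x)) / zeta

# nu = 1 recovers Poisson(lam)
assert abs(cmp_pmf(3, 2.0, 1.0) - math.exp(-2) * 2.0**3 / math.factorial(3)) < 1e-9
# nu = 0, lam < 1 recovers the geometric: P(X = x) = (1 - lam) * lam**x
assert abs(cmp_pmf(3, 0.4, 0.0) - (1 - 0.4) * 0.4**3) < 1e-9
# the truncated pmf still sums to (numerically) one
assert abs(sum(cmp_pmf(x, 1.5, 2.0) for x in range(50)) - 1.0) < 1e-9
```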
The associated pgf of X is \(\Phi _{X}(u) = E(u^{X}) = \frac {\zeta (\lambda u, \nu)}{\zeta (\lambda, \nu)}\), and its moment generating function (mgf) is \(\mathrm {M}_{X}(u) = E(e^{Xu}) = \frac {\zeta (\lambda e^{u}, \nu)}{\zeta (\lambda, \nu)}\). The moments can meanwhile be represented recursively as \(E(X^{r+1}) = \lambda E\left [(X+1)^{1-\nu }\right ]\) for r=0, and \(E(X^{r+1}) = \lambda \frac {\partial }{\partial \lambda }E(X^{r}) + E(X)E(X^{r})\) for r≥1.
In particular, the expected value and variance can be written in the form and approximated respectively as \(E(X) = \lambda \frac {\partial \log \zeta (\lambda, \nu)}{\partial \lambda } \approx \lambda ^{1/\nu } - \frac {\nu - 1}{2\nu }\) and \(\text {Var}(X) = \frac {\partial E(X)}{\partial \log \lambda } \approx \frac {1}{\nu }\lambda ^{1/\nu },\)
where the approximations are especially good for ν≤1 or λ>10^{ν} (Shmueli et al. 2005). This distribution is a member of the exponential family, where the joint pmf of the random sample x=(x_{1},…,x_{N}) is \(f(\boldsymbol {x} \mid \lambda, \nu) = \lambda ^{S_{1}} e^{-\nu S_{2}} \zeta ^{-N}(\lambda, \nu),\)
where \(S_{1} = \sum _{i=1}^{N}x_{i}\) and \(S_{2} = \sum _{i=1}^{N} \log (x_{i}!)\) are joint sufficient statistics for λ and ν. Further, because the CMP distribution belongs to the exponential family, the conjugate prior distribution has the form h(λ,ν)=λ^{a−1}e^{−νb}ζ^{−c}(λ,ν)δ(a,b,c), where λ>0, ν≥0, and δ(a,b,c) is a normalizing constant such that \(\delta ^{-1}(a,b,c) = \int _{0}^{\infty } \int _{0}^{\infty } \lambda ^{a-1} e^{-b \nu } \zeta ^{-c}(\lambda, \nu)\, d\lambda \, d\nu < \infty \).
Meanwhile, letting \(X_{*} = \sum _{i=1}^{n} X_{i}\) for iid random variables X_{i}∼CMP(λ,ν), i=1,…,n, we say that X_{∗} is distributed as a sum-of-CMPs [denoted sCMP(λ,ν,n)] random variable, and has the pmf \(P(X_{*} = x_{*}) = \frac {\lambda ^{x_{*}}}{(x_{*}!)^{\nu } \zeta ^{n}(\lambda, \nu)} \sum _{a_{1} + \cdots + a_{n} = x_{*}} {{x_{*}} \choose a_{1}, \, \cdots, \, a_{n}}^{\nu }, \; x_{*} = 0, 1, 2, \ldots,\)
where ζ^{n}(λ,ν) is the nth power of ζ(λ,ν), and \({{x_{*}} \choose a_{1}, \hspace {0.05in} \cdots, \hspace {0.05in} a_{n}} = \frac {{x_{*}}!}{a_{1}! \cdots a_{n}!}\) is a multinomial coefficient. The sCMP(λ,ν,n) distribution encompasses the Poisson distribution with rate parameter nλ (for ν=1), negative binomial(n,1−λ) distribution (for ν=0 and λ<1), and Binomial(n,p) distribution \(\left (\text {as}\ \nu \rightarrow \infty \text { with success probability} p=\frac {\lambda }{\lambda + 1}\right)\) as special cases. Further, for n=1, the sCMP(λ,ν,n=1) is simply the CMP(λ,ν) distribution.
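The sCMP pmf can be sanity-checked numerically: summing n iid CMP variables corresponds to an n-fold convolution of the CMP pmf, and for ν=1 the result should collapse to Poisson(nλ). A small illustrative sketch (assumed helper names, truncated support):

```python
import math

def cmp_pmf_vec(lam, nu, support):
    """CMP(lam, nu) pmf on {0, ..., support-1}, via a truncated zeta."""
    logw = [s * math.log(lam) - nu * math.lgamma(s + 1) for s in range(support)]
    z = sum(math.exp(v) for v in logw)
    return [math.exp(v) / z for v in logw]

def convolve(p, q):
    """Distribution of the sum of two independent counts with pmfs p, q."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def scmp_pmf(lam, nu, n, support=60):
    """pmf of X* = X_1 + ... + X_n for X_i iid CMP(lam, nu), by convolution."""
    p = cmp_pmf_vec(lam, nu, support)
    out = p
    for _ in range(n - 1):
        out = convolve(out, p)
    return out

# special case nu = 1: sCMP(lam, 1, n) should match Poisson(n * lam)
pmf = scmp_pmf(0.5, 1.0, 3)
poisson = lambda k, m: math.exp(-m) * m ** k / math.factorial(k)
assert all(abs(pmf[k] - poisson(k, 1.5)) < 1e-9 for k in range(20))
```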
The mgf and pgf for a sCMP(λ,ν,n) random variable X_{∗} are \(\mathrm {M}_{X_{*}}(u) = \left (\frac {\zeta (\lambda e^{u}, \nu)}{\zeta (\lambda, \nu)}\right)^{n}\) and \(\Phi _{X_{*}}(u) = \left (\frac {\zeta (\lambda u, \nu)}{\zeta (\lambda, \nu)}\right)^{n},\)
respectively; accordingly, the sCMP(λ,ν,n) has mean E(X_{∗})=nE(X) and variance V(X_{∗})=nV(X), where E(X) and V(X) are defined in Eqs. (13)-(14), respectively. Closure under addition holds for two independent sCMP random variables with the same rate and dispersion parameters. See Sellers et al. (2017) for additional information regarding the sCMP distribution.
The Conway-Maxwell-Binomial distribution and its generalization
The Conway-Maxwell-Binomial distribution of Kadane (2016) (also known as the Conway-Maxwell-Poisson-Binomial distribution by Borges et al. (2014)) is a three-parameter generalization of the Binomial distribution. Denoted CMB(d,p,ν), its pmf is \(P(Y = y) = \frac {{d \choose y}^{\nu } p^{y} (1-p)^{d-y}}{\chi (p, \nu, d)}, \; y = 0, 1, \ldots, d,\)
for some random variable Y where \(0 \le p \le 1, \nu \in \mathbb {R}\), and \(\chi (p, \nu, d) = \sum _{y=0}^{d} {d \choose y}^{\nu } p^{y}(1p)^{dy}\) is the associated normalizing constant. The Binomial(d,p) distribution is the special case of the CMB(d,p,ν) where ν=1. Meanwhile, ν>(<)1 corresponds to underdispersion (overdispersion) relative to the Binomial distribution. For ν→∞, the pmf is concentrated on the point dp while, for ν→−∞, the pmf is concentrated at 0 or d. For independent X_{i}∼CMP(λ_{i},ν),i=1,2, the conditional distribution of X_{1} given that X_{1}+X_{2}=d has a \(\text {CMB}\left (d, \frac {\lambda _{1}}{\lambda _{1}+\lambda _{2}}, \nu \right)\) distribution.
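Both the CMB pmf and the stated CMP conditional property are easy to verify numerically. The sketch below (illustrative only) checks that ν=1 recovers the Binomial, and that the conditional distribution of X_1 given X_1+X_2=d for independent CMP variables matches CMB(d, λ_1/(λ_1+λ_2), ν):

```python
import math

def cmb_pmf(y, d, p, nu):
    """CMB(d, p, nu) pmf; the denominator is the normalizing constant chi(p, nu, d)."""
    wt = lambda k: math.comb(d, k) ** nu * p ** k * (1 - p) ** (d - k)
    return wt(y) / sum(wt(k) for k in range(d + 1))

# nu = 1 recovers Binomial(d, p)
assert abs(cmb_pmf(2, 5, 0.3, 1.0) - math.comb(5, 2) * 0.3**2 * 0.7**3) < 1e-12

# conditional property: for independent X_i ~ CMP(lam_i, nu),
# X_1 | (X_1 + X_2 = d) ~ CMB(d, lam1 / (lam1 + lam2), nu)
lam1, lam2, nu, d = 1.2, 0.8, 1.7, 6
w = [math.exp(y * math.log(lam1) - nu * math.lgamma(y + 1)
              + (d - y) * math.log(lam2) - nu * math.lgamma(d - y + 1))
     for y in range(d + 1)]
cond = [v / sum(w) for v in w]  # normalized conditional pmf
assert all(abs(cond[y] - cmb_pmf(y, d, lam1 / (lam1 + lam2), nu)) < 1e-9
           for y in range(d + 1))
```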
The pgf and mgf of Y have the form \(\Phi _{Y}(u) = \frac {\tau (\theta _{*} u, \nu, d)}{\tau (\theta _{*}, \nu, d)}\) and \(\mathrm {M}_{Y}(u) = \frac {\tau (\theta _{*} e^{u}, \nu, d)}{\tau (\theta _{*}, \nu, d)},\)
respectively, where \(\tau (\theta _{*}, \nu, d) = \sum _{y=0}^{d} {d \choose y}^{\nu } \theta _{*}^{y}\) for \(\theta _{*} = \frac {p}{1-p}\). The CMB distribution is a member of the exponential family whose joint pmf of the random sample y={y_{1},…,y_{N}} is \(f(\boldsymbol {y} \mid p, \nu) = (d!)^{\nu N} \theta _{*}^{S_{*1}} (1-p)^{dN} e^{-\nu S_{*2}} \chi ^{-N}(p, \nu, d),\)
where \(S_{*1} = \sum _{i=1}^{N}y_{i}\) and \(S_{*2} = \sum _{i=1}^{N} \log [y_{i}! (d-y_{i})!]\) are the joint sufficient statistics for p and ν. Further, its existence as a member of the exponential family implies that a conjugate prior family exists of the form \(h(\theta _{*}, \nu) = \theta _{*}^{a-1} e^{-\nu b} \omega ^{-c}(\theta _{*}, \nu) \psi (a,b,c),\)
where \(\omega (\theta _{*}, \nu) = \sum _{y=0}^{d} \theta _{*}^{y}/[y!(d-y)!]^{\nu }\) and \(\psi ^{-1}(a,b,c) = \int _{0}^{\infty } \int _{0}^{\infty } \theta _{*}^{a-1} e^{-\nu b} \omega ^{-c}(\theta _{*}, \nu)\,d\theta _{*}\, d\nu < \infty \) (Kadane 2016).
Sellers et al. (2017) further introduce a generalized Conway-Maxwell-Binomial (gCMB) distribution whose pmf is \(P(Z = z) \propto {d \choose z}^{\nu } p^{z} (1-p)^{d-z} \left (\sum _{a_{1}+\cdots +a_{n_{1}}=z} {z \choose a_{1}, \cdots, a_{n_{1}}}^{\nu }\right) \left (\sum _{b_{1}+\cdots +b_{n_{2}}=d-z} {{d-z} \choose b_{1}, \cdots, b_{n_{2}}}^{\nu }\right), \; z = 0, 1, \ldots, d,\)
for a random variable Z with parameters (p,ν,d,n_{1},n_{2}). As with the conditional probability of a CMP random variable given the sum of it and another independent CMP random variable sharing the same dispersion parameter, a special case of a gCMB distribution can be derived as the conditional distribution of X_{∗1} given the sum X_{∗1}+X_{∗2}=d for independent sCMP random variables X_{∗i}∼sCMP(λ_{i},ν,n_{i}), i=1,2; the resulting distribution is analogously a gCMB\(\left (\frac {\lambda _{1}}{\lambda _{1}+\lambda _{2}},\nu, d, n_{1}, n_{2}\right)\) distribution. The gCMB distribution contains several special cases, including the CMB(d,p,ν) distribution (for n_{1}=n_{2}=1); the Binomial(d,p) distribution (when n_{1}=n_{2}=1 and ν=1); and, for λ_{1}=λ_{2}=λ, the hypergeometric distribution when ν→∞ and the negative hypergeometric distribution when ν=0 and λ<1.
First-order sCMP time series models
This section highlights two first-order models for discrete time series data that have a sCMP marginal distribution, namely the first-order sCMP autoregressive (SCMPAR(1)) model, and a first-order sCMP moving average (SCMPMA(1)) model with the same marginal distribution structure.
First-order sCMP autoregressive (SCMPAR(1)) model
Sellers et al. (2020) introduce a first-order sCMP autoregressive (SCMPAR(1)) model to describe count data correlated in time that express over- or underdispersion. Based on the sCMP and gCMB distributions, respectively (as described in the “Motivating distributions” section with more detail available in Sellers et al. (2017)), we use the sCMP distribution to model the marginals of the first-order integer-valued autoregressive (INAR(1)) process as \(X_{t} = C_{t}(X_{t-1}) + \epsilon _{t},\)
where ε_{t}∼sCMP(λ,ν,n_{2}), and {C_{t}(∙): t=1,2,…} is a sequence of independent gCMB\(\left (\frac {1}{2}, \nu, \bullet, n_{1}, n_{2}\right)\) operators, independent of {ε_{t}}. This flexible INAR(1) model contains the first-order Poisson autoregressive (PAR(1)) model as described in several references (Al-Osh and Alzaid 1987; McKenzie 1988; Weiss 2008), and the first-order binomial autoregressive model of Al-Osh and Alzaid (1991) as special cases. It likewise contains an INAR(1) model that allows for negative binomial marginals with a thinning operator whose pmf is negative hypergeometric.
The SCMPAR(1) model is yet another special case of the infinitely divisible convolution-closed class of first-order autoregressive (AR(1)) models described in Joe (1996), and satisfies the Markov property; the associated transition probability is provided in Sellers et al. (2020).
The SCMPAR(1) model defines an ergodic Markov chain; thus X_{t} has a unique stationary sCMP(λ,ν,n_{1}+n_{2}) distribution. The joint pgf associated with the SCMPAR(1) model is \(\Phi _{X_{t}, X_{t+1}}(u, l) = \left (\frac {\zeta (\lambda u l, \nu)}{\zeta (\lambda, \nu)}\right)^{n_{1}} \left (\frac {\zeta (\lambda u, \nu)}{\zeta (\lambda, \nu)}\right)^{n_{2}} \left (\frac {\zeta (\lambda l, \nu)}{\zeta (\lambda, \nu)}\right)^{n_{2}},\)
where the pgf is symmetric in u and l, and hence the joint distribution of X_{t+1} and X_{t} is time reversible. The regression form for the SCMPAR(1) process can be determined, and the general autocorrelation function for the process {X_{t}} is \(\rho _{r} = \text {Corr}(X_{t}, X_{tr}) = \left (\frac {n_{1}}{n_{1}+n_{2}}\right)^{r}\) for r=0,1,2,…. Parameter estimation can be conducted via conditional maximum likelihood with statistical computation tools (e.g. in R); see Sellers et al. (2020) for details.
Introducing the sCMPMA(1) model
Motivated by the SCMPAR(1) model of Sellers et al. (2020), we introduce a first-order sum-of-CMPs moving average (SCMPMA(1)) process X_{t} by \(X_{t} = C^{*}_{t}\left (\epsilon ^{*}_{t-1}\right) + \epsilon ^{*}_{t},\)
where \(\{\epsilon ^{*}_{t}\}\) is a sequence of iid sCMP(λ,ν,m_{1}+m_{2}) random variables and \(\{C^{*}_{t}(\bullet)\}\) is a sequence of independent gCMB(1/2,ν,∙,m_{1},m_{2}) operators independent of \(\{\epsilon ^{*}_{t}\}\). By definition, X_{t} is a stationary process with the sCMP(λ,ν,2m_{1}+m_{2}) distribution, and X_{t+r} and X_{t} are independent for r>1. While this model can analogously be viewed as a special case of the infinitely divisible convolution-closed class of discrete MA models (Joe 1996), unlike the SCMPAR(1) process, the SCMPMA(1) process is not Markovian.
The autocorrelation between X_{t} and X_{t+1} is \(\rho _{1} = \frac {\text {Cov}(X_{t}, X_{t+1})}{\text {Var}(X_{t})} = \frac {\text {Cov}\left (\epsilon ^{*}_{t}, C^{*}_{t+1}(\epsilon ^{*}_{t})\right)}{\text {Var}(X_{t})},\)
where \(C^{*}_{t+1}(\epsilon ^{*}_{t}) = \sum _{i=1}^{m_{1}} Y_{i}\) and \(\epsilon ^{*}_{t} = \sum _{i=1}^{m_{1}+m_{2}} Y_{i}\), respectively, are sCMP(λ,ν,m_{1}) and sCMP(λ,ν,m_{1}+m_{2}) random variables; i.e. each sCMP random variable can be viewed as a respective sum of iid CMP(λ,ν) random variables, Y_{i}. Thus, \(\text {Cov}\left (\epsilon ^{*}_{t}, C^{*}_{t+1}(\epsilon ^{*}_{t})\right) = \text {Cov}\left (\sum _{i=1}^{m_{1}+m_{2}} Y_{i}, \sum _{i=1}^{m_{1}} Y_{i}\right) = m_{1}\text {Var}(Y),\)
where, without loss of generality, we let Y denote any of the iid Y_{i} random variables. Meanwhile, because {X_{t}} is a sCMP(λ,ν,2m_{1}+m_{2}) distributed stationary process, we can likewise represent \(\text {Var}(X_{t}) = \text {Var}\left (\sum _{i=1}^{2m_{1}+m_{2}} Y_{i}\right) = \sum _{i=1}^{2m_{1}+m_{2}} \text {Var}(Y_{i}) = (2m_{1} + m_{2})\text {Var}(Y)\) for all t. We therefore find that \(\rho _{1} = \frac {m_{1}\text {Var}(Y)}{(2m_{1}+m_{2})\text {Var}(Y)} = \frac {m_{1}}{2m_{1}+m_{2}}.\)
Because m_{1},m_{2}≥1, the one-step range of possible correlation values is 0≤ρ_{1}≤0.5. In particular, for m_{1}=m_{2}, we have the special case where ρ_{1}=1/3. Meanwhile, ρ_{k}=0 for all k>1 because, by definition of the SCMPMA(1) model assumptions, there is no dependence structure between X_{t} and X_{t+r} for r>1.
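The SCMPMA(1) process can be simulated directly by carrying latent CMP components forward: each innovation ε*_t is generated as m_1+m_2 iid CMP(λ,ν) draws, and the gCMB operator retains the partial sum of the first m_1 of them (matching the representation C*_{t+1}(ε*_t) = Σ_{i=1}^{m_1} Y_i above). The sketch below (standard library only; hypothetical function names) checks the empirical lag-one autocorrelation against m_1/(2m_1+m_2):

```python
import math
import random

def cmp_sampler(lam, nu, support=60, seed=7):
    """Inverse-cdf sampler for CMP(lam, nu) on a truncated support."""
    logw = [s * math.log(lam) - nu * math.lgamma(s + 1) for s in range(support)]
    z = sum(math.exp(v) for v in logw)
    cdf, acc = [], 0.0
    for v in logw:
        acc += math.exp(v) / z
        cdf.append(acc)
    rng = random.Random(seed)
    def draw():
        u = rng.random()
        return next((s for s, c in enumerate(cdf) if c >= u), support - 1)
    return draw

def simulate_scmpma1(lam, nu, m1, m2, n):
    """Simulate X_t by retaining the first m1 latent CMP components of the
    previous innovation and adding a fresh sCMP(lam, nu, m1+m2) innovation."""
    draw = cmp_sampler(lam, nu)
    ys = [[draw() for _ in range(m1 + m2)] for _ in range(n + 1)]
    return [sum(ys[t - 1][:m1]) + sum(ys[t]) for t in range(1, n + 1)]

x = simulate_scmpma1(lam=0.5, nu=2.0, m1=1, m2=1, n=30000)
mean = sum(x) / len(x)
var = sum((v - mean) ** 2 for v in x) / len(x)
cov = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(len(x) - 1)) / (len(x) - 1)
# theory: rho_1 = m1 / (2*m1 + m2) = 1/3; with nu = 2 the data are underdispersed
print(round(cov / var, 2))
```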
Recall from the “The Conway-Maxwell-Poisson distribution and its generalization” section that \(\Phi _{G}(w) = \left (\frac {\zeta (\lambda w, \nu)}{\zeta (\lambda, \nu)} \right)^{\pi }\) is the pgf for a sCMP(λ,ν,π) distributed random variable, (say) G. Using this knowledge along with Eq. (21), the joint pgf can be derived as \(\Phi _{X_{t}, X_{t+1}}(u, l) = \left (\frac {\zeta (\lambda u l, \nu)}{\zeta (\lambda, \nu)}\right)^{m_{1}} \left (\frac {\zeta (\lambda u, \nu)}{\zeta (\lambda, \nu)}\right)^{m_{1}+m_{2}} \left (\frac {\zeta (\lambda l, \nu)}{\zeta (\lambda, \nu)}\right)^{m_{1}+m_{2}},\)
where Eq. (23) is equivalent to Eq. (20) (i.e. the SCMPMA(1) process is comparable to the SCMPAR(1) process) when m_{1}=n_{1}=n_{2}−m_{2}. Given this comparison, we can easily determine the conditional mean E(X_{t+1}∣X_{t}=x) and conditional variance Var(X_{t+1}∣X_{t}=x). Eq. (23) further demonstrates that the SCMPMA(1) model is time-reversible.
Parameter estimation via maximum likelihood (ML) is a difficult task with INMA models given the complex form of the underlying distributions. Even a conditional least squares approach does not appear to be feasible “because of the thinning operators, unless randomization is used” (Brännäs and Hall 2001). We therefore instead consider the following ad hoc procedure for parameter estimation. Given a data set with an observed correlation ρ_{1}, we first propose values for \(m_{1}, m_{2} \in \mathbb {N}\) that satisfy the constraint \(\rho _{1} \approx \frac {m_{1}}{2m_{1} + m_{2}}\). Given m_{1} and m_{2}, and recognizing that X_{t} is stationary with a sCMP(λ,ν,2m_{1}+m_{2}) distribution, we proceed with ML estimation to determine \(\hat {\lambda }\) and \(\hat {\nu }\) as described in Zhu et al. (2017) for conducting sCMP(λ,ν,s=2m_{1}+m_{2}) parameter estimation with regard to a CMP process over an interval of length s≥1. The corresponding variation for \(\hat {\lambda }\) and \(\hat {\nu }\) can be quantified via the Fisher information matrix or nonparametric bootstrapping. While the sampling distribution for \(\hat {\lambda }\) is approximately symmetric, the sampling distribution for \(\hat {\nu }\) is considerably right-skewed; hence, analysts are advised to quantify estimator variation via nonparametric bootstrapping. While this is a means to an end, it only determines an appropriate distributional form for the data; it does not fully address the nature of the time series.
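The first step of this ad hoc procedure (choosing small m_1, m_2 consistent with the observed lag-one autocorrelation) can be sketched as follows; `candidate_orders` is a hypothetical helper for illustration, not part of the authors' code:

```python
def candidate_orders(rho_hat, max_m=4, tol=0.05):
    """Return (m1, m2) pairs whose theoretical lag-one correlation
    m1 / (2*m1 + m2) lies within `tol` of the observed rho_hat,
    sorted from closest to farthest."""
    pairs = ((m1, m2) for m1 in range(1, max_m + 1) for m2 in range(1, max_m + 1))
    close = [p for p in pairs if abs(p[0] / (2 * p[0] + p[1]) - rho_hat) <= tol]
    return sorted(close, key=lambda p: abs(p[0] / (2 * p[0] + p[1]) - rho_hat))

# e.g. for an observed lag-one autocorrelation of 0.292
print(candidate_orders(0.292))
```

Given a shortlist of (m_1, m_2) pairs, the remaining parameters λ and ν would then be fit by ML under the stationary sCMP(λ,ν,2m_1+m_2) distribution, as described above.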
Data examples
To illustrate the flexibility of our INMA model, we consider various data simulations and a real data example. The respective details and associated commentary follow.
Simulated data examples
Table 1 reports the estimated mean, variance, and autocorrelation that result from various simulations of SCMPMA(1) data given parameters (λ,ν,m_{1},m_{2}). In all examples, we let λ=0.5, m_{1},m_{2}∈{1,2}, and ν∈{0,0.5,1,2,35}, where ν=0 captures the case of extreme overdispersion, ν=1 denotes equidispersion, and ν=35 sufficiently illustrates computationally the case of utmost underdispersion where ν→∞.
For all examples, we find that the associated mean and variance compare with each other as expected, i.e. the variance is greater than the mean when ν<1 (i.e. the data are overdispersed), the variance and mean are approximately equal when ν=1 (i.e. equidispersion holds), and the variance is less than the mean (i.e. the data are underdispersed) when ν>1. In particular, we can easily verify that the three special case models perform as expected. For the Poisson cases (ν=1), we expect the mean and variance to both equal (2m_{1}+m_{2})λ, while the binomial cases (i.e. ν→∞ and \(p=\frac {\lambda }{\lambda +1}\)) produce a mean equal to \((2m_{1} +m_{2})\frac {\lambda }{\lambda +1}\) and variance equaling \((2m_{1}+m_{2}) \frac {\lambda }{\lambda +1} \left (1\frac {\lambda }{\lambda +1}\right)\), and the negative binomial cases (ν=0 with p=1−λ) have a mean of \(\frac {(2m_{1} +m_{2})\lambda }{1\lambda }\) and variance equaling \(\frac {(2m_{1} +m_{2})\lambda }{(1\lambda)^{2}}\). In fact, even with the ν→∞ case approximated by letting ν=35, we still obtain reasonable estimates for the mean and variance for all of the associated cases of m_{1} and m_{2}.
For each {m_{1},m_{2}} pair, the mean and variance both decrease as ν increases while, for all of the considered examples, we obtain estimated correlation values \(\hat {\rho }\) that approximately equal the true correlation, ρ. In particular, for those cases where m_{1}=m_{2}, we obtain \(\hat {\rho } \approx 1/3\) as expected (see Eq. (22)).
Real data example: IP address counts
Weiss (2007) considers a modified dataset regarding the number of unique IP addresses accessing the University of Würzburg Department of Statistics webpages in 240 two-minute intervals. Collected on November 29, 2005 (from 10:00:00 to 18:00:00), these data have an associated mean and variance equaling 1.286 and 1.205, respectively. Weiss (2007) considers a PAR(1) model, noting that “the empirical partial autocorrelation function indicates that a first order [autoregressive] model may be an appropriate choice” with \(\hat {\rho }_{1}=0.292\); Sellers et al. (2020), following suit, consider a SCMPAR(1) model as a flexible alternative to the PAR(1) model. The ACF and PACF plots of these data, however, do not clearly distinguish between considering a first-order autoregressive or a moving average model; see Fig. 1a-b. Further, recognizing that the data express apparent under- to equidispersion, we therefore consider the SCMPMA(1) as an illustrative model for analysis.
We perform ML estimation assuming various combinations for (m_{1},m_{2}) (i.e. {(1,1), (1,2), (2,2)}) as these values contain the observed correlation, \(0.25 = \frac {1}{4} < \hat {\rho }_{1} < \frac {1}{3} \approx 0.33\). Table 2 contains the resulting parameter estimates for λ and ν, along with the respective Akaike Information Criterion (AIC). While the SCMPMA(1) model with m_{1}=m_{2}=2 has the lowest AIC among the models considered, all of these models produce approximately equal AIC values (i.e. 695.2), where increasing m_{1} and m_{2} values associate with decreasing \(\hat {\lambda }\) and increasing \(\hat {\nu }\) estimates. This makes sense because the resulting estimates rely solely on the assumed underlying sCMP(λ,ν,2m_{1}+m_{2}) distributional form for the data.
The dispersion estimates in Table 2 are all greater than 1, thus implying a perceived level of data underdispersion. These results naturally stem from the reported mean of the data (1.286) being greater than its corresponding variance (1.205). Their associated 95% confidence intervals (determined via nonparametric bootstrapping; also supplied in Table 2), however, are sufficiently large such that they contain ν=1. This suggests that the apparent data underdispersion is not statistically significant, thus instead suggesting that the data can be analyzed via the Al-Osh and Alzaid (1988) PMA(1) model. It is further striking to see that the respective 95% confidence intervals associated with the dispersion parameter widen with the size of the underlying sCMP(λ,ν,2m_{1}+m_{2}) model. This is an artifact of the (s)CMP distribution, namely that the sampling distribution of \(\hat {\nu }\) is right-skewed (as discussed in Zhu et al. (2017)). This approach confirms interest in the PMA(1) model, where Eqs. (1)-(2) imply that the associated estimated parameters are \(\hat {\gamma } \approx 0.4124\) and \(\hat {\eta } \approx 0.9105\). Thus, we benefit from the SCMPMA(1) as a tool for parsimonious model determination.
Discussion
This work utilizes the sCMP distribution of Sellers et al. (2017) to develop a SCMPMA(1) model that serves as a flexible moving average time series model for discrete data where data dispersion is present. The SCMPMA(1) model captures the PMA(1), as well as versions of a negative binomial and binomial MA(1) structure, respectively, as special cases. This model, along with the flexible SCMPAR(1), can further be used to derive broader autoregressive moving average (ARMA) and autoregressive integrated moving average (ARIMA) models based on the sCMP distribution.
The SCMPMA(1) shares many properties with the analogous SCMPAR(1) model of Sellers et al. (2020). The presented models rely on predefining discrete values (i.e. m_{1},m_{2} for the SCMPMA(1)) for parameter estimation. As done in Sellers et al. (2017) and Sellers and Young (2019), we utilize a profile likelihood approach where, given m_{1} and m_{2}, we estimate the remaining model coefficients and then identify the collection of parameter estimates that produces the largest likelihood, thus identifying these parameter estimates as the MLEs. While this profile likelihood approach is acceptable as demonstrated in other applications, directly estimating m_{1},m_{2} along with the other SCMPMA(1) model estimates would likewise prove beneficial, as would redefining the model to allow for real-valued estimators for m_{1} and m_{2}. These generalizations and estimation approaches can be explored in future work.
Simulated data examples illustrate that the SCMPMA(1) model can obtain unbiased estimates, and the model demonstrates potential for accurate forecasts given data containing any measure of data dispersion. The real data illustration, however, highlights the complexities that come with parameter estimation. While we nonetheless present a means towards achieving this goal, this approach does not perform especially strongly with regard to prediction and forecasting. It nonetheless serves as a starting point for parameter estimation that we will continue to investigate in future work. Moreover, the flexibility of the SCMPMA(1) aids in determining a parsimonious model form as appropriate.
Availability of data and materials
Simulated data can vary given the generation process. Simulation code(s) can be supplied upon request. IP data set obtained from Dr. Christian Weiss of Helmut Schmidt University.
Abbreviations
 AR(1): First-order autoregressive
 ARIMA: Autoregressive integrated moving average
 ARMA: Autoregressive moving average
 CMB: Conway-Maxwell-Binomial
 CMP: Conway-Maxwell-Poisson
 gCMB: Generalized Conway-Maxwell-Binomial
 GPAR(1): First-order generalized Poisson autoregressive
 GPMA(1): First-order generalized Poisson moving average
 INAR(1): First-order integer-valued autoregressive
 INMA: Integer-valued moving average
 MA: Moving average
 mgf: Moment generating function
 PAR(1): First-order Poisson autoregressive
 pgf: Probability generating function
 PMA: Poisson moving average
 PMA(1): First-order Poisson moving average
 QB: Quasi-binomial
 sCMP: Sum-of-Conway-Maxwell-Poisson
 SCMPAR(1): First-order sum-of-Conway-Maxwell-Poisson autoregressive
 SCMPMA(1): First-order sum-of-Conway-Maxwell-Poissons moving average
References
Al-Osh, M. A., Alzaid, A. A.: First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 8(3), 261–275 (1987).
Al-Osh, M. A., Alzaid, A. A.: Integer-valued moving average (INMA) process. Stat. Pap. 29(1), 281–300 (1988).
Al-Osh, M. A., Alzaid, A. A.: Binomial autoregressive moving average models. Commun. Stat. Stoch. Model. 7(2), 261–282 (1991).
Alzaid, A. A., Al-Osh, M. A.: Some autoregressive moving average processes with generalized Poisson marginal distributions. Ann. Inst. Stat. Math. 45(2), 223–232 (1993).
Borges, P., Rodrigues, J., Balakrishnan, N., Bazán, J.: A COM-Poisson type generalization of the binomial distribution and its properties and applications. Stat. Probab. Lett. 87, 158–166 (2014).
Brännäs, K., Hall, A.: Estimation in integer-valued moving average models. Appl. Stoch. Model. Bus. Ind. 17, 277–291 (2001).
Conway, R. W., Maxwell, W. L.: A queuing model with state dependent service rates. J. Ind. Eng. 12, 132–136 (1962).
Famoye, F.: Restricted generalized Poisson regression model. Commun. Stat. Theory Methods 22(5), 1335–1354 (1993).
Hilbe, J. M.: Modeling Count Data. Cambridge University Press, New York, NY (2014).
Joe, H.: Time series models with univariate margins in the convolution-closed infinitely divisible class. J. Appl. Probab. 33(3), 664–677 (1996).
Kadane, J. B.: Sums of possibly associated Bernoulli variables: The Conway-Maxwell-Binomial distribution. Bayesian Anal. 11(2), 403–420 (2016).
McKenzie, E.: ARMA models for dependent sequences of Poisson counts. Adv. Appl. Probab. 20(4), 822–835 (1988).
Sellers, K. F., Peng, S. J., Arab, A.: A flexible univariate autoregressive time-series model for dispersed count data. J. Time Ser. Anal. 41(3), 436–453 (2020). https://doi.org/10.1111/jtsa.12516.
Sellers, K. F., Swift, A. W., Weems, K. S.: A flexible distribution class for count data. J. Stat. Distrib. Appl. 4(22), 1–21 (2017). https://doi.org/10.1186/s40488-017-0077-0.
Sellers, K. F., Young, D. S.: Zero-inflated sum of Conway-Maxwell-Poissons (ZISCMP) regression. J. Stat. Comput. Simul. 89(9), 1649–1673 (2019).
Shmueli, G., Minka, T. P., Kadane, J. B., Borle, S., Boatwright, P.: A useful distribution for fitting discrete data: revival of the Conway-Maxwell-Poisson distribution. Appl. Stat. 54, 127–142 (2005).
Weiss, C. H.: Controlling correlated processes of Poisson counts. Qual. Reliab. Eng. Int. 23(6), 741–754 (2007).
Weiss, C. H.: Thinning operations for modeling time series of counts – a survey. Adv. Stat. Anal. 92, 319–341 (2008).
Weiss, C. H.: An Introduction to Discrete-Valued Time Series. John Wiley & Sons, Inc., Hoboken, NJ (2018).
Weiss, C. H.: Stationary count time series models. Wiley Interdiscip. Rev. Comput. Stat. 13(1), 1502 (2021). https://doi.org/10.1002/wics.1502.
Zhu, L., Sellers, K. F., Morris, D. S., Shmueli, G.: Bridging the gap: A generalized stochastic process for count data. Am. Stat. 71(1), 71–80 (2017).
Acknowledgements
This paper is released to inform interested parties of research and to encourage discussion. The views expressed are those of the authors and not necessarily those of the U.S. Census Bureau. SM and FC thank the Georgetown Undergraduate Research Opportunities Program (GUROP) for their support. All authors thank Dr. Christian Weiss for use of the IP dataset, and the reviewers for their feedback and comments.
Funding
SM was funded in part by the GUROP.
Author information
Contributions
KFS developed the research idea. All authors contributed towards the literature review, theoretical developments, and statistical computing. The author(s) read and approved the final manuscript.
Ethics declarations
Competing interests
No authors have competing interests relating to this work.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sellers, K.F., Arab, A., Melville, S. et al. A flexible univariate moving average time-series model for dispersed count data. J Stat Distrib App 8, 1 (2021). https://doi.org/10.1186/s40488-021-00115-2
Keywords
 Overdispersion
 Underdispersion
 Conway-Maxwell-Poisson (COM-Poisson or CMP)
 Sum-of-Conway-Maxwell-Poisson (sCMP)