A flexible multivariate model for high-dimensional correlated count data

We propose a flexible multivariate stochastic model for over-dispersed count data. Our methodology is built upon mixed Poisson random vectors (Y1, . . . , Yd), where the {Yi} are conditionally independent Poisson random variables. The stochastic rates of the {Yi} are multivariate distributions with arbitrary non-negative margins linked by a copula function. We present basic properties of these mixed Poisson multivariate distributions and provide several examples. A particular case with geometric and negative binomial marginal distributions is studied in detail. We illustrate an application of our model by conducting a high-dimensional simulation motivated by RNA-sequencing data.

(1) Further, for continuous distributions with marginal probability density functions (PDFs) f i (x) = F i (x), the copula function C is unique, and the joint PDF of the {X i } is given by where the function c(u 1 , . . . , u d ) is the PDF corresponding to the copula C(u 1 , . . . , u d ).
However, for discrete distributions, the copula is no longer unique and there is no analogue of (2) for calculating the relevant probabilities. Using this concept, one can define a random vector Y = (Y 1 , . . . , Y d ) in R d with arbitrary marginal CDFs F i viz.
where U = (U 1 , . . . , U d ) is a random vector with standard uniform margins and the CDF given by with a particular copula C. While one can use any of the multitude of different copula functions in this construction, an approach based on Gaussian copula, known as NORTA (NORmal To Anything, see, e.g., Chen (2001); Song and Hsiao (1993)), is especially popular due to its flexibility, particularly in the case of discrete distributions (see, e.g., Barbiero and Ferrari (2017); Madsen and Birkes (2013); Nikoloulopoulos (2013)). While our approach involves copulas as well, the latter connect with continuous multivariate distributions rather than discrete, which avoids the issues with non-uniqueness of the copula function. Additionally, compared with the direct approach (3), in our scheme the computation of relevant probabilities is straightforward. Our methodology is based on mixtures of Poisson distributions, which is a common way of obtaining discrete analogs of continuous distributions on nonnegative reals with a particular stochastic interpretation. Indeed, discrete univariate mixed Poisson distributions have been proven useful stochastic models in many scientific fields (see, e.g., Karlis and Xekalaki (2005), where one can find a comprehensive review of these distributions with over 30 particular examples). This construction can be described through a randomly stopped Poisson process. More precisely, let {N(t), t ∈ R + } be a homogeneous Poisson process with rate λ > 0, so that the marginal distribution of N(t) is Poisson with parameter (mean) λt. Then, for any random variable T with cumulative distribution function (CDF) F T , supported on R + , the quantity Y = N(T) is an integer-valued random variable, with distribution determined viz. standard conditioning argument as follows: Many standard probability distributions on N 0 arise from this scheme. In particular, if T has a standard gamma distribution with shape parameter r > 0, given by the PDF then Y = N(T) will have a NB distribution NB(r, p) with the probability mass function (PMF) where p = 1/(1+λ) (see Section 3.2 in the Appendix). As the NB model is quite important across many applications and can be extended to more general stochastic processes (see, e.g., Kozubowski and Podgórski (2009)), it shall serve as a basic example of our approach. An extension of this scheme to the multivariate case can be accomplished in two different ways, leading to mixed multivariate Poisson distributions of Kind (or Type) I and II in the terminology of Karlis and Xekalaki (2005). The former arises viz.
where the {N i (·)} are Poisson processes with rates λ i and T is, as before, a random variable on R + , independent of the {N i }. While in general the marginal distributions of (N 1 (t), . . . , N d (t)) can be correlated multivariate Poisson (see, Johnson et al. (1997)), we shall assume that the processes {N i } are mutually independent. In this case, the joint probability generating function (PGF) of the {Y i } in (8) is of the form where φ T is the Laplace transform (LT) of T, while the relevant probabilities can be conveniently expressed as where the {g i } in (10) are the marginal PMFs of the {Y i }. As shown in the Appendix, the function h is of the form where In case of gamma distributed T, with shape parameter r > 0 and unit scale, the functions v and h above can be evaluated explicitly (see the Appendix for details), and the above distribution is know in the literature as the multivariate negative multinomial distribution (see Chapter 36 of Johnson et al. (1997) and references therein). Since the marginal distributions in this case are NB, the distribution has also been termed multivariate NB. In cases where the function v(·, ·) is not available explicitly, it can be easily evaluated numerically, viz. Monte Carlo simulations.
Our main focus will be a more flexible family of mixed Poisson distributions of Kind II, where each Poisson process {N i (t), t ∈ R + } is randomly stopped at a different random variable T i , leading to (2021) 8:6 Page 4 of 21 where the random vector T = (T 1 , . . . , T d ) follows a multivariate distribution on R d + . A particular special case of this construction with the {T i } having correlated log-normal distributions was recently proposed in Madsen and Dalthorp (2007), where this model was referred to as lognormal-Poisson hierarchy (L-P model). While that particular model does not allow explicit forms for marginal PMFs, it proved useful for applications. Our generalization, which we shall refer to as T-Poisson hierarchy, will allow T in (13) to have any continuous distribution on R d + , with margins not necessarily belonging to the same parametric family. As will be seen in the sequel, the joint PMF of this more general model can still be written as in (10), with an appropriate function h. In particular, we shall work with families of distributions of T described by marginal CDFs {F i } and a copula function C(u 1 , . . . , u d ). In this set-up, the PMF of Y, which is still of the form (10), can be expressed as where the g i are the marginal PMFs of {Y i }, the function c(u 1 , . . . , u d ) is the PDF corresponding to the copula C(u 1 , . . . , u d ), and the {X i } are independent random variables with certain distributions dependent on the {y i }. This expression, which is an analogue of (2) for discrete multivariate distributions defined through our scheme, provides a convenient way for computing probabilities of these multivariate distributions. This computational aspect of our construction compares favorably with a cumbersome formula for the PMF (see, e.g., Proposition 1.1 in Nikoloulopoulos and Karlis (2009)) of the competing method defined viz. (3).
In what follows, we explore these ideas to provide a flexible multivariate modeling framework for dependent count data -emphasizing computationally convenient expressions and scaleable algorithms for high-dimensional applications. We begin by showing how multivariate count data can be generated as mixtures of Poisson distributions by developing sequences of independent Poisson processes randomly stopped at an underlying continuous real-valued random variable T (a T−Poisson hierarchy). Then we show how our T−Poisson hierarchy scheme gives rise to computationally convenient joint probability mass functions (PMFs) and how particular choices of parameters/distributions can be used to construct well-known models such as the multivariate negative binomial. Next, we describe a scaleable simulation algorithm using our construction and copula theory. Two examples are provided: a basic example to produce a multivariate geometric distribution and an elaborate high-dimensional simulation study, aiming to model and simulate RNA-sequencing data. We note that our modeling framework and computationally-convenient formulas may facilitate novel data analysis strategies, but we do not take up that task in this current study. We conclude with an Appendix containing selected proofs of assertions made throughout.

Multivariate mixtures of Poisson distributions
Our goal is to produce a random vector Y = (Y 1 , . . . , Y d ) with correlated mixed Poisson components. To this end, we start with a sequence of independent Poisson processes {N i (t), t ∈ R + }, i = 1, . . . , d, where the rate of the process N i (t) is λ i . Next, we let (2021) 8:6 Page 5 of 21 In the terminology of Karlis and Xekalaki (2005), this is a special case of multivariate mixed Poisson distributions of Type II. Assuming that the {N i (t)} are independent of T, by standard conditioning arguments (see Lemma 7 in the Appendix) we obtain While in some cases one can obtain explicit expressions for the above joint probabilities, in general these have to be calculated numerically. The calculations can be facilitated by certain representations of these probabilities, discussed in the Appendix (see Lemmas 7 and 8 in the Appendix). This procedure is quite general, and leads to a multitude of multivariate discrete distributions. Flexible models allowing for marginal distributions of different types can be obtained by a popular approach with copulas. Assume that T has a continuous distribution on R d + with marginal PDFs f i and CDFs F i driven by a particular copula C(u 1 , . . . , u d ), so that the joint CDF of the {T i } is given by Then according to (2), the joint PDF f T is of the form where the function c(u 1 , . . . , u d ) is the PDF corresponding to the copula CDF C(u 1 , . . . , u d ). When we substitute (17) into (16), we get Using the results presented in the Appendix (see Lemma 7 in the Appendix), one can show that the marginal PMFs of the {Y i } are given by where f λ i T i (·) is the PDF of λ i T i and W has a standard gamma distribution with shape parameter y + 1. With this notation, we can write (18) in the form where the quantity g(t|y) in the above integral is the joint PDF of multivariate distribution with independent margins, with Thus, the integral in (20) can be expressed as where X = (X 1 , . . . , X d ) has a multivariate distribution with independent components, governed by the PDF specified by (21) -(22). This leads to the following result.
Proposition 1 In the above setting, the joint probabilities (18) admit the representation where the marginal probabilities are given by (19) and the PDF of X = (X 1 , . . . , X d ) is given by (21) -(22).
Let us note that the joint moments of the Y 1 , . . . Y d exist whenever their counterparts of T 1 , . . . , T d are finite, in which case they can be evaluated by standard conditioning arguments. In particular, the mean and the covariance matrix of Y are related to their counterparts connected with T in a simple way, specified by Lemma 9 in the Appendix. It so that the correlation coefficient of Y i and Y j (if it exists) is related to that of T i and T j as follows: where

Remark 1 While in general the correlation can be positive as well as negative and admits the same range as its counterpart for T i and T j , the range of possible correlations of Y i and Y j can be further restricted if the margins are fixed. The maximum and minimum correlation can be deduced from (25) -(26) and the range of correlation corresponding to the joint distribution of T i and T j . The later is provided by the minimum and the maximum correlations, corresponding to the lower and the upper Fréchet copulas,
The upper bound for the correlation is obtained when the distribution of (T i , T j ) is driven by the upper Fréchet copula C U in (27) where U is standard uniform and the F i (·), F j (·) are the CDFs of T i , T j , respectively. Similarly, the lower bound for the correlation is obtained when the distribution of (T i , T j ) is driven by the lower Fréchet copula C L in (27) . While these correlation bounds are usually not available explicitly, they can be easily obtained by Monte-Carlo approximations viz. simulation from these (degenerate) probability distributions or by other standard approximate methods (see, e.g., Demitras and Hedeker (2011), and references therein). (15) and the distribution of the corresponding T = (T 1 , T 2 ) is driven by one of the copulas in (27), then the distribution of T is not absolutely continuous and the above derivations leading to the PDF of Y need a modification. It can be shown that in this case the marginal distributions of the Y i are still given by (19) while the joint PMF of (Y 1 , Y 2 ) is also as in (20) with d = 2, but with the integral term replaced with 1 0 g 1 (u|y 1 )g 2 (u|y 2 )du and

Remark 2 We note that when a bivariate random vector
under the upper and the lower Fréchet copula cases, respectively, where the g i (·|y) in (28) are PDFs on (0, 1) given by Again, while the integrals in (28) are rarely available explicitly, they can be easily approximated by Monte-Carlo simulations in order to compute the joint PMF of Y = (Y 1 , Y 2 ) . These two "extreme" distributional cases can also be used to derive the full range of the values for the correlation of Y = (Y 1 , Y 2 ) when the marginal distributions (19) are fixed, if needed.

Mixed Poisson distributions with NB margins
We now consider the case where the mixed Poisson marginal distributions of Y are NB, so that the marginal distributions of T are gamma (see Lemma 1 in Appendix). Thus, we shall assume that the coordinates of the random vectors T have univariate standard gamma distributions with shape parameters r i ∈ R + , i = 1, . . . , d. There have been numerous multivariate gamma distributions developed over the years, and we could use any of them here. However, we follow a general approach based on copulas, discussed above. Thus, we assume that the dependence structure of T is governed by some copula function C(u 1 , . . . , u d ), which admits the PDF c(u 1 , . . . , u d ). In this case, the f i in (18) are given by (6) where r = r i and the F i are the corresponding CDFs. Here, the marginal PMFs of the {Y i } in (19) are given by where the NB probabilities are given by . Further, the PDF of X in Proposition 1 is still given by (21), where the marginal PDFs g i (·|y i ) now admit explicit expressions We recognize that these are gamma PDFs. Thus, in this special case of multivariate mixed Poisson distributions of Type II with NB marginal distributions, the random vector X in the representation (14) has multivariate gamma distribution as well, but with independent margins. This fact is summarized in the result below.
Corollary 1 Let Y have a mixed Poisson distribution defined viz. (15), where the {N i (·)} are independent Poisson processes with respective rates λ i and T has multivariate gamma Page 8 of 21 distribution with standard gamma margins with shape parameters r i and CDFs F i , governed by a copula PDF c(u). Then, the marginal PMFs of Y are given by (30) with p i = 1/(1 + λ i ) ∈ (0, 1) and its joint PMF is given by (14), where X = (X 1 , . . . , X d ) has multivariate gamma distribution with independent gamma marginal distributions of the {X i } with PDFs given by (31). (14) does not admit an explicit form in terms of the y 1 , . . . , y d , one can approximate its value viz. straightforward Monte-Carlo approximation involving random variate generation of independent gamma random variates {X i }.

Remark 3 If the expectation in
Let us note that since the {T i } have standard gamma distributions with shape parameters r i , we have E(T i ) = Var(T i ) = r i , and an application of Lemma 9 leads to the following result.

Remark 4 The correlation of Y i and Y j is still given by (25), where this time
since in (26) we have E(T i ) = Var(T i ). Let us note that while in principle the quantities c i,j can assume any value in (0, 1) when we choose appropriate λ i and λ j , they are fixed for particular marginal NB distributions, since in this model the NB probabilities are given by p i = 1/(1 + λ i ). In the terms of the latter, we have These quantities, along with the full range of correlations for ρ T i ,T j in (25), can be used to obtain the upper and lower bounds for possible correlations of Y i and Y j in this model. We note that the possible range of ρ T i ,T j depends on the shape parameters r i and r j . If the {T i } are exponential (so that r i = r j = 1), then the upper limit of their correlation can be shown to be 1. However, the full range for the correlation of T i and T j is usually a subset of [ −1, 1], which can be approximated by Monte-Carlo simulations (see Remarks 1-2) or other approximate methods (see, e.g., Demitras and Hedeker (2011)).

Simulation
One particular way of defining this model, convenient for simulations, is by using the Gaussian copula to generate T. This is a very popular methodology due to its flexibility and ease of simulating from a required multivariate normal distribution. The Gaussian copula is one that corresponds to a multivariate normal distribution with standard normal marginal distributions and covariance matrix R. Since the marginals are standard normal, this R is also the correlation matrix. If F R is the CDF of such multivariate normal distribution, then the corresponding Gaussian copula C R is defined through where (·) is the standard normal CDF. Note that the copula C R is simply the CDF of the random vector ( (X 1 ), . . . , (X d )) , where (X 1 , . . . , X d ) ∼ N d (0, R). If the distribution is continuous (so that R is non-singular), the copula C R admits the PDF c R , given by This c R will then be used in equations (20), (23), and (14). Simulation of multivariate gamma T with margins F i based on this copula is quite simple, and involves the following steps: Remark 5 This strategy of using Gaussian copula to generate multivariate distributions is quite popular indeed, and it became known in the literature as the NORTA (NORmal To Anything) method (see, e.g., Chen (2001); Song and Hsiao (1993)). This methodology has been recently used to generate multivariate discrete distributions, see, e.g., Barbiero and Ferrari (2017), Birkes (2013), or Nikoloulopoulos (2013) and references therein. The standard approach discussed in these papers proceeds by simulating the vector U from the Gaussian copula following the steps (i) -(ii) above and then transforming the coordinates of U directly viz. the inverse CDFs of the components of the target random vector Y = (Y 1 , . . . , Y d ) , which can be described as Here, the G i are the CDFs of the Y i . If the distributions of the Y i are discrete (such as NB), the inverse CDF is defined in the standard way as The difference of our approach and the one discussed in the literature as described above is in the final step, regardless of the particular copula c that is used. In the standard approach one first simulates random U from c and then proceeds viz. processes in the spirit of the NB process studied by Kozubowski and Podgórski (2009)

. On the other hand, its disadvantage is the fact that not all discrete marginal distributions can be obtained, only those that are mixed Poisson to begin with.
Remark 6 Let us note that the mixed Poisson approach to generate multivariate distributions was used in Madsen and Dalthorp (2007), where Y was obtained viz. (15) with standard Poisson processes and where T = e X with X being multivariate normal with mean μ = (μ 1 , . . . , μ d ) and covariance matrix =[ σ i,j ]. Since in this case the marginals of T have log-normal distributions, the authors referred to this construction as lognormal-Poisson hierarchy. This can be seen as a special case our scheme, where we have λ i = e μ i and the marginal CDFs of T i of the form F i (t) = (log t i /σ ii ). The copula PDF of the {T i } is the Gaussian copula (32) where R is the correlation matrix corresponding to .
An important aspect of this problem is how to set the parameters of the underlying copula function so that the distribution of Y has given characteristics, such as the means and the covariances (and correlations). In the case where a Gaussian copula is used, this has to do with determining the correlation matrix R. This problem arises in the general scheme (i)-(iii) as well -and has been discussed in the literature (see, e.g., Barbiero and Ferrari (2017); Xiao (2017); Xiao and Zhou (2019)). Generally, there is no simple relation between R and the correlation matrix of T in (i)-(iii). However, other measure of associationssuch as Kendall's τ or Spearman's ρ do transfer directly and may be preferred to use in our set-up. These issues will be the subject of further research.

Examples
We provide two examples. The first example describes the T-Poisson hierarchy approach to construct a multivariate geometric distribution. Second, we demonstrate how the T-Poisson hierarchy can be used to conduct a high-dimensional (d = 1026) simulation study inspired by RNA-sequencing data -a challenging computational task.

Multivariate geometric distributions
Suppose that the random vector T in (15) has marginal standard exponential distributions, so that the marginal CDFs of the {T i } are of the form In this case, the {Y i } have geometric distributions with parameters p i = 1/(1 + λ i ), so that One can then obtain a multitude of multivariate distributions with geometric margins by selecting various copulas for the underlying distributions of T. As an example, consider the case with Farlie-Gumbel-Morgenstern (FGM) copula driven by a parameter θ ∈[ −1, 1], given by In this case, the random vector X = (X 1 , X 2 ) in Corollary 1 has independent gamma margins (31) with shape parameters y i + 1 and scale parameters 1 + λ i , i = 1, 2. Using this fact, coupled with (33), one can evaluate the expectation in (14), leading to . (37) In view of Corollary 1, this leads to the following expression for the joint probabilities of bivariate geometric distribution defined by our scheme viz. FGM copula: We shall denote this distribution by GEO(p 1 , p 2 , θ). When θ = 0, the {Y i } are independent geometric variables with parameters p i ∈ (0, 1), i = 1, 2. Otherwise, Y 1 , Y 2 are correlated, with as can be verified by routine, albeit tedious, algebra. In turn, the correlation of Y 1 , Y 2 becomes and can generally take any value in the range (−1/4, 1/4).

Simulating RNA-seq data
This section describes how to simulate data using a T-Poisson hierarchy, aiming to replicate the structure of high-dimensional dependent count data. In fact, simulating RNA-sequencing (RNA-seq) data is a one of the primary motivating applications of the proposed methodology, seeking scaleable Monte Carlo methods for realistic multivariate simulation (for example, see Schissler et al. (2018)). The RNA-seq data generating process involves counting how often a particular messenger RNA (mRNA) is expressed in a biological sample. Since this is a counting process with no upper bound, many modeling approaches use discrete random variables with infinite support. Often the counts exhibit over-dispersion and so the negative binomial arises as a sensible model for the expression levels (gene counts). Moreover, the counts are correlated (co-expressed) and cannot be assumed to behave independently. RNA-seq platforms quantify the entire transcriptome in one experimental run, resulting in high-dimensional data. In humans, this results in count data corresponding to over 20,000 genes (coding genomic regions) or even over 77,000 isoforms when alternating spliced mRNA are counted. This suggests simulating high-dimensional multivariate NB with heterogeneous marginals would be useful tool in the development and evaluation of RNA-seq analytics.
In an illustration of our proposed methodology applied to real data, we seek to simulate RNA-sequencing data by producing simulated random vectors generated from the Type II Page 12 of 21 T-Poisson framework (as in Eq. (13)). Our goal is to replicate the structure of a breast cancer data set (BRCA: breast cancer invasive carcinoma data set from The Cancer Genome Atlas). For simplicity, we begin by filtering to retain the top 5% highest expressing genes of the 20,501 gene measurements from N = 1212 patients' tumor samples, resulting in d = 1026 genes. All these genes exhibit over-dispersion and, so, we proceed to estimate the NB parameters (r i , p i ), i = 1, . . . , d, to determine the target marginal PMFs g i (y i ) (via method of moments). Notably, thep i s are small -ranging in [ 3.934×10 −6 , 1.217×10 −2 ]. To complete the simulation algorithm inputs, we estimate the Pearson correlation matrix R Y and set that as the target correlation.
With the simulation targets specified, we proceed to simulate B = 10, 000 random vectors Y = (Y 1 , . . . , Y d ) with target Pearson correlation R Y and marginal PMFs g i (y i ) using a T-Poisson hierarchy of Kind II. Specifically, we first employ the direct Gaussian copula approach to generate B random vectors following a standard multivariate Gamma distribution T with shape parameters r i equal to the target NB sizes and Pearson correlation matrix R T . Care must be taken when setting the specifying R (refer to Eq. (32)) -we employ Eq. (25) to compute the scaling factors c i,j and adjust the underlying correlations to ultimately match the target R Y . Notably, of the 525,825 pairwise correlations from the 1026 genes, no scale factor was less than 0.9907, indicating the model can produce essentially the entire range of possible correlations. Here we are satisfied with approximate matching of the specified Gamma correlation and set R = R T in our Gaussian copula scheme (R indicating the specified multivariate Gaussian correlation matrix). Finally, we generate the desired random vector Y i = N i (T i ) by simulating Poisson counts with expected value μ i = λ i × T i , for i = 1, . . . , d, (with λ i = (1−p i ) p i ) and repeat B = 10, 000 times. Fig. 1 The T-Poisson strategy produces simulated random vectors from a multivariate negative binomial (NB) that replicate the estimated structure from an RNA-seq data set. The dashed red lines indicated equality between estimated parameters (vertical axes; derived from the simulated data) and the specified target parameters (horizontal axes)

Knudson et al. Journal of Statistical Distributions and Applications
(2021) 8:6 Page 13 of 21 Figure 1 shows the results of our simulation by comparing the specified target parameter (horizontal axes) with the corresponding quantities estimated from the simulated data (vertical axes). The evaluation shows that the simulated counts approximately match the target parameters and exhibit the full range of estimated correlation from the data. Utilizing 15 CPU threads in a MacBook Pro carrying a 2.4 GHz 8-Core Intel Core i9 processor, the simulation completed in less than 30 seconds.
which we recognize as the NB probability from (7) with p = (1 + λ) −1 . The result follows when we set λ = (1 − p)/p in the above analysis.

Mixed multivariate Poisson distributions of type I
Here we provide basic distributional facts about mixed multivariate Poisson distributions of Type I, which are the distributions of Y = (Y 1 , . . . , Y d ) = (N 1 (T), . . . , N d (T)) , where the {N i (·)} are independent Poisson processes with rates λ i and T is a random variable on R + , independent of the {N i }.
Lemma 2 In the above setting, the PGF of Y is given by where φ T is the LT of T. Proof By using standard conditioning argument, we have Since given T = t the variables {Y i } are independent and Poisson distributed with means {λ i t}, respectively, we have When we substitute the above into (41) we conclude that the PGF of Y is indeed of the form stated above.
Remark 7 Note that in the dimensional case d = 1, we recover the well-known formula for the PGF of Y = N(T), where λ > 0 is the rate of the Poisson process {N(t), t ∈ R + }. If we further assume that T is standard gamma distributed with shape parameter r > 0, so that where P i = λ i and Q = 1 + d i=1 P i . This is a general form of multivariate negative multinomial distribution (see Chapter 36 of Johnson et al. (1997)). Since the PGF of the marginal distributions of Y i in this setting is of the form (43) with p = (1 + λ i ) −1 , all marginal distributions are NB. Due to this property, discrete multivariate distributions with the above PGFs have been termed multivariate NB distributions (for more details, see Johnson et al. (1997)).
Remark 8 Let us note that changing a scaling factor of the variable T in this model has the same effect as adjusting the rate parameters connected with the Poisson processes {N i (·)}. Namely, it follows from Lemma 2 that if we letT = cT in the above setting, then we have the following equality in distribution: where the {Ñ i (·)} are independent Poisson processes with rates cλ i , respectively. Thus, without loss of generality, we may assume that the scale parameter of the variable T in this model is set to unity. Knudson et al. Journal of Statistical Distributions and Applications (2021)  Lemma 3 In the above setting, the PMF of Y is given by and the function h is given by Proof Since given T = t the variables {Y i } are independent and Poisson distributed with means {λ i t}, respectively, by using standard conditioning argument, followed by some algebra, we have Similarly, the marginal PMFs are given by By combining (45) and (46), we obtain the result.

Remark 9
Note that the joint PMF of Y can be also written as which is a convenient expression for approximating these probabilities by Monte Carlo simulations if the function v T (·, ·) is not available explicitly and the random variable T is straightforward to simulate. We also note that whenever the marginal PMFs of Y i are explicit, then so is the function v T (·, ·), which is clear from Lemma 3. For example, if T is standard gamma with shape parameter r, then we have v T (y, λ) = (r + y) (r) where Y has a NB distribution with parameters r and p = 1/(1 + λ).
Next, we present an alternative expression for the joint probabilities P(Y = y), which provides a convenient formula for their computation whenever the variable T is difficult to simulate but its PDF is easy to compute. This representation involves a multinomial random vector N = (N 1 , . . . , N d ) with parameters n and p = (p 1 , . . . , p d ) , denoted (2021) 8:6 Page 16 of 21 by MUL(n, p), where n ∈ N represents the number of trials, the {p i } represent event probabilities that sum up to one, and P(N = y) = n! y 1 ! · · · y d ! p Lemma 4 In the above setting, the PMF of Y is given by where λ = d i=1 λ i , N ∼ MUL(n, p) with n = d i=1 y i and p i = λ i /λ, the quantity f λT is the PDF of λT, and W has standard gamma distribution with shape parameter n + 1.
Proof Proceeding as in the proof of Lemma 3, we obtain Since the integrand is the product of f T (t) and the density of gamma random variable X with shape parameter n + 1 and scale λ, we have where W = λX has standard gamma distribution with shape parameter n + 1 (and scale 1). To conclude the result, observe that the expression (50) coincides with the multinomial probablity (48) with Remark 10 Note that in one dimensional case where d = 1 the multinomial probability in (49) reduces to 1, and we obtain where Y d = N(T), {N(t), t ∈ R + } is a Poisson process with rate λ, the quantity f λT is the PDF of λT, the variable W has standard gamma distribution with shape parameter y + 1, and T is independent of the Poisson process.
Finally, we present well-known results concerning the mean and the covariance structure of mixed multivariate Poisson distributions of Type I, which are easily derived through standard conditioning arguments. Generally, whenever the mean of T exists then so does the mean of each Y i , and we have E(Y i ) = λ i E(T). Moreover, the variance of each Y i is finite whenever T has a finite second moment, in which case we have Var(Y i ) = λ i E(T) + λ 2 i Var(T). Thus, the variance of Y i is greater than the mean, and the distribution of Y i is over-dispersed. Finally, under the latter assumption, the covariance of Y i and Y j exists and equals Cov(Y i , Y j ) = λ i λ j Var(T). The result below summarizes these facts.

Lemma 5
In the above setting, the mean vector of Y exists whenever the mean of T is finite, in which case we have E(Y) = λE(T), where λ = (λ 1 , . . . , λ d ) . Moreover, if T has a finite second moment, then the covariance matrix of Y is well defined and is given by Remark 11 By the above result, if it exists, the correlation coefficient of Y i and Y j is given by .
The correlation is always positive, and can generally fall anywhere within the boundaries of zero and one.

Mixed multivariate Poisson distributions of type II
Here where φ T is the LT of T, I(λ) is a d × d diagonal matrix with the {λ i } on the main diagonal, and 1 is a d-dimensional column vector of 1s.
Proof By using standard conditioning argument, we have Since given T = t the variables {Y i } are independent and Poisson distributed with means {λ i t i }, respectively, we have e −λ i t j (1−s j ) = e −t I(λ)(1−s) .
When we substitute the above into (53) we conclude that the PGF of Y is as stated in the lemma.
Remark 12 Note that the expression (52) is a generalization of (42) to the multivariate case of mixed Poisson. Additionally, observe that if the components of T coincide, that is T i = T for i = 1, . . . d, we have φ T (t) = E e −t T = E e −(t 1 +···+t d )T = φ T (t 1 + · · · + t d ), and the PGF in (52) reduces to its counterpart provided in Lemma 2, as it should. Knudson et al. Journal of Statistical Distributions and Applications (2021) 8:6 Page 18 of 21 Remark 13 Let us note that changing scaling factors of the variables T i in this model has the same effect as adjusting the rate parameters connected with the Poisson processes {N i (·)}. Namely, it follows from Lemma 6 that if we letT i = c i T i in the above setting, then we have the following equality in distribution: where the {Ñ i (·)} are independent Poisson processes with rates c i λ i , respectively. Thus, without loss of generality, we may assume that the scale parameters of the variables T i in this model are set to unity.
Next, we provide a convenient formula for the PMF of multivariate mixed Poisson distributions of Type II, which is an extension of that given in Lemma 3. To state the result, we extend the definition of the function v T described by (12) , y = (y 1 , . . . , y d ) ∈ N d 0 .
Proof By using standard conditioning argument, we have where y = (y 1 , . . . , y d ) and t = (t 1 , . . . , t d ) . Further, by independence, we have Since the N i (t i ) are Poisson with parameters λ i t i , we have When we now substitute (58) -(59) into (57), then after some algebra we get