A note on inconsistent families of discrete multivariate distributions
 Sugata Ghosh^{1},
 Subhajit Dutta^{1} and
 Marc G. Genton^{2}
https://doi.org/10.1186/s40488-017-0061-8
© The Author(s) 2017
Received: 13 February 2017
Accepted: 31 May 2017
Published: 5 July 2017
Abstract
We construct a d-dimensional discrete multivariate distribution for which any proper subset of its components belongs to a specific family of distributions. However, the joint d-dimensional distribution fails to belong to that family; in other words, it is ‘inconsistent’ with the distribution of these subsets. We also address preservation of this ‘inconsistency’ property for the symmetric Binomial distribution, and for some discrete distributions arising from the multivariate discrete normal distribution.
Introduction
If a multivariate distribution is parametrically specified, often the lower-dimensional marginals follow the same distribution of an appropriate dimension. For example, all the lower-dimensional distributions of a multivariate Gaussian distribution are Gaussian. The converse is however not necessarily true. For example, Dutta and Genton (2014) gave a construction of a non-Gaussian multivariate distribution with all lower-dimensional Gaussians and considered some generalizations of this for a certain class of elliptical and skew-elliptical distributions. In the discrete case, all the lower-dimensional marginals of a bivariate Binomial distribution (Bairamov and Gultekin 2010) follow the Binomial distribution. Conversely, given a set of marginal distributions, different dependence structures can give rise to different joint distributions. In a more general setting, Hoeffding (1940) and Fréchet (1951) independently obtained a characterization of the class of bivariate distributions with given univariate marginals. Characterization problems for discrete distributions using conditional distributions and their expectations have been investigated by several authors; see, e.g., Dahiya and Korwar (1977), Ruiz and Navarro (1995), and Nguyen et al. (1996). The paper by Conway (1979) also discussed some additional properties and derived appropriate relationships for such systems, while a method of constructing multivariate distributions with specified univariate marginals and a given correlation matrix was studied by Cuadras (1992).
The main aim of this paper is to study the converse question in the discrete multivariate setup. More specifically, we construct a discrete multivariate distribution (see, e.g., Johnson et al. (1997)), all of whose lower-dimensional marginals follow the same symmetric distribution, while the joint distribution does not conform to that pattern. We say that the joint distribution is ‘inconsistent’ with the distribution of its marginals. The main idea of this construction is based on a transformation using the sign function applied to a class of discrete symmetric random variables. The sign function plays a key role in yielding a distribution with ‘restricted support’.
The structure of the paper is as follows. In Section 2, we first motivate our construction for the bivariate setup with a simple case starting from a symmetric Binomial distribution. Generalization of this result to symmetric discrete distributions with a specific support in higher dimensions is explored in Section 3, where our main result is stated. We also address preservation of this ‘inconsistency’ property for the symmetric Binomial distribution in Section 4, a class of symmetric discrete distributions constructed from symmetric continuous distributions in Section 5, and the multivariate discrete skew-normal distribution in Section 6. Proofs of all the theorems are provided in the Appendix.
A motivating bivariate example
This now implies that \(P(X_{1}^{*}=1)=1/4\). Hence, the distribution of \(X_{1}^{*}\) is the same as that of X. Similarly, one can show that \(X_{2}^{*}\) is identically distributed with X.
This motivates us to construct a d-dimensional random vector such that the distribution of any proper subset of its components belongs to a certain family of distributions, but the joint distribution, with all the d components taken together, fails to conform to that family of distributions.
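The bivariate claim can be checked by exact enumeration. The sketch below is only illustrative and rests on two assumptions that are not spelled out in the surviving text of this section: that X takes values in {−1, 0, 1} with probabilities (1/4, 1/2, 1/4), and that the sign convention is the one suggested by facts (F1) and (F2) in the Appendix (a zero neighbour contributes the factor −1). The names `q`, `s`, `joint_star`, `marg1` and `marg2` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Assumed pmf of X: symmetric on {-1, 0, 1} with P(X = 1) = 1/4.
q = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}


def s(x):
    """Assumed sign convention: +1 if x > 0, otherwise -1 (including x = 0)."""
    return 1 if x > 0 else -1


# Joint pmf of (X1*, X2*) = (X1 * s(X2), X2 * s(X1)) for independent X1, X2 ~ q.
joint_star = defaultdict(Fraction)
for x1, x2 in product(q, repeat=2):
    joint_star[(x1 * s(x2), x2 * s(x1))] += q[x1] * q[x2]

# Marginal pmfs of X1* and X2*.
marg1 = defaultdict(Fraction)
marg2 = defaultdict(Fraction)
for (y1, y2), p in joint_star.items():
    marg1[y1] += p
    marg2[y2] += p

# Each marginal can be compared with q; the joint mass at a point with
# negative product can be compared with the product of the marginals.
print(dict(marg1) == q, dict(marg2) == q)
print(joint_star.get((1, -1), Fraction(0)), q[1] * q[-1])
```

Under these assumptions both marginals coincide with the pmf of X, while the joint pmf puts no mass on points whose coordinates have opposite signs, so the pair cannot be independent.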
General symmetric discrete multivariate distributions
where \(0 \leq m_{j} \leq 1\) and \(m_{-j} = m_{j}\) for all \(j \in \mathbb {S}\), with \(\sum _{j \in \mathbb {S}} m_{j} = 1\).
Now, \(\prod _{i=1}^{d} X_{i}^{*} = X_{1}S_{2,1} \cdot X_{2}S_{3,2} \cdots X_{d}S_{1,d} = X_{1}S_{1,d} \cdot X_{2}S_{2,1} \cdots X_{d}S_{d,d-1}\). Fix i∈{2,…,d}. Again, X _{ i }<0⇒S _{ i,i−1}=−1, hence X _{ i } S _{ i,i−1}>0, while X _{ i }>0⇒S _{ i,i−1}=1, hence X _{ i } S _{ i,i−1}>0. Also, X _{ i }=0⇒X _{ i } S _{ i,i−1}=0. To summarize, X _{ i } S _{ i,i−1}≥0 for any i=2,…,d. Similarly, X _{1} S _{1,d }≥0. Hence, we obtain \(\prod _{i=1}^{d} X_{i}^{*} \geq 0\).
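The non-negativity of the product can be confirmed numerically on a small support. In the sketch below the function `S` is an assumed reading of the modified sign function, reconstructed from facts (F1) and (F2) in the Appendix: it returns the sign of its first argument when that is nonzero, −1 when the first argument is zero but the second is not, and 0 when both are zero. The names `S`, `star`, `support` and `all_nonneg` are hypothetical.

```python
from itertools import product
from math import prod


def S(xi, xj):
    """Assumed modified sign S_{i,j} as a function of (X_i, X_j)."""
    if xi > 0:
        return 1
    if xi < 0:
        return -1
    return -1 if xj != 0 else 0


def star(x):
    """Map (X_1, ..., X_d) to (X_1^*, ..., X_d^*).

    X_i^* = X_i * S_{i+1,i} for i < d and X_d^* = X_d * S_{1,d};
    the index (i + 1) % d handles the wrap-around for the last coordinate.
    """
    d = len(x)
    return tuple(x[i] * S(x[(i + 1) % d], x[i]) for i in range(d))


# Exhaustive check over a small symmetric support for d = 2, 3, 4.
support = (-2, -1, 0, 1, 2)
all_nonneg = all(
    prod(star(x)) >= 0
    for d in (2, 3, 4)
    for x in product(support, repeat=d)
)
print(all_nonneg)
```

The check is exhaustive only for the chosen support and dimensions; it illustrates the argument rather than replacing it.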
Theorems 1 and 2 below state the joint distribution of \(\left (X_{1}^{*}, \ldots, X_{d}^{*}\right)^{T}\) as well as the distribution of the lowerdimensional vectors.
Theorem 1
Theorem 2
Any subvector \(\left (\!X^{*}_{k_{1}}, \ldots, X^{*}_{k_{d'}}\!\right)^{T}\) of \(\left (X_{1}^{*}, \ldots, X_{d}^{*}\right)^{T}\) with d ^{′}<d is componentwise independent, and the joint distribution of \(\left (X^{*}_{k_{1}}, \ldots, X^{*}_{k_{d'}}\right)^{T}\) is the same as that of \(\left (X_{k_{1}}, \ldots, X_{k_{d'}}\right)^{T}\).
In particular, this result holds true if \(\mathbb {S}=\mathbb {Z}=\{\ldots,-1,0,1,\ldots \}\) or any proper subset of \(\mathbb {Z}\) which is symmetric about 0 (i.e., \(x \in \mathbb {S} \Leftrightarrow -x \in \mathbb {S}\)).
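For a small case, the statement about subvectors can be verified exactly. The sketch below takes d = 3 and an arbitrary symmetric pmf on {−1, 0, 1}; both choices are illustrative assumptions. It uses the shortcut `s(0) = -1` for the sign of a zero neighbour, which is an assumed reading of the Appendix and gives the same product as the two-argument modified sign because the case where both coordinates vanish yields zero either way. All function and variable names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Illustrative symmetric pmf (an assumption, not taken from the paper).
q = {-1: Fraction(3, 10), 0: Fraction(2, 5), 1: Fraction(3, 10)}


def s(x):
    """Assumed sign convention: +1 if x > 0, otherwise -1 (including x = 0)."""
    return 1 if x > 0 else -1


def star(x):
    """Cyclic transform: each coordinate is multiplied by the sign of the next one."""
    d = len(x)
    return tuple(x[i] * s(x[(i + 1) % d]) for i in range(d))


def joint_star(d):
    """Exact joint pmf of (X_1^*, ..., X_d^*) for i.i.d. X_i ~ q."""
    pmf = defaultdict(Fraction)
    for x in product(q, repeat=d):
        pmf[star(x)] += prod(q[v] for v in x)
    return pmf


def subvector_is_iid(pmf, idx):
    """Check that the marginal on coordinates idx equals the product of q's."""
    marg = defaultdict(Fraction)
    for x, p in pmf.items():
        marg[tuple(x[i] for i in idx)] += p
    return all(
        marg[y] == prod(q[v] for v in y) for y in product(q, repeat=len(idx))
    )


d = 3
pmf_star = joint_star(d)
subvectors_match = all(
    subvector_is_iid(pmf_star, idx)
    for k in range(1, d)
    for idx in combinations(range(d), k)
)
joint_matches = subvector_is_iid(pmf_star, tuple(range(d)))
print(subvectors_match, joint_matches)
```

In this small case every proper subvector has independent components with pmf `q`, whereas the full three-dimensional vector does not.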
The symmetric binomial distribution
We started our investigation in Section 2 using a discrete distribution symmetric about 0 with support on the set \({\mathcal {N}}_{2}\), and extended it to the case of a general discrete distribution symmetric about 0, supported on a finite or countably infinite subset \(\mathbb {S}\) of \(\mathbb {R}\). It is natural to ask whether such ‘inconsistency’ results continue to hold for symmetric distributions whose point of symmetry is not necessarily 0.
Suppose that U follows a symmetric distribution with a finite or countably infinite support and point of symmetry u _{0}. Define X=U−u _{0}. Then, X follows a symmetric distribution about 0 with a finite or countably infinite support (as mentioned in (2)).
Assume U∼Binomial(n,1/2), and consider u _{0} to be n/2. We now have versions of Theorems 1 and 2 for symmetric Binomial distributions.
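The centering step is easy to make concrete. The following sketch computes the exact pmf of X = U − n/2 for U ∼ Binomial(n, 1/2) and checks its symmetry about 0; the helper name `centered_binomial_pmf` and the choices n = 2 and n = 3 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def centered_binomial_pmf(n):
    """Exact pmf of X = U - n/2 with U ~ Binomial(n, 1/2), keyed by Fractions."""
    return {
        Fraction(2 * k - n, 2): Fraction(comb(n, k), 2 ** n)
        for k in range(n + 1)
    }


q2 = centered_binomial_pmf(2)  # support {-1, 0, 1}
q3 = centered_binomial_pmf(3)  # support {-3/2, -1/2, 1/2, 3/2}

# Symmetry about 0: q(x) = q(-x) for every support point.
symmetric = all(q[x] == q[-x] for q in (q2, q3) for x in q)
print(symmetric, sorted(q3))
```

For odd n the centered support consists of half-integers and does not contain 0, which shows that the support need not be a subset of the integers.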
Discrete symmetric distributions
Assume F to be symmetric about the point 0, i.e., F(x)+F(−x)=1 for any \(x \in \mathbb {R}\). Now, note that q _{ d }(0)=q _{ d }(−1) and the point of symmetry of [U] is clearly \(-\frac {1}{2}\). Define \(X=[U]+\frac {1}{2}\). Then, X is a discrete variate with support on a countably infinite set \(\mathbb {S}\) which is symmetric about the point 0, say, X∼q _{ dS }(·). In particular, if we take F(x)=Φ(x), where Φ(·) is the df of the standard normal distribution, then we obtain the discrete normal distribution (dN) (Roy 2003). The pmf of X (say, q _{ dN }(·)) simplifies to Φ(x+1/2)−Φ(x−1/2) with \(x \in \mathbb {S}\). Using this discretization idea, Chakraborty and Chakravarty (2016) have proposed a discrete logistic distribution starting from the continuous two-parameter logistic distribution. Now, we can construct versions of Theorems 1 and 2 for such symmetric discrete probability distributions.
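As a numerical illustration of the discrete normal pmf, the sketch below evaluates Φ(x+1/2)−Φ(x−1/2) on the half-integers, which is the support implied by X = [U] + 1/2. The function names `Phi` and `q_dN`, and the truncation of the support to ±9.5, are assumptions made only for this example.

```python
from math import erf, sqrt


def Phi(z):
    """Standard normal distribution function, written via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def q_dN(x):
    """pmf of the discrete normal variate at a half-integer point x."""
    return Phi(x + 0.5) - Phi(x - 0.5)


# Truncated support: -9.5, -8.5, ..., 8.5, 9.5 (the omitted tail mass is negligible).
support = [k + 0.5 for k in range(-10, 10)]
total = sum(q_dN(x) for x in support)
max_asymmetry = max(abs(q_dN(x) - q_dN(-x)) for x in support)
print(total, max_asymmetry)
```

The total mass on the truncated support is numerically indistinguishable from 1, and the pmf is symmetric about 0 up to floating-point rounding.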
The multivariate discrete normal and related distributions
The joint pmf of X _{ d } is the product \(\prod \limits _{i=1}^{d} q_{dN}(x_{i})\).
Theorem 3
The distributions of all the (d−1)-dimensional random vectors \(\mathbf {Y}_{d-1,d}^{*}\) belong to the same family of distributions, but that of \(\mathbf {Y}_{d}^{*}\) does not.

(Discrete Scale Mixture) Consider A=0 (a degenerate random variable) and B=W, where W is a nonnegative, discrete random variable independent of \(\mathbf {X}_{d}^{*}\).

(Discrete Skew-Normal) Consider a discrete normal variate X _{ d+1}∼q _{ dN }(·) independent of \(\mathbf {X}_{d}^{*}\). Define A=δX _{ d+1} and \(B=\sqrt {1-\delta ^{2}}\) (a degenerate random variable) with −1<δ<1 (also see pp. 128–129 of Azzalini (2014)).
Using Theorem 3, we now obtain this ‘inconsistency’ result for a class of discrete skew-normal distributions. One can also extend this result to a class of skew-elliptical distributions starting from the discrete scale mixture. More general results using the idea of modulation of symmetric discrete distributions, proposed recently by Azzalini and Regoli (2014), remain open.
Appendix: Proofs and Mathematical Details
Proof 1 (Proof of Theorem 1) We break the proof into two parts, namely, Case I and Case II.
Case I: \(\prod _{i=1}^{d} X_{i}^{*}=0\). This fact now implies that at least one of the \(X_{i}^{*}\)’s is zero. Without loss of generality, let \(X_{1}^{*}=0\) and consider the event \(\left (X_{1}^{*}=0, X_{2}^{*}=x_{2},\ldots, X_{d}^{*}=x_{d}\right)\). Under the assumption that \(X_{1}^{*}=0\), we now establish the following facts:
(F1) \(X_{i}^{*}=0 \Rightarrow X_{i}=0\) for any i=1,…,d. To show this, we note that if \(X_{1}^{*}=0\), then X _{1} S _{2,1}=0. Now, either X _{1}=0 or S _{2,1}=0. If X _{1}=0 then we are done. If S _{2,1}=0, then X _{2}=X _{1}=0, in particular X _{1}=0. More generally, \(X_{i}^{*}=0 \Rightarrow X_{i}=0\) for any i=1,…,d.
(F2) If \(X_{1}^{*}=0\), then X _{ d }=−x _{ d }. To show this, suppose that \(X_{d}^{*} \neq 0\), then we obtain X _{ d } S _{1,d }≠0 and in particular X _{ d }≠0. We have assumed that \(X_{1}^{*}=0\), now using (F1) we know that X _{1}=0. Combining the facts that X _{1}=0 and X _{ d }≠0, we have by definition S _{1,d }=−1. Again, \(X_{d}^{*}=x_{d} \Rightarrow X_{d}S_{1,d}=x_{d}\) and thus we obtain X _{ d }=−x _{ d }. On the other hand, suppose that \(X_{d}^{*} = 0\). In this case x _{ d }=0. By (F1), \(X_{d}^{*}=0 \Rightarrow X_{d}=0\). Thus we trivially have X _{ d }=−x _{ d } (both sides being equal to 0). Hence, in all the cases, we have X _{ d }=−x _{ d }.
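(F3) We have X _{ i }=x _{ i } S _{ i+1,i } for i=2,…,d−1. To show this, we start with any i∈{2,…,d−1}. Now \(X_{i}^{*}=x_{i}\), which implies X _{ i } S _{ i+1,i }=x _{ i }. First take x _{ i }≠0. Then X _{ i }≠0 and S _{ i+1,i }≠0. Since S _{ i+1,i }=±1, we have 1/S _{ i+1,i }=S _{ i+1,i }, and therefore X _{ i }=x _{ i }/S _{ i+1,i }=x _{ i } S _{ i+1,i }.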
On the other hand, suppose x _{ i }=0. Using (F1), we obtain X _{ i }=0 which trivially satisfies X _{ i }=x _{ i } S _{ i+1,i } (as both sides equal zero). This is true for all i=2,…,d−1.
(F4) We have P(X _{ i }=x _{ i } S _{ i+1,i })=P(X _{ i }=x _{ i }). To show this, we first consider the case when x _{ i }≠0. Then S _{ i+1,i }=±1. Hence P(X _{ i }=x _{ i } S _{ i+1,i })=P(X _{ i }=±x _{ i }), which is P(X _{ i }=x _{ i }) by virtue of symmetry. The claim follows trivially when x _{ i }=0.
This completes the first part.
Thus, in this case we always have S(x)=±1. Also, note that here the modified sign function S _{ i,i−1} simplifies to S(X _{ i }), and hence \(X_{i}^{*}=X_{i}S(X_{i+1})\), i=1,…,d−1 and \(X_{d}^{*}=X_{d}S(X_{1})\).
We now consider enumerating the joint probability \(P\left (X_{1}^{*}=x_{1}, \ldots, X_{d}^{*}=x_{d}\right)\) with the restriction that \(\prod _{i=1}^{d} x_{i} > 0\).
Lemma 1

(i) \(\left (X_{1}=x_{1}, X_{2}=x_{2}S(x_{2}), X_{3}=x_{3}S(x_{2}x_{3}), \ldots, X_{d}=x_{d}S\left (\prod _{j=2}^{d} x_{j}\right)\right)\text {, or}\)

(ii) \(\left (X_{1}=-x_{1}, X_{2}=-x_{2}S(x_{2}), X_{3}=-x_{3}S(x_{2}x_{3}), \ldots, X_{d}=-x_{d}S\left (\prod _{j=2}^{d} x_{j}\right)\right).\)
Proof 2 (Proof of Lemma 1)
Note that \(|X_{k}|=|X_{k}^{*}|\) for all k=1,…,d, since the sign factor equals ±1 here. Also, in this case, the value of S(X _{2}) can either be 1, or −1. We now consider two separate cases.
Then \(X_{2}^{*}=X_{2}S(X_{3}) \Rightarrow X_{2}= \frac {X_{2}^{*}}{S(X_{3})}=X_{2}^{*}S(X_{3})=x_{2}S(x_{2})\). Here, we use the fact that S(u)=1/S(u).
Therefore, \(X_{4}^{*}=X_{4}S(X_{5}) \Rightarrow X_{4}= \frac {X_{4}^{*}}{S(X_{5})}=\frac {x_{4}}{S(x_{2}x_{3}x_{4})}=x_{4} S(x_{2}x_{3}x_{4})\). Proceeding in a similar fashion, we obtain X _{ d }=x _{ d } S(x _{2}⋯x _{ d }).
Case (ii): The proof follows by taking S(X _{2})=−1, and repeating the line of arguments stated above for Case (i).
This completes the proof of the second part.
This completes the proof. □
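Lemma 1 lends itself to a brute-force check on a small support without zeros. The sketch below assumes that event (ii) is the componentwise negation of event (i), as Case (ii) of the proof with S(X _{2})=−1 suggests, and that the transform is \(X_{i}^{*}=X_{i}S(X_{i+1})\) with \(X_{d}^{*}=X_{d}S(X_{1})\). The names `candidate_i`, `preimages` and `lemma_holds`, and the choices d = 4 and support {−2, −1, 1, 2}, are illustrative.

```python
from collections import defaultdict
from itertools import product
from math import prod


def S(u):
    """Ordinary sign function on nonzero values."""
    return 1 if u > 0 else -1


def star(x):
    """X_i^* = X_i * S(X_{i+1}) for i < d, and X_d^* = X_d * S(X_1)."""
    d = len(x)
    return tuple(x[i] * S(x[(i + 1) % d]) for i in range(d))


def candidate_i(x):
    """Point (i): X_1 = x_1 and X_k = x_k * S(x_2 * ... * x_k) for k >= 2."""
    return (x[0],) + tuple(
        x[k] * S(prod(x[1:k + 1])) for k in range(1, len(x))
    )


support = (-2, -1, 1, 2)
d = 4

# Group all points of the support by their image under the transform.
preimages = defaultdict(set)
for y in product(support, repeat=d):
    preimages[star(y)].add(y)

lemma_holds = True
for x in product(support, repeat=d):
    if prod(x) > 0:
        c = candidate_i(x)
        expected = {c, tuple(-v for v in c)}
    else:
        expected = set()  # negative products are never attained
    lemma_holds = lemma_holds and preimages.get(x, set()) == expected
print(lemma_holds)
```

Each target with positive product has exactly the two preimages named in the lemma, which is where the factor 2 in the joint pmf comes from when the underlying pmf is symmetric.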
Proof 3 (Proof of Theorem 2)
First we consider the univariate distributions, namely, the case d ^{′}=1, and compute the probability \(P(X_{t}^{*}=x_{t})\), denoted by q ^{∗}(x _{ t }) for a fixed t∈{1,…,d}. Now, suppose that x _{ t }=0. Note that \(X_{t}^{*}=X_{t}S_{t+1,t}=0 \Rightarrow X_{t}=0\) since S _{ t+1,t }=0 also requires X _{ t } to be zero. We trivially have the reverse, i.e., \(X_{t}=0 \Rightarrow X_{t}^{*}=0\). So, we have \(X_{t}^{*}=0 \iff X_{t}=0\) and hence \(P\left (X_{t}^{*}=0\right) = P\left (X_{t}=0\right)\). Thus, we obtain q ^{∗}(x _{ t })=q(x _{ t }) for any t=1,…,d.
Let x _{(−t)} = (x _{1},…,x _{ t−1},x _{ t+1},…,x _{ d })^{ T }, and q(x _{(−t)}) = P(X _{(−t)}=x _{(−t)}) for t = 1,…,d. We want to compute the probability \(P\left (\mathbf {X}_{(-t)}^{*}=\mathbf {x}_{(-t)}\right)\), and denote it by q ^{∗}(x _{(−t)}). Further, we denote the joint probability P(X _{(−t)}=x _{(−t)},X _{ t }=x _{ t }) by q(x _{(−t)},x _{ t }), which is nothing but P(X _{1}=x _{1},…,X _{ d }=x _{ d }), and the joint probability \(P\left (\mathbf {X}_{(-t)}^{*}=\mathbf {x}_{(-t)}, X_{t}^{*}=x_{t}\right)\) by q ^{∗}(x _{(−t)},x _{ t }), which is nothing but \(P\left (X_{1}^{*}=x_{1},\ldots,X_{d}^{*}=x_{d}\right)\). We now consider three separate cases.
Case I: \(\prod _{i \in I_{(-t)}}x_{i}=0\)
Case II: \(\prod _{i \in I_{(-t)}}x_{i} > 0\)
Consider three separate subcases
(i) If x _{ t }=0, then \(\prod _{i=1}^{d} x_{i}=0\), and q ^{∗}(x)=q(x)=q(x _{(−t)})q(0). So, q ^{∗}(x _{(−t)},0)=q(x _{(−t)})q(0).
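(ii) If x _{ t }>0, then \(\prod _{i=1}^{d} x_{i} > 0\). Thus, q ^{∗}(x)=2q(x)=2q(x _{(−t)})q(x _{ t }). Hence, \(q^{*}\left (\mathbf {x}_{(-t)}, x_{t}\right)=2 q\left (\mathbf {x}_{(-t)}\right) q(x_{t})\), and summing over the positive values of x _{ t } gives \(\sum _{x_{t}>0} q^{*}\left (\mathbf {x}_{(-t)}, x_{t}\right)=2 q\left (\mathbf {x}_{(-t)}\right) \sum _{x_{t}>0} q(x_{t})\).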
(iii) If x _{ t }<0, then \(\prod _{i=1}^{d} x_{i} < 0\), which cannot happen by the construction of \(X_{i}^{*}\)’s. Hence, q ^{∗}(x _{(−t)},x _{ t })=0.
Case III: \(\prod _{i \in I_{(-t)}}x_{i} < 0\)
Again, consider three separate subcases
(i) If x _{ t }=0, then \(\prod _{i=1}^{d} x_{i}=0\). Hence, q ^{∗}(x)=q(x)=q(x _{(−t)})q(0), i.e., q ^{∗}(x _{(−t)},0)=q(x _{(−t)})q(0).
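(ii) If x _{ t }<0, then \(\prod _{i=1}^{d} x_{i} > 0\). Hence, q ^{∗}(x)=2q(x)=2q(x _{(−t)})q(x _{ t }), i.e., \(q^{*}\left (\mathbf {x}_{(-t)}, x_{t}\right)=2 q\left (\mathbf {x}_{(-t)}\right) q(x_{t})\), and summing over the negative values of x _{ t } gives \(\sum _{x_{t}<0} q^{*}\left (\mathbf {x}_{(-t)}, x_{t}\right)=2 q\left (\mathbf {x}_{(-t)}\right) \sum _{x_{t}<0} q(x_{t})\).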
(iii) If x _{ t }>0, then \(\prod _{i=1}^{d} x_{i} < 0\). Hence, q ^{∗}(x _{(−t)},x _{ t })=0.
Hence, the joint distribution of \(\left (X_{k_{1}}^{*},\ldots,X_{k_{d'}}^{*}\right)^{T}\) is the same as that of \(\left (X_{k_{1}},\ldots,X_{k_{d'}}\right)^{T}\). □
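The marginalization in Cases I to III can be reproduced with exact arithmetic. The sketch below assumes the piecewise form of the joint pmf that the subcases use (q ^{∗}(x) equal to q(x), 2q(x) or 0 according to whether the product of the coordinates is zero, positive or negative) and sums out one coordinate at a time. The pmf `q` and the names `q_joint`, `q_star` and `sum_out` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import prod

# Illustrative symmetric pmf on {-2, ..., 2} (an assumption for this example).
q = {
    -2: Fraction(1, 10),
    -1: Fraction(1, 5),
    0: Fraction(2, 5),
    1: Fraction(1, 5),
    2: Fraction(1, 10),
}


def q_joint(x):
    """Joint pmf of independent components, each with pmf q."""
    return prod(q[v] for v in x)


def q_star(x):
    """Assumed joint pmf: q, 2q or 0 according to the sign of prod(x)."""
    p = prod(x)
    if p == 0:
        return q_joint(x)
    return 2 * q_joint(x) if p > 0 else Fraction(0)


def sum_out(x_rest, t):
    """Sum q_star over coordinate t, with the other coordinates fixed at x_rest."""
    return sum(q_star(x_rest[:t] + (v,) + x_rest[t:]) for v in q)


d = 3
marginals_ok = all(
    sum_out(x_rest, t) == q_joint(x_rest)
    for t in range(d)
    for x_rest in product(q, repeat=d - 1)
)
print(marginals_ok)
```

The identity relies on the symmetry of `q`, which gives 2P(X>0)+q(0)=1; with an asymmetric pmf the same sum would not reduce to the product of the marginals.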
Proof 4 (Proof of Theorem 3)
Let us denote the joint distribution of (A,B) by the pmf G(a,b) with \(a,b \in \mathbb {S}\). Recall that we have assumed A,B to be independent of \(\mathbf {X}_{d}^{*}\). So, the conditional distribution of \(\mathbf {X}_{d}^{*}\) given (A=a,B=b) is the same as its unconditional distribution. We will use this fact throughout the proof of this theorem.
Recall the expression of q ^{∗}(x) from Eq. (5).
Thus, the random vectors \({\mathbf {Y}_{(-t)}^{*}}\) for t=2,…,d−1 possess the same joint distribution. Similarly, one may argue that the random vectors \({\mathbf {Y}_{(-1)}^{*}}\) and \({\mathbf {Y}_{(-d)}^{*}}\) follow the same joint distribution as well. Further, we can argue that any subvector \(\left (Y_{k_{1}}^{*}, Y_{k_{2}}^{*},\ldots,Y_{k_{d'}}^{*}\right)^{T}\) of \(\left (Y_{1}^{*}, Y_{2}^{*},\ldots, Y_{d}^{*}\right)^{T}\), where d ^{′}<d, has the same joint distribution. □
Declarations
Acknowledgements
We thank the Reviewers and Editors for comments that improved the paper.
Funding
This research was supported by the King Abdullah University of Science and Technology (KAUST).
Availability of data and materials
Not applicable.
Authors’ contributions
SG, SD, MG contributed equally to the research. All authors read and approved the final manuscript.
Competing interests
No competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
 Alzaatreh, A, Lee, C, Famoye, F: On the discrete analogues of continuous distributions. Stat. Methodol. 9, 589–603 (2012).
 Azzalini, A: With the collaboration of A. Capitanio. The Skew-Normal and Related Families. Cambridge University Press, UK (2014).
 Azzalini, A, Regoli, G: Modulation of symmetry for discrete variables and some extensions. Stat 3, 56–67 (2014).
 Bairamov, I, Gultekin, OE: Discrete distributions connected with the bivariate Binomial. Hacettepe J. Math. Stat. 39, 109–120 (2010).
 Chakraborty, S, Chakravarty, D: A new discrete probability distribution with integer support on (−∞,∞). Commun. Stat. Theory Methods. 45, 492–505 (2016).
 Conway, DA: Multivariate distributions with specified marginals (1979). Available at https://statistics.stanford.edu/sites/default/files/OLK%20NSF%20145.pdf. Accessed 10 June 2017.
 Cuadras, CM: Probability distributions with given multivariate marginals and given dependence structure. J. Multivar. Anal. 42, 51–66 (1992).
 Dahiya, R, Korwar, R: On characterizing some bivariate discrete distributions by linear regression. Sankhyā: Indian J. Stat. Series A. 39, 124–129 (1977).
 Dutta, S, Genton, MG: A non-Gaussian multivariate distribution with all lower-dimensional Gaussians and related families. J. Multivar. Anal. 132, 82–93 (2014).
 Fréchet, M: Sur les tableaux de corrélation dont les marges sont données. Ann. Univ. de Lyon Sect. A, Series 3. 14, 53–77 (1951).
 Hoeffding, W: Maßstabinvariante Korrelationstheorie. Schriften Math. Inst. Univ. Berlin. 5, 181–233 (1940).
 Johnson, NL, Kotz, S, Balakrishnan, N: Discrete Multivariate Distributions. Wiley, New York (1997).
 Nguyen, TT, Gupta, AK, Wang, Y: A characterization of certain discrete exponential families. Ann. Inst. Stat. Math. 48, 573–576 (1996).
 Roy, D: The discrete normal distribution. Commun. Stat. Theory Methods. 32, 1871–1883 (2003).
 Ruiz, JM, Navarro, J: Characterization of discrete distributions using expected values. Stat Papers. 36, 237–252 (1995).