The joint density of a Sarmanov–Lee distribution is given by

h(x,y) = f(x) g(y) [1 + ω θ1(x) θ2(y)],   (5)

where f and g are the densities of the marginals F and G, ω is a real parameter, while θ1 and θ2 are measurable functions satisfying the condition

∫ θ1(x) f(x) dx = ∫ θ2(y) g(y) dy = 0

(together with 1 + ω θ1(x) θ2(y) ≥ 0 for all x, y), which serves to ensure that h is a bona fide joint density.
When the marginals F and G are absolutely continuous, setting θ1 = 1 - 2F and θ2 = 1 - 2G shows that FGM is a special case of the Sarmanov–Lee.
Shubina and Lee (2004) showed that the maximum positive correlation of (5) is attained by concentrating all the mass on the (NE–SW) quadrants {(x,y) : (x − x0)(y − y0) ≥ 0}, and the maximum negative correlation by concentrating it on the (NW–SE) quadrants {(x,y) : (x − x0)(y − y0) ≤ 0}, for some real numbers x0 and y0.
The improvement in correlation is substantial: for uniform marginals, the maximum correlation is 3/4, as opposed to 1/3 for the plain FGM and 0.434 for the (k = 2) iterated FGM.
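These numbers are easy to check. As a quick numerical sketch (not from the paper), the FGM density with uniform marginals, h(x,y) = 1 + λ(1 − 2x)(1 − 2y) on [0,1]², yields ρ = λ/3, so the maximum over |λ| ≤ 1 is the 1/3 quoted above:

```python
# Numerical check: for the FGM density with uniform marginals,
# h(x, y) = 1 + lam * (1 - 2x) * (1 - 2y), the correlation equals lam / 3,
# so the maximum correlation over |lam| <= 1 is 1/3.
def fgm_correlation(lam, m=300):
    # Midpoint rule on an m x m grid over the unit square.
    pts = [(i + 0.5) / m for i in range(m)]
    exy = 0.0
    for x in pts:
        for y in pts:
            h = 1.0 + lam * (1.0 - 2.0 * x) * (1.0 - 2.0 * y)
            exy += x * y * h
    exy /= m * m
    cov = exy - 0.25            # E[X] = E[Y] = 1/2
    return cov / (1.0 / 12.0)   # Var(X) = Var(Y) = 1/12 for U(0, 1)

print(fgm_correlation(1.0))     # close to 1/3
```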
To further extend the Sarmanov–Lee family (5), a ‘generalized’ Sarmanov–Lee was proposed by Bairamov et al. (2001),

h(x,y) = f(x) g(y) [1 + α T(F(x), G(y))],   (6)

where the product function θ1(x) θ2(y) in (5) is replaced by T(F(x), G(y)), in which T is an integrable bivariate function on [0,1]², satisfying

∫_0^1 T(u,v) du = 0 for all v ∈ [0,1], and ∫_0^1 T(u,v) dv = 0 for all u ∈ [0,1],   (7)

and the parameter α satisfies

−1 / sup{T(u,v) : (u,v) ∈ D+} ≤ α ≤ −1 / inf{T(u,v) : (u,v) ∈ D−},   (8)

where D+ = {(u,v) : T(u,v) > 0} and D− = {(u,v) : T(u,v) < 0}. Again, the constraints (7) and (8) together guarantee that the function h in (6) is a bona fide bivariate density with marginal densities f and g.
The new class (6) is so rich that it contains members arbitrarily close to H+ and H−, the two extremal Fréchet–Hoeffding distributions in (1). It is thus flexible enough to accommodate nearly the maximum positive (negative) correlation. Some simple examples help demonstrate the idea (more can be found in Lin and Huang 2011).
Example 1.
The generalized Sarmanov–Lee density (3 × 3) with uniform marginals. Consider the partitioning of the unit square into 3 × 3 = 9 subsquares, induced by the lines x, y = 1/3, 2/3. Let the function T be defined by

T(u,v) = 3 Σ_{i=1}^3 1{(i−1)/3 < u ≤ i/3, (i−1)/3 < v ≤ i/3} − 1,  (u,v) ∈ [0,1]²,

and take α = 1 in (6). Then the joint density becomes

h(x,y) = 3 if (x,y) lies in one of the three diagonal subsquares ((i−1)/3, i/3]², i = 1, 2, 3, and h(x,y) = 0 otherwise.

Its correlation is ρ(h) = 8/9, which exceeds the maximum, ρmax = 3/4, of the plain Sarmanov–Lee (with uniform marginals).
Example 2.
The generalized Sarmanov–Lee density (n × n) with uniform marginals. For n = 2, 3, 4, …, let

T_n(u,v) = n Σ_{i=1}^n 1{(i−1)/n < u ≤ i/n, (i−1)/n < v ≤ i/n} − 1,  (u,v) ∈ [0,1]²,

and take α = 1, so that the joint density h_n equals n on the n diagonal subsquares ((i−1)/n, i/n]² and 0 elsewhere. We have the correlation ρ(h_n) = 1 − n^{-2} → 1 as n → ∞.
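A quick numerical check of Examples 1 and 2 (a sketch, not code from the paper): with uniform marginals, E(X_nY_n) = n Σ a_i², where a_i = (2i − 1)/(2n²) is the integral of u over the i-th subinterval, and the correlation comes out to 1 − n^{-2}:

```python
# Correlation of the n x n diagonal-block density with uniform marginals:
# h_n = n on each diagonal subsquare ((i-1)/n, i/n]^2 and 0 elsewhere.
def rho_uniform(n):
    # a_i = integral of u over ((i-1)/n, i/n] = (2i - 1) / (2 n^2)
    a = [(2 * i - 1) / (2.0 * n * n) for i in range(1, n + 1)]
    exy = n * sum(ai * ai for ai in a)   # E[X_n Y_n]
    cov = exy - 0.25                     # E[X] = E[Y] = 1/2
    return cov / (1.0 / 12.0)            # Var = 1/12

print(rho_uniform(3))   # 8/9, as in Example 1
```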
Example 3.
The generalized Sarmanov–Lee density (n × n) with exponential marginals. Let

h_n(x,y) = n e^{-(x+y)} Σ_{i=1}^n 1{ln(n/(n−i+1)) < x ≤ ln(n/(n−i)), ln(n/(n−i+1)) < y ≤ ln(n/(n−i))},  x, y > 0,

where ln(n/0) ≡ ∞. Our calculations show that ρ(h_n) ≈ 1 − 1.08/n → 1 as n → ∞.
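The approximation ρ(h_n) ≈ 1 − 1.08/n can be probed numerically (a sketch; the exact antiderivative of the exponential quantile −ln(1 − u) is our device for the block integrals):

```python
import math

# Correlation of the n x n diagonal-block density with standard
# exponential marginals.  The block integrals of the quantile
# q(u) = -ln(1 - u) come from the antiderivative
# A(u) = (1 - u) ln(1 - u) - (1 - u), with A(1) = 0.
def rho_exponential(n):
    def A(u):
        return 0.0 if u >= 1.0 else (1.0 - u) * math.log(1.0 - u) - (1.0 - u)
    exy = n * sum((A(i / n) - A((i - 1) / n)) ** 2 for i in range(1, n + 1))
    return exy - 1.0   # mean = variance = 1 for Exp(1)

for n in (10, 100, 1000):
    print(n, rho_exponential(n), n * (1.0 - rho_exponential(n)))
# the last column settles near 1.08
```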
The convergence to the maximal correlation is not limited to uniform or exponential marginals; it holds in much more general settings. The following Theorems 1 through 3 are successively stronger. Only Theorem 3 will be proved, since the others appeared in Lin and Huang (2011).
Let F and G be arbitrary distributions (not necessarily identical, nor ‘of the same type’) with densities F′ = f and G′ = g. Define the joint density of (X_n, Y_n) by

h_n(x,y) = n f(x) g(y) Σ_{i=1}^n 1{(i−1)/n < F(x) ≤ i/n, (i−1)/n < G(y) ≤ i/n},   (9)

where x and y range over (ℓ_F, r_F) and (ℓ_G, r_G), in which ℓ_F = inf{x : F(x) > 0} and r_F = sup{x : F(x) < 1} are the left and right extremities of the distribution F, respectively (and similarly for G). It qualifies as a generalized Sarmanov–Lee bivariate density (with marginal densities f and g).
Theorem 1.
For F = G, (9) becomes

h_n(x,y) = n f(x) f(y) Σ_{i=1}^n 1{(i−1)/n < F(x) ≤ i/n, (i−1)/n < F(y) ≤ i/n}.

If F has a finite variance σ², then the correlation is

ρ(h_n) = (n Σ_{i=1}^n a_i² − μ²) / σ²,  where a_i = ∫_{(i−1)/n}^{i/n} F^{-1}(u) du and μ = E(X),

which converges to ρ(H+), which is 1 in this case, as n → ∞.
Theorem 2.
If in (9), X and Y satisfy any of the following conditions:
(i) F^{-1} or G^{-1} is uniformly continuous on (0,1);
(ii) a ≤ X, a′ ≤ Y a.s.;
(iii) X ≤ b, Y ≤ b′ a.s.;
(iv) X ≥ a, Y ≤ b a.s. and F^{-1}, G^{-1} have continuous derivatives,
where a, b, a′, b′ ∈ R, then ρ(h_n) converges to ρ(H+) as n → ∞.
Remarks 1.
Theorem 1 follows immediately from a general result, which is interesting in its own right: For any X ∼ F with finite E(X²),

lim_{n→∞} n Σ_{i=1}^n (∫_{(i−1)/n}^{i/n} F^{-1}(u) du)² = E(X²).
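The general result of Remarks 1, namely n Σ_{i=1}^n (∫_{(i−1)/n}^{i/n} F^{-1}(u) du)² → E(X²), is easy to illustrate numerically. A sketch with a power-function marginal F(x) = x² on [0,1] (our choice, not the paper's), for which F^{-1}(u) = √u and E(X²) = 1/2:

```python
# Illustration of  n * sum_i a_i^2 -> E(X^2), where a_i is the integral
# of F^{-1}(u) over ((i-1)/n, i/n].  Here F(x) = x^2 on [0, 1], so
# F^{-1}(u) = sqrt(u), with antiderivative (2/3) u^{3/2}, and E(X^2) = 1/2.
def block_sum(n):
    A = lambda u: (2.0 / 3.0) * u ** 1.5
    return n * sum((A(i / n) - A((i - 1) / n)) ** 2 for i in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, block_sum(n))   # approaches E(X^2) = 1/2
```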
The proofs of Remarks 1 and Theorem 2 are based on the following two lemmas.
Lemma 1.
(Chebyshev’s inequality for integrals.) Let f1, f2 : (a,b) → R be both increasing or both decreasing, and let p : (a,b) → (0,∞) be an integrable function. Then

∫_a^b f1(x) f2(x) p(x) dx · ∫_a^b p(x) dx ≥ ∫_a^b f1(x) p(x) dx · ∫_a^b f2(x) p(x) dx,

provided that all integrals exist and are finite.
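A numerical sanity check of Lemma 1 (a sketch with arbitrarily chosen f1, f2, p on (0,1); none of these particular functions come from the paper):

```python
import math

# Chebyshev's inequality for integrals on (a, b) = (0, 1):
# f1, f2 both increasing, p a positive integrable weight.
def integral(func, m=2000):
    # midpoint rule on (0, 1)
    return sum(func((i + 0.5) / m) for i in range(m)) / m

f1 = lambda x: x ** 2                         # increasing on (0, 1)
f2 = lambda x: math.exp(x)                    # increasing on (0, 1)
p = lambda x: 1.0 + math.sin(3.0 * x) ** 2    # positive weight

lhs = integral(lambda x: f1(x) * f2(x) * p(x)) * integral(p)
rhs = integral(lambda x: f1(x) * p(x)) * integral(lambda x: f2(x) * p(x))
print(lhs >= rhs)   # True
```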
Remarks 2.
An extension of this inequality can be rephrased in probabilistic terms: Let X be any random variable and let f1, f2 be both increasing or both decreasing, then the covariance Cov(f1(X), f2(X)) = E [f1(X)f2(X)] - E [f1(X)]E [f2(X)] is non-negative, provided that the expectations E[f1(X)], E [f2(X)] and E [f1(X)f2(X)] exist.
Lemma 2.
(Euler–Maclaurin summation formula.) Let m < n be positive integers and let f be a real-valued function on [m,n]. Then we have
(i) if f has a continuous derivative on [m,n],

Σ_{k=m}^n f(k) = ∫_m^n f(x) dx + (f(m) + f(n))/2 + ∫_m^n (x − ⌊x⌋ − 1/2) f′(x) dx;

(ii) if f has a continuous derivative of order 4,

Σ_{k=m}^n f(k) = ∫_m^n f(x) dx + (f(m) + f(n))/2 + (f′(n) − f′(m))/12 + R,

where the remainder R = (1/24) ∫_m^n f^{(4)}(x) h(x) dx, in which h(x) = B4 − B4(x − ⌊x⌋), B4(x) = x⁴ − 2x³ + x² − 1/30 is the Bernoulli polynomial, and B4 = B4(0) = −1/30 the Bernoulli number.
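Part (i) can be verified exactly for a concrete f (a sketch; the choice f(x) = 1/x and the closed-form correction integral are ours):

```python
import math

# Euler-Maclaurin, first order, for f(x) = 1/x on [m, n]:
# sum_{k=m}^{n} f(k) = int_m^n f + (f(m) + f(n)) / 2
#                      + int_m^n (x - floor(x) - 1/2) f'(x) dx.
m_, n_ = 1, 50
lhs = sum(1.0 / k for k in range(m_, n_ + 1))

# The correction integral, computed in closed form on each [k, k+1]
# (f'(x) = -1/x^2; the piecewise antiderivative is exact):
corr = sum(-math.log((k + 1) / k) + (k + 0.5) / (k * (k + 1))
           for k in range(m_, n_))

rhs = math.log(n_ / m_) + (1.0 / m_ + 1.0 / n_) / 2.0 + corr
print(abs(lhs - rhs))   # ~0 (exact identity, up to rounding)
```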
Theorem 3.
For arbitrary marginals F and G with densities F′ = f and G′ = g, we have (i) the distribution H_n of (9) converges weakly to H+ as n → ∞, and (ii) the correlation of H_n converges to that of H+ as n → ∞, provided that F and G have finite variances.
Proof.
Note that the support of (9) is contained in the region bounded by the two curves G(y) = F(x) − 1/n and G(y) = F(x) + 1/n. Therefore, for any (x,y) in the region G(y) > F(x), we have Pr(X_n < x, Y_n > y) = 0 for all large n (namely, n ≥ (G(y) − F(x))^{-1}). This implies that H_n(x,y) = Pr(X_n ≤ x, Y_n ≤ y) = Pr(X_n ≤ x) − Pr(X_n ≤ x, Y_n > y) = Pr(X_n ≤ x) = F(x) = min{F(x), G(y)} = H+(x,y) for all n ≥ (G(y) − F(x))^{-1}. Likewise, for each (x,y) in the region G(y) < F(x), we have H_n(x,y) = G(y) = min{F(x), G(y)} = H+(x,y) for all large n. Finally, for each (x,y) with G(y) = F(x), both H_n(x,y) and H+(x,y) lie in the same interval [(i−1)/n, i/n], where i = ⌈n F(x)⌉, so that |H_n(x,y) − H+(x,y)| ≤ 1/n for all n. All together, we see that in all three cases, H_n(x,y) → H+(x,y) as n → ∞. This proves part (i). Part (ii) follows from the next lemma, which can be proved by using Hölder’s inequality (see, e.g., the proof of Theorem 4 in Dou et al. 2013). □
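For uniform marginals the case analysis is easy to see concretely: H_n(x,y) is a sum of rectangle masses over the diagonal blocks, and it agrees with H+(x,y) = min(x,y) exactly once n is large relative to 1/|F(x) − G(y)| (a sketch under the block form of (9)):

```python
# H_n(x, y) for uniform marginals: mass n on each diagonal block,
# so H_n(x, y) = n * sum_i overlap([0,x], I_i) * overlap([0,y], I_i),
# where I_i = ((i-1)/n, i/n].
def H_n(x, y, n):
    total = 0.0
    for i in range(1, n + 1):
        lo, hi = (i - 1) / n, i / n
        dx = max(0.0, min(x, hi) - lo)
        dy = max(0.0, min(y, hi) - lo)
        total += n * dx * dy
    return total

x, y = 0.3, 0.38
for n in (2, 5, 13):
    print(n, H_n(x, y, n))   # approaches H+(x, y) = min(x, y) = 0.3
```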
Lemma 3.
Let (X_n, Y_n) ∼ H_n be a sequence of bivariate random variables with X_n ∼ F and Y_n ∼ G for all n ≥ 1. Assume that (X_n, Y_n) converges in distribution to (X0, Y0) ∼ H0 as n tends to infinity. If, in addition, E[|X_n|^{p+q}] < ∞ and E[|Y_n|^{p+q}] < ∞ for some positive integers p and q, then

lim_{n→∞} E[X_n^p Y_n^q] = E[X0^p Y0^q].
Remarks 3.
We wondered whether the convergence in Theorem 3(ii) is monotone. Indeed, ρ(H_m) ≥ ρ(H_n) if m is a multiple of n (say, m = kn). To see this, write from (9)

E(X_m Y_m) = m Σ_{j=1}^m a_j b_j,  where a_j = ∫_{(j−1)/m}^{j/m} F^{-1}(u) du and b_j = ∫_{(j−1)/m}^{j/m} G^{-1}(u) du.

Grouping the m = kn subintervals into n consecutive blocks of k, we have, for each i = 1, …, n,

k Σ_{j=(i−1)k+1}^{ik} a_j b_j ≥ (Σ_{j=(i−1)k+1}^{ik} a_j)(Σ_{j=(i−1)k+1}^{ik} b_j)

by Chebyshev’s sum inequality (see, e.g., Dou et al. 2013, Lemma 1), because both a_j and b_j are increasing in j. Summing over i (and noting that the a_j within the i-th block sum to ∫_{(i−1)/n}^{i/n} F^{-1}(u) du, and similarly for the b_j) gives E(X_m Y_m) ≥ E(X_n Y_n), and hence ρ(H_m) ≥ ρ(H_n). That the sequence {ρ(H_n)} fails to be monotonically increasing in n can be seen from the following counterexample. Let f = g, with f(x) = 1/6 if 0 ≤ a ≤ |x| ≤ a + 3, and f(x) = 0 otherwise. We have ρ(H_3) < ρ(H_2) for all sufficiently large a.
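The counterexample can be explored numerically. In the sketch below (our computation, not the paper's), the quantile of the stated density is q(u) = 6u − a − 3 for u ≤ 1/2 and q(u) = 6u − 3 + a for u > 1/2; the inequality ρ(H_3) < ρ(H_2) turns out to hold for larger a (the crossover in this computation is near a ≈ 0.72) and to reverse for small a:

```python
# rho(H_n) for the symmetric two-interval density f = 1/6 on a <= |x| <= a+3.
# Quantile: q(u) = 6u - a - 3 for u <= 1/2, q(u) = 6u - 3 + a for u > 1/2.
def rho(n, a, samples=20000):
    def q(u):
        return 6.0 * u - a - 3.0 if u <= 0.5 else 6.0 * u - 3.0 + a
    block = []
    for i in range(n):
        # midpoint approximation of the integral of q over (i/n, (i+1)/n]
        s = sum(q((i + (j + 0.5) / samples) / n) for j in range(samples))
        block.append(s / (n * samples))
    exy = n * sum(b * b for b in block)   # E[X_n Y_n]
    ex2 = a * a + 3.0 * a + 3.0           # E[X^2]; the mean is 0 by symmetry
    return exy / ex2

print(rho(2, 2.0), rho(3, 2.0))   # rho(H_3) < rho(H_2) here
print(rho(2, 0.1), rho(3, 0.1))   # but rho(H_3) > rho(H_2) here
```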
For certain marginals, the rate of convergence in Theorem 3(ii) can be determined (Lin and Huang 2011, and Theorem 4 below). The calculation is based on the following lemma; note also that the correlation of two random variables is location-scale invariant.
Lemma 4.
For positive integer n > 1, we have, as n → ∞: (i) Σ_{k=1}^n k ln k = (n²/2 + n/2 + 1/12) ln n − n²/4 + R1(n), where R1(n) = O(1); (ii) Σ_{k=1}^n k²(ln k)² admits an analogous expansion with remainder R2(n) = O(1); (iii) Σ_{k=1}^n k(k+1)(ln k) ln(k+1) admits an analogous expansion with remainder R3(n) = O(1).
Remarks 4.
Recall that the Euler constant γ = lim_{n→∞} (Σ_{k=1}^n 1/k − ln n) ≈ 0.57722. Interestingly, each of the three remainder terms in Lemma 4 also converges to a real constant as n → ∞, say R_i = lim_{n→∞} R_i(n), i = 1, 2, 3. In fact, R1 = ln A ≈ 0.24875, where A ≈ 1.28243 is the so-called Glaisher–Kinkelin constant, and ln A can be represented as

ln A = 1/4 + (1/24) ∫_1^∞ f1^{(4)}(x) h(x) dx,

where f1(x) = x ln x and h(x) = B4 − B4(x − ⌊x⌋). Similarly, R2 and R3 admit analogous integral representations in terms of f2(x) = x²(ln x)² and f3(x) = x(x + 1)(ln x) ln(x + 1).
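The expansion in Lemma 4(i) and the limit R1 = ln A can be checked directly (a sketch based on the classical definition of the Glaisher–Kinkelin constant):

```python
import math

# R1(n) = sum_{k=1}^{n} k ln k - [(n^2/2 + n/2 + 1/12) ln n - n^2/4],
# which converges to ln A ~ 0.24875 (A ~ 1.28243, Glaisher-Kinkelin).
def R1(n):
    s = math.fsum(k * math.log(k) for k in range(1, n + 1))
    main = (n * n / 2.0 + n / 2.0 + 1.0 / 12.0) * math.log(n) - n * n / 4.0
    return s - main

for n in (10, 100, 10000):
    print(n, R1(n))   # tends to 0.24875...
```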
Theorem 4.
(i) If F is uniform and G is a power-function distribution, then the convergence rate of ρ(h_n) in (9) is 1/n² as n → ∞. (ii) If F = U(0,1) and G is exponential, then the convergence rate of ρ(h_n) is (ln n)/n² as n → ∞. (iii) If F = U(0,1) and G is logistic, then the convergence rate of ρ(h_n) is (ln n)/n² as n → ∞; more precisely, ρ(h_n) = 3/π − π^{-1}(ln n)/n² + O(n^{-2}) as n → ∞. (iv) If F = G is exponential, then ρ(h_n) = 1 + O(n^{-1}) as n → ∞.
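Part (ii) can be probed numerically. Under the block density (9) with F = U(0,1) and G = Exp(1), ρ(H+) = Corr(U, −ln(1 − U)) = √3/2, and the gap ρ(H+) − ρ(h_n) shrinks at rate (ln n)/n² (a sketch using exact block integrals):

```python
import math

# rho(h_n) for F = U(0, 1) and G = Exp(1):
# uniform block integrals a_i = (2i - 1) / (2 n^2), and exponential
# block integrals from A(u) = (1 - u) ln(1 - u) - (1 - u), A(1) = 0.
def rho_unif_exp(n):
    def A(u):
        return 0.0 if u >= 1.0 else (1.0 - u) * math.log(1.0 - u) - (1.0 - u)
    exy = n * sum((2 * i - 1) / (2.0 * n * n) * (A(i / n) - A((i - 1) / n))
                  for i in range(1, n + 1))
    cov = exy - 0.5                       # E[X] = 1/2, E[Y] = 1
    return cov / math.sqrt(1.0 / 12.0)    # sd(X) = 1/sqrt(12), sd(Y) = 1

target = math.sqrt(3.0) / 2.0             # rho(H+) = sqrt(3)/2
for n in (10, 100, 400):
    print(n, target - rho_unif_exp(n))    # gap shrinks like (ln n) / n^2
```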
It is interesting to characterize the FGM and Sarmanov–Lee distributions by minimizing the χ² divergence. Let h be the joint density of X and Y with marginal densities f = F′ and g = G′. Define the χ² divergence (distance) between the joint density h and the product density fg (of independent random variables) by

χ²(h, fg) = ∫_{S_F} ∫_{S_G} [h(x,y) − f(x) g(y)]² / [f(x) g(y)] dy dx,   (10)

where S_F and S_G are the supports of F and G, respectively.
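For intuition about (10): for the FGM family with uniform marginals, h − fg = λ(1 − 2x)(1 − 2y), so the divergence factorizes and equals λ²(1/3)² = λ²/9 (a sketch, not a computation from the paper):

```python
# chi^2 divergence between the FGM density with uniform marginals and the
# independence density: (h - fg)^2 / (fg) = lam^2 (1 - 2x)^2 (1 - 2y)^2,
# which factorizes; each factor integrates to 1/3, so the divergence is lam^2/9.
def chi2_fgm(lam, m=2000):
    s = sum((1.0 - 2.0 * (i + 0.5) / m) ** 2 for i in range(m)) / m
    return lam * lam * s * s

print(chi2_fgm(1.0))   # close to 1/9
```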
Nelsen (1994) obtained a characterization of the FGM distributions by minimizing the χ² divergence (10). Huang and Lin (2011) extended Nelsen’s (1994) result to the case of Sarmanov–Lee distributions. For i = 1, 2, consider the functions θ_i satisfying
(11)
where h_i ∈ (0,1]. Then we have the following.
Theorem 5.
Among all absolutely continuous bivariate distributions with marginal densities f = F′ and g = G′, the one whose joint density is closest to the product density of independent random variables (in the sense of minimizing the χ² divergence), subject to the constraints given in (11), is the Sarmanov–Lee distribution with joint density of the form (5).
The Sarmanov–Lee distribution and its generalization have been used in actuarial science, financial markets, electrical engineering and quantum statistical mechanics (see the references in Lin and Huang 2011). For example, Hernández-Bastida and Fernández-Sánchez (2012) applied a Sarmanov–Lee family to the Bayes premium in a collective risk model. In the analysis of longitudinal data, Cole et al. (1995) used a Sarmanov–Lee bivariate distribution for transition probabilities in a two-state Markov model and developed an empirical Bayes estimation methodology. Recently, Pelican and Vernic (2013a; 2013b) studied the parameter estimation problems for the Sarmanov–Lee distribution.