# Joint distribution of rank statistics considering the location and scale parameters and its power study

## Abstract

The ranking method used for testing the equivalence of two distributions has been studied for decades and is widely adopted for its simplicity. However, due to the complexity of the calculations, the power of the test is either estimated by a normal approximation or found only when an appropriate alternative is given. Here, via the finite Markov chain imbedding technique, we establish the marginal and joint distributions of the rank statistics considering the shift and scale parameters, respectively and simultaneously, under two different continuous distribution functions. Furthermore, the procedures of the distribution equivalence tests and their power functions are discussed. Numerical results for a joint distribution of rank statistics under the standard normal distribution are presented, along with the powers for a sequence of alternative normal distributions with means from −20 to 20 and standard deviations from 1 to 9 and their reciprocals. In addition, we discuss the powers of the rank statistics under the Lehmann alternatives.

### 2010 Mathematics Subject Classification

Primary 62G07; Secondary 62G10

## 1 Introduction

Suppose that, on the basis of observations X1,…,X m ; Y1,…,Y n from the cumulative distribution functions F and G, respectively, two major topics in hypothesis testing are to test the equivalence of either the center or the dispersion of the two populations of interest. The hypotheses are stated, for some θ ≠ 0, as

$H_o: F(x) = G(x) \quad \text{versus} \quad H_a: F(x) = G(x - \theta), \quad \text{for all } x,$

which is known as the shift alternative and, for some σ≠1,

$H_o: F(x) = G(x) \quad \text{versus} \quad H_a: F(x) = G(x\sigma^{-1}), \quad \text{for all } x.$

Wilcoxon (1945) proposed the ranking method for testing the significance of the difference between two population means, now known as the Wilcoxon rank-sum test, and defined a statistic W Y as the sum of the ranks of the y's in the combined and ordered sequence of x's and y's, which is equivalent to

$\sum_{j=1}^{n} \#\left\{\, x_i : x_i < y_j \,\right\} + \frac{n(n+1)}{2}.$
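As a quick sanity check of this equivalence, the rank-sum can be computed both from the ranks and from the counting form above (a sketch; the helper names are ours, and ties are assumed absent since the distributions are continuous):

```python
# Check that the rank-sum of the y's equals the counting form
# sum_j #{x_i < y_j} + n(n+1)/2.
def wilcoxon_rank_sum(x, y):
    """Sum of the ranks of the y's in the combined ordered sample."""
    combined = sorted(x + y)
    return sum(combined.index(v) + 1 for v in y)   # no ties assumed

def counting_form(x, y):
    """sum_j #{x_i < y_j} + n(n+1)/2."""
    n = len(y)
    return sum(1 for xi in x for yj in y if xi < yj) + n * (n + 1) // 2

x = [0.3, 1.7, 2.2]
y = [0.9, 2.5]
assert wilcoxon_rank_sum(x, y) == counting_form(x, y)
```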

Mann and Whitney (1947) introduced an elaboration of the ranking test, proposed the statistic $U_X = mn - W_Y + \frac{n(n+1)}{2}$, and proved that the limiting distribution of the test statistic U X is

$\frac{U_X - E(U_X)}{\sqrt{\text{Var}(U_X)}} \stackrel{L}{\to} N(0,1)$

as m and n go to infinity in any arbitrary manner where

$E(U_X) = mnp_1$

and

$\text{Var}(U_X) = mnp_1(1 - p_1) + mn(n-1)\left(p_2 - p_1^2\right) + mn(m-1)\left(p_3 - p_1^2\right),$

with

$\begin{array}{lcl} p_1 & = & P(X > Y), \\ p_2 & = & P(X > Y \ \text{and}\ X > Y'), \\ p_3 & = & P(X > Y \ \text{and}\ X' > Y), \end{array}$
(1)

where X, X′ and Y, Y′ are independently distributed, X and X′ with distribution F, and Y and Y′ with distribution G. Intuitively, the power for the right-sided test can be found as

$P\left( \frac{U_X - E(U_X)}{\sqrt{\text{Var}(U_X)}} > \frac{c - E(U_X)}{\sqrt{\text{Var}(U_X)}} \,\middle|\, H_a \right),$
(2)

where c is the value such that

$\Phi\left( \frac{c - \frac{1}{2}mn}{\sqrt{\frac{1}{12}mn(m+n+1)}} \,\middle|\, H_o \right) \ge 1 - \alpha.$
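The normal-approximation power calculation in Equations (1) and (2) can be sketched numerically. The following is an illustrative sketch, not the paper's method: it takes F = N(θ, 1) against G = N(0, 1), estimates p1, p2, p3 by Monte Carlo, and plugs them into the approximation; all function and variable names are ours.

```python
# Normal-approximation power of the right-sided test, Equation (2),
# with p1, p2, p3 of Equation (1) estimated by Monte Carlo.
import random
from math import sqrt
from statistics import NormalDist

def approx_power(m, n, theta, alpha=0.05, N=50_000, seed=1):
    rng = random.Random(seed)
    h1 = h2 = h3 = 0
    for _ in range(N):
        x, x2 = rng.gauss(theta, 1), rng.gauss(theta, 1)
        y, y2 = rng.gauss(0, 1), rng.gauss(0, 1)
        h1 += x > y                       # p1 = P(X > Y)
        h2 += (x > y) and (x > y2)        # p2 = P(X > Y and X > Y')
        h3 += (x > y) and (x2 > y)        # p3 = P(X > Y and X' > Y)
    p1, p2, p3 = h1 / N, h2 / N, h3 / N
    mean_a = m * n * p1                   # E(U_X) under H_a
    var_a = (m * n * p1 * (1 - p1)
             + m * n * (n - 1) * (p2 - p1 ** 2)
             + m * n * (m - 1) * (p3 - p1 ** 2))
    # Critical value c from the null approximation N(mn/2, mn(m+n+1)/12).
    null = NormalDist(m * n / 2, sqrt(m * n * (m + n + 1) / 12))
    c = null.inv_cdf(1 - alpha)
    return 1 - NormalDist(mean_a, sqrt(var_a)).cdf(c)

power_null = approx_power(10, 10, theta=0.0)   # should be close to alpha
power_shift = approx_power(10, 10, theta=1.0)  # appreciably larger
```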

Over the years, there have been studies on finding the exact or approximate power for the rank-sum test. By choosing an appropriate alternative distribution function, Shieh et al. (2006) derived the exact power for the uniform, normal, double exponential and exponential shift models. Rosner and Glynn (2009) discussed power against the family of alternatives of the form

$\Phi^{-1}\left(F_Y(y)\right) = \Phi^{-1}\left(F_X(y)\right) + \mu \quad \text{for some } \mu \ne 0,$

where the underlying distributions F X and F Y are normal. Collings and Hamilton (1988) presented a bootstrap method to find the empirical distribution functions in order to approximate the power against the shift alternative. Lehmann (1953) derived the power function as

$P\left(S_1 = s_1, S_2 = s_2, \cdots, S_n = s_n\right) = \frac{k^n}{\binom{m+n}{m}} \prod_{j=1}^{n} \frac{\Gamma(s_j + jk - j)}{\Gamma(s_j)} \frac{\Gamma(s_{j+1})}{\Gamma(s_{j+1} + jk - j)},$

where s j is the rank of y j in the combined samples for the alternative hypothesis of

$G_Y(x) = F_X(x)^k \quad \text{for all } x,$

where k is a positive integer. However, Lehmann (1998) pointed out that the power function of the rank-sum test, Equation (2), is only qualitative, since the numerical evaluation of the probabilities in Equation (1) is considerably complicated when F and G are continuous distributions with F ≠ G.
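For integer k, the Lehmann alternative G_Y = F_X^k is the distribution of the maximum of k i.i.d. draws from F_X, so Y can be simulated by inverse transform: if U ~ Uniform(0,1), then F_X^{-1}(U^{1/k}) has CDF F_X^k. A small simulation sketch (helper names ours) shows the rank-sum of the y's drifting above its null expectation:

```python
# Sample from the Lehmann alternative G(x) = F(x)^k with F = N(0,1),
# and watch the average rank-sum of the y's exceed its null value.
import random
from statistics import NormalDist

def lehmann_sample(k, size, rng):
    # If U ~ Uniform(0,1), then F^{-1}(U ** (1/k)) has CDF F(x)^k.
    return [NormalDist().inv_cdf(rng.random() ** (1.0 / k))
            for _ in range(size)]

def rank_sum_y(x, y):
    combined = sorted(x + y)
    return sum(combined.index(v) + 1 for v in y)

rng = random.Random(7)
m, n, k = 8, 8, 3
w_vals = [rank_sum_y([rng.gauss(0, 1) for _ in range(m)],
                     lehmann_sample(k, n, rng))
          for _ in range(2000)]
w_bar = sum(w_vals) / len(w_vals)
# Under H_o (k = 1), E(W_Y) = n(m+n+1)/2 = 68 here; under k = 3 the y's
# are stochastically larger, so the average rank-sum is pulled above 68.
```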

As the rank-sum test is widely adopted for testing the difference in centers of two distributions, it is natural to study the efficiency of a rank-sum test for variability (Ansari and Bradley 1960). For decades, studies have focused on proposing new definitions of the rank statistic and using the methods of Chernoff and Savage to show the relative efficiency of the proposed statistic to the F-test; see, for example, Mood (1954), Siegel and Tukey (1960), Ansari and Bradley (1960), and Klotz (1962). Ansari and Bradley (1960) mentioned that if the means of the X and Y samples cannot be considered equal, differences in location have a severe impact on all tests of dispersion. Klotz (1962) showed that the power of a rank test can be found by integrating the joint density of the X and Y samples over the part of the (m+n)-dimensional space defined by the alternative orderings that lie in the critical region of the test, a computation whose conditions are very strict.

Our approach aims at relaxing some of the conditions required to find the distribution of the proposed rank statistic. We systematically imbed the random vector U n into a Markov chain to derive the marginal and joint distributions of the rank statistics considering the shift and scale parameters, respectively, under any two continuous distribution functions. A joint distribution of rank statistics has, to the best of our knowledge, not been studied in the literature. The main strength of the finite Markov chain imbedding approach (FMCI) is that it yields the distribution of the rank statistic without imposing conditions. Therefore, under the null hypothesis of F=G, we are able to identify a proper critical region and, under the alternative assumption, the power of the test can be determined naturally. The distribution of the random vector U n , independent of the form of the distribution function F, is also demonstrated under the null hypothesis of distribution equivalence.

The main contributions of this paper are as follows. In Section 2.1, we introduce the procedures for deriving the distribution of the rank statistic considering the shift parameter, and its power function, by using FMCI. The procedures are general and can be applied either to two identical distribution functions of interest or to two different continuous density functions. In Section 2.2, we address the steps for finding the distribution of the rank statistic considering the scale parameter and its power function. In Section 2.3, we derive the joint distribution of the rank statistics considering the location and scale parameters simultaneously, as well as its power function. Numerical results for a joint distribution, and some powers of the rank statistics against the shift and scale parameters, individually and simultaneously, are presented in Section 3. We also discuss the powers of the rank statistics under the Lehmann alternatives. We end this paper with a short conclusion in Section 4.

## 2 Methods

### 2.1 Distributions of the rank statistic in the shift case

Let {X1,…,X m } and {Y1,…,Y n } be two independent samples from the continuous cumulative distribution functions F(x) and G(x−θ), respectively. Given x={x1,…,x m }, where x[i] denotes the ith smallest value in the sample, we have

$p_i = P\left( x_{[i-1]} \le Y < x_{[i]} \right) = G(x_{[i]}) - G(x_{[i-1]}),$

for i=1,2,…,m+1, where $x_{[0]} = -\infty$ and $x_{[m+1]} = \infty$. Therefore, we define the sampling distribution of Y over the (m+1) intervals as

$\mathbf{p} = \left( G(x_{[1]}) - G(x_{[0]}), \dots, G(x_{[m+1]}) - G(x_{[m]}) \right) = (p_1, p_2, \dots, p_{m+1}).$
(3)
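Computationally, the vector p in Equation (3) is just the G-mass of the m+1 intervals cut by the ordered x sample. A minimal sketch, with G taken as the standard normal CDF purely for illustration:

```python
# Compute p = (p_1, ..., p_{m+1}) of Equation (3) for a given x sample
# and CDF G; the cells are (-inf, x_[1]), [x_[1], x_[2]), ..., [x_[m], inf).
from math import isclose
from statistics import NormalDist

def interval_probs(x, G):
    xs = sorted(x)                               # x_[1] <= ... <= x_[m]
    Gv = [0.0] + [G(v) for v in xs] + [1.0]      # G(-inf) = 0, G(inf) = 1
    return [Gv[i] - Gv[i - 1] for i in range(1, len(Gv))]

p = interval_probs([-0.5, 0.0, 1.2], NormalDist().cdf)
assert len(p) == 4 and isclose(sum(p), 1.0)      # m + 1 cells, total mass 1
```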

Given m, for t=1,2,…,n, let

$\Omega_t = \left\{ \mathbf{u}_t = (u_1(t), \cdots, u_{m+1}(t)) : \sum_{i=1}^{m+1} u_i(t) = t \ \text{and}\ u_i(t) \ge 0,\ i=1,\dots,m+1 \right\},$

where u i (t) is the number of y's in the interval [ x[i−1],x[i]) among y1,…,y t . For each u n =(u1(n),…,um+1(n)), we have a corresponding rank-sum of the y's in the combined sample:

$R_l(\mathbf{U}_n = \mathbf{u}_n | \mathbf{X}) = \frac{\sum_{i=1}^{m+1} u_i^2(n) + \sum_{i=1}^{m+1} u_i(n)}{2} + \sum_{i=1}^{m} \left( u_i(n) + 1 \right) \left( \sum_{j=i+1}^{m+1} u_j(n) \right).$
(4)
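Equation (4) can be checked directly against the rank-sum computed from raw data; the sketch below (helper names ours) builds the cell counts u_n and compares the two:

```python
# Verify Equation (4): R_l computed from the cell counts u_n equals the
# Wilcoxon rank-sum of the y's computed from the combined sample.
def R_l(u):
    cells = len(u)                            # m + 1 cells
    quad = (sum(ui * ui for ui in u) + sum(u)) / 2
    cross = sum((u[i] + 1) * sum(u[i + 1:]) for i in range(cells - 1))
    return quad + cross

def counts(x, y):
    """u_i(n) = number of y's in [x_[i-1], x_[i]), i = 1, ..., m+1."""
    xs = sorted(x)
    u = [0] * (len(xs) + 1)
    for v in y:
        u[sum(1 for w in xs if w <= v)] += 1
    return u

def rank_sum_y(x, y):
    combined = sorted(x + y)
    return sum(combined.index(v) + 1 for v in y)

x, y = [0.3, 1.7, 2.2], [0.9, 2.5, 2.6]
assert R_l(counts(x, y)) == rank_sum_y(x, y)
```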

#### Theorem 1

The statistic R l is equivalent to the statistic W Y introduced by Wilcoxon (1945).

#### Proof

Let

$I(x_i, y_j) = \begin{cases} 1 & \text{if } x_i < y_j \\ 0 & \text{otherwise}. \end{cases}$

The rank statistic W Y , the sum of the ranks of the y observations, can be determined by

$\begin{array}{lcl} \sum_{j=1}^{n} \left( \sum_{i=1}^{m} I(x_i, y_j) + j \right) & = & \sum_{j=1}^{n} \sum_{i=1}^{m} I(x_i, y_j) + \sum_{j=1}^{n} j \\ & = & \sum_{i=1}^{m} \sum_{j=1}^{n} I(x_i, y_j) + \frac{n(n+1)}{2}. \end{array}$
(5)

The inner sum over j in the first term of Equation (5), evaluated at the ordered value x[i], counts the y observations larger than x[i], which is $\sum_{j=i+1}^{m+1} u_j(n)$ in our notation. It is not difficult to see that $\sum_{i=1}^{m+1} u_i(n)$ equals n, the size of the y sample. Therefore, the equation can be rewritten as

$\sum_{i=1}^{m} \left( \sum_{j=i+1}^{m+1} u_j(n) \right) + \frac{\sum_{i=1}^{m+1} u_i^2(n) + 2 \sum_{i=1}^{m} u_i(n) \left( \sum_{j=i+1}^{m+1} u_j(n) \right) + \sum_{i=1}^{m+1} u_i(n)}{2}.$

It is then easy to see that

$\sum_{i=1}^{m} \left( u_i(n) + 1 \right) \left( \sum_{j=i+1}^{m+1} u_j(n) \right) + \frac{\sum_{i=1}^{m+1} u_i^2(n) + \sum_{i=1}^{m+1} u_i(n)}{2} = R_l.$

Next, we demonstrate that for two random samples from the same population, the distribution of the random vector U n is independent of the form of the distribution function.

#### Theorem 2

Distribution-free property of U n .

$P(\mathbf{U}_n = \mathbf{u}_n | H_o) = \frac{1}{\text{Card}(\Omega_n)} = \frac{1}{\binom{m+n}{n}}.$
(6)

#### Proof

We know that the joint density of the ordered x sample is given by

$f(x_{[1]}, \dots, x_{[m]}) = m! \prod_{i=1}^{m} f(x_i)$

and, when F=G, the conditional probability of the random vector U n given X=(x1,x2,…,x m ) is

$P\left(\mathbf{U}_n = \mathbf{u}_n | x_1, x_2, \dots, x_m\right) = \frac{n!}{\prod_{i=1}^{m+1} u_i(n)!} \prod_{i=1}^{m+1} \left( \int_{x_{[i-1]}}^{x_{[i]}} f(y)\, dy \right)^{u_i(n)},$
(7)

where $x_{[0]} = -\infty$ and $x_{[m+1]} = \infty$. By taking the expected value of the conditional probability, we have

$\begin{array}{l} P\left(\mathbf{U}_n = \mathbf{u}_n | H_o\right) \\ = {\displaystyle \int \cdots \int_{-\infty \le x_{[1]} \le \cdots \le x_{[m]} \le \infty}} P\left(\mathbf{u}_n | x_1, \dots, x_m\right) f\left(x_{[1]}, \dots, x_{[m]}\right) dx_{[1]} \cdots dx_{[m]} \\ = {\displaystyle \int_{-\infty}^{\infty} \int_{x_{[1]}}^{\infty} \cdots \int_{x_{[m-1]}}^{\infty}} \frac{n!}{\prod_{i=1}^{m+1} u_i(n)!} \left(F(x_{[1]})\right)^{u_1(n)} \left(F(x_{[2]}) - F(x_{[1]})\right)^{u_2(n)} \cdots \left(1 - F(x_{[m]})\right)^{u_{m+1}(n)} m!\, dF(x_{[1]}) \cdots dF(x_{[m]}). \end{array}$
(8)

Using a change of variables, it is clear that the random variables F(x[1]),…,F(x[m]) have a Dirichlet distribution with parameters u1(n)+1, u2(n)+1, …, um+1(n)+1. Therefore, we have

$P(\mathbf{U}_n = \mathbf{u}_n | H_o) = \frac{n!\, m!}{(n+m)!} = \frac{1}{\text{Card}(\Omega_n)},$

which is independent of the distribution function.
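As a numerical check of Equation (7): given the x sample, U_n is multinomial with cell probabilities p_i, so the conditional probabilities over all of Ω_n must sum to one. A sketch (G = N(0,1) purely as an illustration; helper names ours):

```python
# Conditional probability of U_n = u_n given the x sample, Equation (7):
# a multinomial with cell probabilities p_i = G(x_[i]) - G(x_[i-1]).
from math import factorial, prod, isclose
from statistics import NormalDist

def cond_prob(u, x, G):
    xs = sorted(x)
    Gv = [0.0] + [G(v) for v in xs] + [1.0]
    p = [Gv[i] - Gv[i - 1] for i in range(1, len(Gv))]
    coef = factorial(sum(u)) // prod(factorial(ui) for ui in u)
    return coef * prod(pi ** ui for pi, ui in zip(p, u))

# Sum over all u with u_1 + u_2 + u_3 = n: total probability must be 1.
x, n = [-0.5, 0.8], 3
total = sum(cond_prob((a, b, n - a - b), x, NormalDist().cdf)
            for a in range(n + 1) for b in range(n + 1 - a))
assert isclose(total, 1.0)
```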

This is why the distribution of the random vector U n is distribution-free under the null hypothesis. However, the random vector U n is discrete uniform, with mass function equal to one over the number of its possible outcomes, only when F=G; in other words, its distribution can be found by traditional combinatorial analysis only when F=G. When F≠G, we cannot establish the distribution of U n through Equation (7), since solving the multiple integral in Equation (8) is tedious at best for suitable alternative distribution functions and intractable in general. To our understanding, finding the power of the test has therefore remained unsolved in most cases. To overcome this, we bring in the finite Markov chain imbedding approach.

Let Ω t , t=0,1,…,n, be the state space with $\binom{m+t}{t}$ possible states, let Γ n ={0,1,…,n} be an index set, and let {Z t : t ∈ Γ n } be a non-homogeneous Markov chain on the state spaces Ω t . For t=1,…,n, the transition probability matrix M t of this chain is

${\mathbf{M}_t = \left[\; p_{\mathbf{u}_{t-1}, \mathbf{u}_t} \;\right]}_{\binom{m+t-1}{t-1} \times \binom{m+t}{t}},$ with rows indexed by the states of Ω t−1 and columns by the states of Ω t ,

where

$p_{\mathbf{u}_{t-1}, \mathbf{u}_t} = P(Z_t = \mathbf{u}_t | Z_{t-1} = \mathbf{u}_{t-1}) = \begin{cases} p_i & \text{if } u_i(t-1) + 1 = u_i(t) \ \text{and}\ u_j(t-1) = u_j(t)\ \forall\, j \ne i, \\ 0 & \text{otherwise}, \end{cases}$

and p i is defined in Equation (3).
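The forward pass ξ M_1 ⋯ M_n can be sketched with a sparse dictionary standing in for the explicit matrices M_t: states at step t are the compositions of t into m+1 cells, each transition adds one y to cell i with probability p_i, and lumping the step-n states by their rank value gives the conditional distribution of R_l. This is our own illustrative implementation, not the authors' code:

```python
# FMCI forward pass for P(R_l = r | X), given the cell probabilities p.
from collections import Counter
from math import isclose

def rank_distribution(p, n):
    cells = len(p)                            # m + 1 cells
    probs = {tuple([0] * cells): 1.0}         # xi: mass 1 on u_0 = (0,...,0)
    for _ in range(n):                        # one step per y observation
        nxt = Counter()
        for u, pr in probs.items():           # each transition increments
            for i, pi in enumerate(p):        # one cell, with probability p_i
                v = list(u)
                v[i] += 1
                nxt[tuple(v)] += pr * pi
        probs = nxt
    dist = Counter()                          # lump states into partition {C_r}
    for u, pr in probs.items():
        quad = (sum(ui * ui for ui in u) + sum(u)) / 2
        cross = sum((u[i] + 1) * sum(u[i + 1:]) for i in range(cells - 1))
        dist[quad + cross] += pr              # rank value from Equation (4)
    return dict(dist)

dist = rank_distribution(p=(0.2, 0.3, 0.5), n=3)     # m = 2, n = 3
assert isclose(sum(dist.values()), 1.0)
assert min(dist) == 6 and max(dist) == 12            # n(n+1)/2, n(2m+n+1)/2
```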

#### Theorem 3

R l (U n |X) is finite Markov chain imbeddable, and

$P(R_l(\mathbf{U}_n) = r | \mathbf{X}) = \boldsymbol{\xi} \left( \prod_{t=1}^{n} \mathbf{M}_t \right) \mathbf{B}'(C_r),$

where $\mathbf{B}(C_r) = \sum_{k: R_l(\mathbf{u}_n) = r} e_k$, e k is a $1 \times \binom{m+n}{n}$ unit row vector corresponding to state u n , ξ is the initial distribution, which puts probability one on the single state of Ω0, and M t , t=1,…,n, are the transition probability matrices of the imbedded Markov chain defined on the state spaces Ω t .

#### Proof

For each u n =(u1(n),…,um+1(n)) in the state space Ω n , we have a corresponding rank R l as shown in Equation (4). Intuitively, the minimum rank r ls is n(n+1)/2 and the maximum rank r lb is n(2m+n+1)/2. In accordance with the possible values of the rank R l , we define a finite partition {C r : r=r ls ,…,r lb } such that

$P(Z_n \in C_r | \mathbf{p}) = \boldsymbol{\xi} \left( \prod_{t=1}^{n} \mathbf{M}_t \right) \mathbf{B}'(C_r),$
(9)

where $\mathbf{B}(C_r) = \sum_{k: R_l(\mathbf{u}_n) = r} e_k$ and e k is a $1 \times \binom{m+n}{n}$ unit row vector corresponding to state u n . We then obtain the conditional probability of the rank R l .

The Law of Large Numbers is then used to determine the probability of U n for any continuous F and G:

$\frac{1}{N} \sum_{i=1}^{N} P(\mathbf{U}_n = \mathbf{u}_n | \mathbf{X}_i) \stackrel{p}{\to} P(\mathbf{U}_n = \mathbf{u}_n),$

where X i is the ith sample of size m from the distribution function F. It is easy to see that

$P(R_l(\mathbf{U}_n) = r) = \sum_{\mathbf{u}_n : R_l(\mathbf{u}_n) = r} P(\mathbf{U}_n = \mathbf{u}_n).$
(10)
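The averaging step can be illustrated by simulation: under F = G, the empirical law of the rank-sum over many (x, y) pairs should approach the exact null law obtained by enumerating the equally likely placements of Theorem 2. A sketch (sample sizes and tolerances are ours):

```python
# Compare the simulated law of the rank-sum under F = G = N(0,1) with the
# exact null law from enumerating the C(m+n, n) y-position placements.
import random
from itertools import combinations
from collections import Counter

m, n, N = 3, 2, 20_000
rng = random.Random(3)

emp = Counter()
for _ in range(N):
    z = ([(rng.gauss(0, 1), 'x') for _ in range(m)]
         + [(rng.gauss(0, 1), 'y') for _ in range(n)])
    z.sort()                                   # combined ordered sample
    emp[sum(i + 1 for i, (_, lab) in enumerate(z) if lab == 'y')] += 1

# Exact null: every choice of y-positions is equally likely (Theorem 2).
exact = Counter()
for pos in combinations(range(1, m + n + 1), n):
    exact[sum(pos)] += 1
total = sum(exact.values())

for r in exact:
    assert abs(emp[r] / N - exact[r] / total) < 0.02
```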

To test

$H_o: F(x) = G(x) \quad \text{versus} \quad H_a: F(x) = G(x - \theta),$

for some θ≠0, the power function is approximated by

$\begin{array}{l} P(R_l(\mathbf{U}_n) \le r_{1\alpha} | H_a) + P(R_l(\mathbf{U}_n) \ge r_{2\alpha} | H_a) \\ = \sum_{r=r_{ls}}^{r_{1\alpha}} P(R_l(\mathbf{U}_n) = r | H_a) + \sum_{r=r_{2\alpha}}^{r_{lb}} P(R_l(\mathbf{U}_n) = r | H_a) \\ = \sum_{r=r_{ls}}^{r_{1\alpha}} \sum_{\mathbf{u}_n : R_l(\mathbf{u}_n) = r} P(\mathbf{U}_n = \mathbf{u}_n | H_a) + \sum_{r=r_{2\alpha}}^{r_{lb}} \sum_{\mathbf{u}_n : R_l(\mathbf{u}_n) = r} P(\mathbf{U}_n = \mathbf{u}_n | H_a) \\ \approx \sum_{r=r_{ls}}^{r_{1\alpha}} \sum_{\mathbf{u}_n : R_l(\mathbf{u}_n) = r} \frac{1}{N} \sum_{i=1}^{N} P(\mathbf{U}_n | H_a; \mathbf{X}_i) + \sum_{r=r_{2\alpha}}^{r_{lb}} \sum_{\mathbf{u}_n : R_l(\mathbf{u}_n) = r} \frac{1}{N} \sum_{i=1}^{N} P(\mathbf{U}_n | H_a; \mathbf{X}_i) \\ = \frac{1}{N} \left( \sum_{r=r_{ls}}^{r_{1\alpha}} \sum_{i=1}^{N} \sum_{\mathbf{u}_n : R_l(\mathbf{u}_n) = r} P(\mathbf{U}_n | H_a; \mathbf{X}_i) + \sum_{r=r_{2\alpha}}^{r_{lb}} \sum_{i=1}^{N} \sum_{\mathbf{u}_n : R_l(\mathbf{u}_n) = r} P(\mathbf{U}_n | H_a; \mathbf{X}_i) \right) \\ = \frac{1}{N} \sum_{i=1}^{N} \left( \sum_{r=r_{ls}}^{r_{1\alpha}} P(R_l(\mathbf{U}_n) = r | H_a; \mathbf{X}_i) + \sum_{r=r_{2\alpha}}^{r_{lb}} P(R_l(\mathbf{U}_n) = r | H_a; \mathbf{X}_i) \right), \end{array}$

where

$P(R_l(\mathbf{U}_n) \le r_{1\alpha} | H_o) + P(R_l(\mathbf{U}_n) \ge r_{2\alpha} | H_o) \le \alpha.$

Note that the alternative hypothesis is subject to the purpose of the test; the procedure needs only slight modification if a one-sided test is adopted.
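Under H_o, the critical values r_{1α} and r_{2α} come from the exact discrete null law of Theorem 2, so the achieved size is at most α. A sketch that enumerates the null distribution and extracts the two-sided critical values (function name ours):

```python
# Two-sided critical values r_{1a}, r_{2a} for R_l from the exact null law:
# every choice of y-positions in the combined ranking is equally likely.
from itertools import combinations

def critical_values(m, n, alpha=0.05):
    counts = {}
    for pos in combinations(range(1, m + n + 1), n):
        r = sum(pos)
        counts[r] = counts.get(r, 0) + 1
    total = sum(counts.values())
    ranks = sorted(counts)
    # Largest r1 with P(R_l <= r1) <= alpha/2.
    acc, r1 = 0.0, None
    for r in ranks:
        if acc + counts[r] / total > alpha / 2:
            break
        acc += counts[r] / total
        r1 = r
    # Smallest r2 with P(R_l >= r2) <= alpha/2.
    acc, r2 = 0.0, None
    for r in reversed(ranks):
        if acc + counts[r] / total > alpha / 2:
            break
        acc += counts[r] / total
        r2 = r
    return r1, r2

r1, r2 = critical_values(m=7, n=7, alpha=0.05)
# By the symmetry of the null law about n(m+n+1)/2, r1 + r2 = r_ls + r_lb.
```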

### 2.2 Distributions of the rank statistic in the scale case

In the previous section, we studied the distribution and the power function of the rank statistic R l considering a shift in location. We now address the distribution and the power function of the rank statistic considering the scale parameter. For this purpose, we consider F(x)=G(x σ−1) and state the null and alternative hypotheses as

$H_o: \sigma = 1 \quad \text{versus} \quad H_a: \sigma \ne 1.$

To do so, we begin with the procedure for finding the distribution of the rank statistic considering the scale parameter, denoted R s , through the random vector U n . The array of ranks is given by

$(m+n)/2, \dots, 3, 2, 1, \qquad 1, 2, 3, \dots, (m+n)/2;$

if m+n is even, and

$(m+n-1)/2, \dots, 3, 2, 1, \qquad 0, \qquad 1, 2, 3, \dots, (m+n-1)/2$

if m+n is odd. We first introduce how to determine the rank-sum of the y observations in the combined samples, R s , with respect to

$\Omega_n = \left\{ \mathbf{u}_n = (u_1(n), \dots, u_{m+1}(n)) : \sum_{i=1}^{m+1} u_i(n) = n \right\},$

where u i (n) is the number of y observations belonging to [ x[i−1],x[i]). Let med(x,y) be the median of the combined x's and y's and suppose it belongs to [ x[i],x[i+1]); it then breaks U n into two parts $\mathbf{U}_n^-$ and $\mathbf{U}_n^+$. If m+n is odd and med(x,y)=x[i], then

$\mathbf{U}_n^- = \left( u_1^- = u_i(n),\; u_2^- = u_{i-1}(n),\; \cdots,\; u_i^- = u_1(n) \right)$

is a 1×i vector and

$\mathbf{U}_n^+ = \left( u_1^+ = u_{i+1}(n),\; u_2^+ = u_{i+2}(n),\; \cdots,\; u_{m+1-i}^+ = u_{m+1}(n) \right)$

is a 1×(m+1−i) vector. The second possible case is that m+n is odd and $\mathit{med}(x,y) = y_{\left[\sum_{k=1}^{i} u_k(n) + j\right]}$; then $\mathbf{U}_n^-$, a row vector of length i+1, has the form

$\left( u_1^- = j-1,\; u_2^- = u_i(n),\; \cdots,\; u_{i+1}^- = u_1(n) \right)$

and ${\mathbit{U}}_{n}^{+}$, a row vector with length m+1−i, is given by

$\left( u_1^+ = u_{i+1}(n) - j,\; u_2^+ = u_{i+2}(n),\; \cdots,\; u_{m+1-i}^+ = u_{m+1}(n) \right).$

The third possible case is that m+n is even and x[i] is the smallest number larger than med(x,y); the vectors are then defined as

$\mathbf{U}_n^- = \left( u_1^- = u_i(n),\; u_2^- = u_{i-1}(n),\; \cdots,\; u_i^- = u_1(n) \right)$

and

$\mathbf{U}_n^+ = \left( u_1^+ = 0,\; u_2^+ = u_{i+1}(n),\; \cdots,\; u_{m+2-i}^+ = u_{m+1}(n) \right).$

The last possibility is that m+n is even and $y_{\left[\sum_{k=1}^{i} u_k(n) + j\right]}$ is the smallest number larger than med(x,y). The vectors are now defined as

$\mathbf{U}_n^- = \left( u_1^- = j-1,\; u_2^- = u_i(n),\; \cdots,\; u_{i+1}^- = u_1(n) \right)$

and

$\mathbf{U}_n^+ = \left( u_1^+ = u_{i+1}(n) - j + 1,\; u_2^+ = u_{i+2}(n),\; \cdots,\; u_{m+1-i}^+ = u_{m+1}(n) \right).$

Let $n^-$ be the length of the vector $\mathbf{U}_n^-$ and $n^+$ be the length of the vector $\mathbf{U}_n^+$.
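Equivalently, R s can be computed by laying out the score array above over the combined ordered sample and summing the scores at the y positions, which avoids tracking the $\mathbf{U}_n^{\pm}$ split explicitly; a sketch with our own helper names:

```python
# Scale ranks: score from both ends toward the middle, then sum the
# scores of the y observations (R_s computed straight from the array).
def scale_ranks(total):
    """(..., 3, 2, 1, 1, 2, 3, ...) for even total,
    (..., 2, 1, 0, 1, 2, ...) for odd total."""
    half = total // 2
    if total % 2 == 0:
        return list(range(half, 0, -1)) + list(range(1, half + 1))
    return list(range(half, 0, -1)) + [0] + list(range(1, half + 1))

def R_s(x, y):
    z = sorted([(v, 'x') for v in x] + [(v, 'y') for v in y])
    scores = scale_ranks(len(z))
    return sum(s for s, (_, lab) in zip(scores, z) if lab == 'y')

# y's concentrated in the middle give a small R_s; y's in the tails a
# large one; the extremes agree with the values r_ss and r_sb below.
assert R_s([-9, -8, 8, 9], [-1, 1]) < R_s([-2, -1, 1, 2], [-9, 9])
```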

#### Theorem 4

R s (U n |X) is finite Markov chain imbeddable, and

$P(R_s(\mathbf{U}_n) = r | \mathbf{X}) = \boldsymbol{\xi} \left( \prod_{t=1}^{n} \mathbf{M}_t \right) \mathbf{B}'(C_r),$

where $\mathbf{B}(C_r) = \sum_{k: R_s(\mathbf{u}_n) = r} e_k$, e k is a $1 \times \binom{m+n}{n}$ unit row vector corresponding to state u n , ξ is the initial distribution, which puts probability one on the single state of Ω0, and M t , t=1,…,n, are the transition probability matrices of the imbedded Markov chain defined on the state spaces Ω t .

#### Proof

For each U n in the state space Ω n , we have a corresponding

$\begin{array}{lcl} R_s(\mathbf{U}_n | \mathbf{X}) & = & R_s(\mathbf{U}_n^- | \mathbf{X}) + R_s(\mathbf{U}_n^+ | \mathbf{X}) \\ & = & \frac{\sum_{k=1}^{n^-} (u_k^-)^2 + \sum_{k=1}^{n^-} u_k^-}{2} + \sum_{k=1}^{n^- - 1} (u_k^- + 1) \left( \sum_{j=k+1}^{n^-} u_j^- \right) \\ & & +\, \frac{\sum_{k=1}^{n^+} (u_k^+)^2 + \sum_{k=1}^{n^+} u_k^+}{2} + \sum_{k=1}^{n^+ - 1} (u_k^+ + 1) \left( \sum_{j=k+1}^{n^+} u_j^+ \right). \end{array}$
(11)

The smallest possible value of R s (U n ) is

$r_{ss} = \begin{cases} \frac{n(n+2)}{4} & \text{if } m+n \text{ is even and } n \text{ is even} \\ \frac{(n+1)(n+3)}{4} & \text{if } m+n \text{ is even and } n \text{ is odd} \\ \frac{n^2}{4} & \text{if } m+n \text{ is odd and } n \text{ is even} \\ \frac{(n+1)(n-1)}{4} & \text{if } m+n \text{ is odd and } n \text{ is odd} \end{cases}$
(12)

and the largest possible value is

$r_{sb} = \begin{cases} \frac{n(2m+n+2)}{4} & \text{if } m+n \text{ is even and } n \text{ is even} \\ \frac{n(2m+n+2)-1}{4} & \text{if } m+n \text{ is even and } n \text{ is odd} \\ \frac{n(2m+n-1)}{4} & \text{if } m+n \text{ is odd and } n \text{ is even} \\ \frac{n(2m+n)-1}{4} & \text{if } m+n \text{ is odd and } n \text{ is odd} \end{cases}$
(13)

In accordance with Equation (11), we use the possible values of R s as the rule of the partition. The rest of the proof follows along the same lines as that of Theorem 3 and is omitted here.

Similarly, we apply the LLN to conclude that

$\frac{1}{N}\phantom{\rule{2.77626pt}{0ex}}\sum _{i=1}^{N}\phantom{\rule{2.77626pt}{0ex}}P\left({R}_{s}|\phantom{\rule{2.77626pt}{0ex}}{\mathbf{\text{X}}}_{i}\phantom{\rule{2.77626pt}{0ex}}\right)\phantom{\rule{2.77626pt}{0ex}}\phantom{\rule{2.77626pt}{0ex}}\stackrel{\mathit{\text{p}}}{\to }\phantom{\rule{2.77626pt}{0ex}}\phantom{\rule{2.77626pt}{0ex}}P\left({R}_{s}\right)$

which establishes the distribution of R s .
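The boundary values in Equations (12) and (13) are easy to mis-transcribe, so a direct computational check helps. The sketch below simply mirrors the piecewise formulas; the function names are our own choices:

```python
def r_ss(m, n):
    """Smallest possible value of R_s, transcribed from Equation (12)."""
    if (m + n) % 2 == 0:
        return n * (n + 2) // 4 if n % 2 == 0 else (n + 1) * (n + 3) // 4
    return n * n // 4 if n % 2 == 0 else (n + 1) * (n - 1) // 4

def r_sb(m, n):
    """Largest possible value of R_s, transcribed from Equation (13)."""
    if (m + n) % 2 == 0:
        return n * (2 * m + n + 2) // 4 if n % 2 == 0 else (n * (2 * m + n + 2) - 1) // 4
    return n * (2 * m + n - 1) // 4 if n % 2 == 0 else (n * (2 * m + n) - 1) // 4
```

For example, with m = 5 and n = 7 the support of R s runs from r_ss(5, 7) = 20 to r_sb(5, 7) = 33.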

Through FMCI we again obtain the distribution of R s under selected alternative distributions; the procedures are similar to those in the previous section. In addition, it is natural to approximate the power function by

$\begin{array}{l}\frac{1}{N}\phantom{\rule{2.77626pt}{0ex}}\sum _{i=1}^{N}\left(\sum _{s={r}_{\mathit{\text{ss}}}}^{{s}_{1\alpha }}P\left({R}_{s}\left({\mathbf{U}}_{n}\right)=s|\phantom{\rule{2.77626pt}{0ex}}{\mathbf{\text{X}}}_{i}\right)+\sum _{s={s}_{2\alpha }}^{{r}_{\mathit{\text{sb}}}}P\left({R}_{s}\left({\mathbf{U}}_{n}\right)=s|\phantom{\rule{2.77626pt}{0ex}}{\mathbf{\text{X}}}_{i}\right)\right),\end{array}$

where

$\begin{array}{l}P\left({R}_{s}\left({\mathbf{U}}_{n}\right)\le {s}_{1\alpha }|{H}_{o}\right)+P\left({R}_{s}\left({\mathbf{U}}_{n}\right)\ge {s}_{2\alpha }|{H}_{o}\right)\le \mathrm{\alpha .}\end{array}$
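The FMCI transition matrices that give the exact conditional probabilities are specific to the construction above and are not reproduced here; as a rough illustration of the same power calculation, the following sketch estimates the two-sided power by plain Monte Carlo, using an Ansari-Bradley-type score sum as a stand-in for R s (the paper's R s may be scored differently) and an exact enumerated null distribution for the cutoffs s_{1α} and s_{2α}:

```python
import random
from itertools import combinations
from math import comb

def ab_scores(N):
    # Ansari-Bradley-type scores: 1, 2, ... up the pooled sample and back down
    return [min(r, N + 1 - r) for r in range(1, N + 1)]

def r_s_stat(xs, ys):
    """Score sum over the y-sample's pooled ranks (a stand-in for R_s)."""
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    scores = ab_scores(len(pooled))
    return sum(scores[r] for r, (_, lab) in enumerate(pooled) if lab == 1)

def null_pmf(m, n):
    """Exact null pmf: each placement of the n y-ranks is equally likely under H_o."""
    N = m + n
    scores = ab_scores(N)
    counts = {}
    for pos in combinations(range(N), n):
        s = sum(scores[p] for p in pos)
        counts[s] = counts.get(s, 0) + 1
    total = comb(N, n)
    return {s: c / total for s, c in counts.items()}

def two_sided_cutoffs(pmf, alpha):
    """Largest s1 and smallest s2 whose tail masses are each at most alpha/2."""
    support = sorted(pmf)
    s1, acc = None, 0.0
    for s in support:
        if acc + pmf[s] > alpha / 2:
            break
        acc += pmf[s]
        s1 = s
    s2, acc = None, 0.0
    for s in reversed(support):
        if acc + pmf[s] > alpha / 2:
            break
        acc += pmf[s]
        s2 = s
    return s1, s2

def mc_power(m, n, theta, sigma, alpha=0.2, reps=2000, seed=1):
    """Monte Carlo power of the two-sided scale rank test against N(theta, sigma)."""
    rng = random.Random(seed)
    s1, s2 = two_sided_cutoffs(null_pmf(m, n), alpha)
    reject = 0
    for _ in range(reps):
        xs = [rng.gauss(0, 1) for _ in range(m)]
        ys = [rng.gauss(theta, sigma) for _ in range(n)]
        s = r_s_stat(xs, ys)
        if (s1 is not None and s <= s1) or (s2 is not None and s >= s2):
            reject += 1
    return reject / reps
```

The enumeration is feasible only for small m and n, which matches the small-sample focus of the paper.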

### 2.3 Joint distributions of the rank statistics in the shift and scale case

We have derived the marginal distributions of R l and R s in terms of U n , respectively, which yield the following theorem.

#### Theorem 5

(R l (U n |X),R s (U n |X)) is finite Markov chain imbeddable, and

$\begin{array}{l}P\left({R}_{l}\left({\mathbit{U}}_{n}\right)={r}_{1};{R}_{s}\left({\mathbit{U}}_{n}\right)={r}_{2}|\mathbit{X}\right)=\mathbit{\xi }\left(\prod _{t=1}^{n}{\mathbit{M}}_{t}\right){\mathbit{B}}^{\prime }\left({C}_{r}\right)\end{array}$

where $\mathbf{B}(C_r)=\sum_{k:\,R_l(\mathbf{U}_n)=r_1\ \&\ R_s(\mathbf{U}_n)=r_2} e_k$, $e_k$ is a $1\times\binom{m+n}{n}$ unit row vector corresponding to state $u_n$, $\xi$ $(=P(Z_0=1)=1)$ is the initial probability, and $M_t$, $t=1,\dots,n$, are the transition probability matrices of the imbedded Markov chain defined on the state space $\Omega_t$.

#### Proof

By Equations (4) and (11), each u n in the state space Ω n has corresponding values of R l and R s . The combinations of the values of R l and R s serve as the rule of the partition. The rest of the proof follows along the same lines as that of Theorem 3.

The joint distribution of the ranks considering both the location and scale parameters, which can be determined through our algorithm, has not yet been studied in the literature. Our result allows us to test the homogeneity of the distribution functions $F(x)=G((x-\theta)\sigma^{-1})$. We state the hypotheses as follows:

$H_o:\theta=0\ \text{and}\ \sigma=1\quad \text{versus}\quad H_a:\theta\ne 0\ \text{or}\ \sigma\ne 1.$
(14)

We are also able to identify a proper critical region under the null hypothesis and discuss its power when F ≠ G. For example, a rectangular critical region can be

$C_{\alpha}=\left\{R_l\le r_{1l}\ \text{or}\ R_l\ge r_{2l}\ \text{or}\ R_s\le r_{1s}\ \text{or}\ R_s\ge r_{2s}\right\}$

where r1l, r2l, r1s and r2s are the critical values such that

$\begin{array}{lcr}P\left({R}_{l}\le {r}_{1l}|{H}_{o}\right)& +& P\left({R}_{l}\ge {r}_{2l}|{H}_{o}\right)+P\left({r}_{1l}<{R}_{l}<{r}_{2l},{R}_{s}\le {r}_{1s}|{H}_{o}\right)\\ +& P\left({r}_{1l}<{R}_{l}<{r}_{2l},{R}_{s}\ge {r}_{2s}|{H}_{o}\right)\le \alpha \hfill \end{array}$

or an elliptic critical region

$\begin{array}{l}{C}_{\alpha }^{\prime }=\left\{\frac{{R}_{l}^{2}}{a}+\frac{{R}_{s}^{2}}{b}>C\right\}\end{array}$

for some positive constants a and b such that

$\begin{array}{l}P\left(\frac{{R}_{l}^{2}}{a}+\frac{{R}_{s}^{2}}{b}>C|{H}_{o}\right)\le \mathrm{\alpha .}\end{array}$

According to the rejection regions defined above, the power of the test can be found as

$\begin{array}{lcr}P\left({R}_{l}\le {r}_{1l}|{H}_{a}\right)& +& P\left({R}_{l}\ge {r}_{2l}|{H}_{a}\right)+P\left({r}_{1l}<{R}_{l}<{r}_{2l},{R}_{s}\le {r}_{1s}|{H}_{a}\right)\\ +& P\left({r}_{1l}<{R}_{l}<{r}_{2l},{R}_{s}\ge {r}_{2s}|{H}_{a}\right)\hfill \end{array}$
(15)

or

$\begin{array}{l}P\left(\frac{{R}_{l}^{2}}{a}+\frac{{R}_{s}^{2}}{b}>C|{H}_{a}\right).\end{array}$
(16)

Note that unless we have a conjecture about the values of θ and σ, we tend to use a two-sided test. However, with knowledge of the center and shape of the distribution of interest, a sectorial critical region is a better choice; an example is demonstrated in the numerical studies.
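Given any tabulated joint pmf of (R l , R s ), both kinds of critical regions reduce to summing probability mass over an indicator. The following is a minimal sketch; the centering constants rl0 and rs0 in the elliptic region are an illustrative generalization, not part of the definition above (the defaults of 0 reproduce it):

```python
def region_prob(joint_pmf, in_region):
    """Probability mass a joint pmf {(rl, rs): p} assigns to a critical region."""
    return sum(p for (rl, rs), p in joint_pmf.items() if in_region(rl, rs))

def rectangular(r1l, r2l, r1s, r2s):
    """Reject when R_l falls in either tail, or R_s falls in either tail."""
    return lambda rl, rs: rl <= r1l or rl >= r2l or rs <= r1s or rs >= r2s

def elliptic(a, b, C, rl0=0.0, rs0=0.0):
    """Reject outside an ellipse; rl0, rs0 are optional illustrative centerings."""
    return lambda rl, rs: (rl - rl0) ** 2 / a + (rs - rs0) ** 2 / b > C
```

Calibrating to level α then amounts to adjusting the cutoffs (or the constant C) until `region_prob`, evaluated at the null pmf, is at most α; evaluating the same region at an alternative pmf gives the power in (15) or (16).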

## 3 Numerical results and discussion

### 3.1 A joint distribution of R l and R s

Let $\{X_1,\dots,X_5\}\sim N(0,1)$ and $\{Y_1,\dots,Y_7\}\sim N(\theta,\sigma)$. Figure 1 gives the joint distribution of the random variables R l and R s under the null hypothesis of θ=0 and σ=1. The marginal distributions of R l and R s can easily be established from their joint distribution. Figure 1 also shows that the two random variables R l and R s are dependent. We construct two critical regions, shown in Figure 2, according to their joint distribution. Outside the yellow area in Figure 2 is the selected rectangular critical region $C_{0.1738}$, and outside the red shadow is the elliptic one $C^{\prime}_{0.1738}$.
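For sample sizes this small, the joint null distribution can be tabulated by brute force over all $\binom{m+n}{n}$ placements of the Y ranks. The sketch below uses the Wilcoxon rank sum and an Ansari-Bradley-type score sum as assumed stand-ins for R l and R s ; it recovers the marginals from the joint table and confirms the dependence noted above:

```python
from itertools import combinations
from fractions import Fraction
from math import comb

def joint_null_pmf(m, n):
    """Exact joint null pmf of (rank sum, Ansari-Bradley-type score sum) of the Ys."""
    N = m + n
    scores = [min(r, N + 1 - r) for r in range(1, N + 1)]
    w = Fraction(1, comb(N, n))
    pmf = {}
    for pos in combinations(range(1, N + 1), n):   # the n pooled ranks of the Ys
        key = (sum(pos), sum(scores[p - 1] for p in pos))
        pmf[key] = pmf.get(key, Fraction(0)) + w
    return pmf

pmf = joint_null_pmf(5, 7)

# Marginals drop straight out of the joint table.
marg_l, marg_s = {}, {}
for (rl, rs), p in pmf.items():
    marg_l[rl] = marg_l.get(rl, Fraction(0)) + p
    marg_s[rs] = marg_s.get(rs, Fraction(0)) + p

# Dependence: the joint mass differs from the product of marginals somewhere.
dependent = any(pmf.get((rl, rs), Fraction(0)) != marg_l[rl] * marg_s[rs]
                for rl in marg_l for rs in marg_s)
```

Exact `Fraction` arithmetic keeps the tabulated probabilities free of rounding error.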

### 3.2 Powers for a joint test using R l and R s

The alternative of interest is stated in the preceding section (see Equation (14)). The power functions of the test statistics $R_l$ and $R_s$ for a sequence of normally distributed populations, with θ from −20 to 20 in increments of 0.5 and σ from 1 to 10 in increments of 1 together with the reciprocals of σ, under two types of critical regions are provided in Figures 3 and 4. We adopt a two-sided test because of the selected values of the parameters; if a one-sided test is adopted instead, the critical region in the previous step should be modified slightly before calculating the powers. As Figures 3 and 4 show, both critical regions perform roughly equally well. Figure 5 presents the performance of the two critical regions under various parameter settings. Figures 5(a) and (b) show that, given a standard deviation of 1 or a mean of 0, the powers of the two critical regions, rectangular and elliptic, are high and similar. However, when the variation of the alternative population decreases (σ=1/10) or increases (σ=10), the elliptic critical region performs better than the rectangular one, as shown in Figures 5(c) and (d). Therefore, we suggest using an elliptic rejection region when conducting a test for the equivalence of two distributions.

Next, we consider the problem of determining an optimum rank test. To conduct a test of distribution equivalence, we can use either $R_l$ or $R_s$ as the test statistic. As mentioned earlier, the marginal distribution of $R_l$ or $R_s$ can easily be established from their joint distribution. Figures 6 and 7 provide the power functions for the test statistics $R_l$ and $R_s$ at the 17.38% level of significance, respectively. Figure 7 shows that the rank test for the scale parameter is badly affected by the center of the alternative population, an effect observed earlier by Ansari and Bradley (1960). Comparing Figures 6 and 7 with Figure 4, the joint test appears much more reliable than either $R_l$ or $R_s$ alone for distribution equivalence tests. A joint test for distribution equivalence is likely the better option under most circumstances.

### 3.3 Lehmann alternatives

Consider the one-sided alternative F(x;θ,σ) > G(x;θ,σ). Lehmann (1953) proposed a test of $H_o: F(x;\theta,\sigma) = G(x;\theta,\sigma)$ against $H_a: F(x;\theta,\sigma)^k = G(x;\theta,\sigma)$, which is known as the family of Lehmann alternatives. Note that $F(x;\theta,\sigma)^k$ is the cumulative distribution function of $\max_{1\le i\le k} X_i$ when $X_i \sim F$ and, under the alternative hypothesis, G(x;θ,σ) is stochastically larger than F(x;θ,σ). First, we know

$\begin{array}{rcl}E_k(X)&=&\displaystyle\int_{-\infty}^{0} -G(x)\,dx+\int_{0}^{\infty}\left(1-G(x)\right)dx\\[4pt]&>&\displaystyle\int_{-\infty}^{0} -F(x)\,dx+\int_{0}^{\infty}\left(1-F(x)\right)dx\;=\;E(X).\end{array}$
(17)

Therefore, the larger $R_l$ is, the stronger the evidence against the null hypothesis. As for the variation of the distribution itself, the mass of the density function is compressed toward larger values; therefore, in most cases we have $\mathit{Var}(X_k) < \mathit{Var}(X)$. We then propose to reject the null hypothesis when $R_s$ is large. For example, given $F \sim U(0,1)$ and $G = F^k$, it is easy to see

$\begin{array}{l}\frac{{E}_{k+1}\left(X\right)}{{E}_{k}\left(X\right)}=\frac{{\left(k+1\right)}^{2}}{k\left(k+2\right)}>1\end{array}$
(18)

and

$\begin{array}{l}\frac{{\mathit{\text{Var}}}_{k+1}\left(X\right)}{{\mathit{\text{Var}}}_{k}\left(X\right)}=\frac{{\left(k+1\right)}^{3}}{k\left(k+2\right)\left(k+3\right)}<1\end{array}$
(19)

for all k. We first find the marginal and joint distributions of the ranks $R_l$ and $R_s$ in order to define critical regions for $R_l$ and $R_s$ individually and simultaneously. Given the properties of the mean and variance of the alternative distribution shown in Equations (17), (18) and (19), we define the critical regions with care. Table 1 provides powers for the tests when the hypothesized distribution is the uniform, standard normal, Student's t with 3 degrees of freedom, or exponential distribution, for several settings of the sample sizes m and n and for k = 2, 3 and 6. Clearly, a joint test considering both $R_l$ and $R_s$ for the equality of distributions outperforms tests considering only one of the rank statistics.
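For $F \sim U(0,1)$, the Lehmann alternative $G = F^k$ is the Beta(k,1) distribution, whose mean $k/(k+1)$ and variance $k/((k+1)^2(k+2))$ are standard facts; the ratios in Equations (18) and (19) can therefore be verified exactly:

```python
from fractions import Fraction

def mean_k(k):
    """Mean of max of k iid U(0,1) draws, i.e. under G = F^k: k/(k+1)."""
    return Fraction(k, k + 1)

def var_k(k):
    """Variance of max of k iid U(0,1) draws: k / ((k+1)^2 (k+2))."""
    return Fraction(k, (k + 1) ** 2 * (k + 2))

for k in range(1, 10):
    # Equation (18): the mean ratio, always greater than 1
    assert mean_k(k + 1) / mean_k(k) == Fraction((k + 1) ** 2, k * (k + 2)) > 1
    # Equation (19): the variance ratio, always less than 1
    assert var_k(k + 1) / var_k(k) == Fraction((k + 1) ** 3, k * (k + 2) * (k + 3)) < 1
```

Exact rational arithmetic makes the check term-by-term rather than approximate.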

## 4 Conclusion

Our proposed algorithm provides a solution for finding the power of distribution equivalence tests considering the shift and scale parameters, respectively and simultaneously. Numerical studies show that a joint test should be adopted for testing the homogeneity of distributions as well as under Lehmann alternatives. Also, an elliptic critical region is a better choice than a rectangular one for a joint test. In practice, it is reasonable to assume neither normality nor equal mean/variance of the distributions of interest. However, our algorithm depends heavily on computational resources, as the number of possible states in Ω n grows rapidly when the sample sizes increase. Therefore, so far we can only target small sample sizes in our work.

## References

• Ansari AR, Bradley RA: Rank-Sum Tests for Dispersions. Ann. Math. Stat 1960, 31: 1174–1189. 10.1214/aoms/1177705688

• Collings BJ, Hamilton MA: Estimating the power of the two-sample Wilcoxon Test for location shift. Biometrics 1988, 44: 847–860. 10.2307/2531596

• Klotz J: Nonparametric test for scale. Ann. Math. Stat 1962, 33: 498–512. 10.1214/aoms/1177704576

• Lehmann EL: The power for rank tests. Ann. Math. Stat 1953, 24: 23–43. 10.1214/aoms/1177729080

• Lehmann EL: Nonparametrics: Statistical Methods Based on Ranks. Prentice-Hall, New Jersey; 1998.

• Mann HB, Whitney DR: On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat 1947, 18: 50–60. 10.1214/aoms/1177730491

• Mood AM: On the asymptotic efficiency of certain nonparametric two-sample tests. Ann. Math. Stat 1954, 25: 514–522. 10.1214/aoms/1177728719

• Rosner B, Glynn RJ: Power and sample size estimation for the Wilcoxon rank sum test with application to comparisons of C statistics from alternative prediction models. Biometrics 2009, 65: 188–197. 10.1111/j.1541-0420.2008.01062.x

• Shieh G, Jan SL, Randles RH: On power and sample size determinations for the Wilcoxon-Mann-Whitney test. Nonparametric Stat 2006, 18: 33–43. 10.1080/10485250500473099

• Siegel S, Tukey JW: A nonparametric sum of ranks procedure for relative spread in unpaired samples. J. Am. Stat. Assoc 1960, 55: 429–445. 10.1080/01621459.1960.10482073

• Wilcoxon F: Individual comparisons by ranking methods. Biometrics 1945, 1: 80–83. 10.2307/3001968

## Acknowledgments

The author would like to thank James C. Fu and anonymous referee whose comments led to significant improvements of this manuscript.

## Author information


### Corresponding author

Correspondence to Wan-Chen Lee.

### Competing interests

The author declares that she has no competing interests.


Lee, WC. Joint distribution of rank statistics considering the location and scale parameters and its power study. J Stat Distrib App 1, 6 (2014). https://doi.org/10.1186/2195-5832-1-6