The main complication in proving the result is that \(W_R = R_1 + \dots + R_m\) is a sum of dependent rather than independent random variables. Our strategy will be to
- Introduce an approximation \(\tilde W_R\) to \(W_R\) which is a sum of independent random variables and for which we can show \[
\frac{\tilde W_R - \mathbb{E}\tilde W_R}{\sqrt{ \mathbb{V}\tilde W_R}} \overset{\text{d}}{\longrightarrow}\mathcal{N}(0,1)
\tag{30.2}\] as \(n,m \to \infty\).
- Show that the approximation of \(\tilde W_R\) to \(W_R\) is close enough for the convergence in Equation 30.2 to imply the convergence in Equation 30.1.
Here we go. Under the null hypothesis \(H_0\): \(F = G\), the ranks \(R_1,\dots,R_m\) are simply \(m\) integers sampled without replacement from \(1,\dots,N\). One way to draw a sample of size \(m\) without replacement from the integers \(1,\dots,N\) is to (i) generate \(N\) independent realizations \(U_1,\dots,U_N\) from the \(\mathcal{U}(0,1)\) distribution, (ii) denote their order statistics by \(U_{(1)} <\dots <U_{(N)}\), and (iii) keep the integers in the set \(\{i: U_i \leq U_{(m)}\}\). Under this sampling scheme we may write \[
W_R = \sum_{i=1}^N i J_i,
\] where, for each \(i=1,\dots,N\), \(J_i = \mathbf{1}(U_i \leq U_{(m)})\) is an indicator equal to \(1\) if the integer \(i\) was drawn and to \(0\) otherwise. Note that \(J_1,\dots,J_N\) are dependent random variables.
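To make the construction concrete, here is a minimal simulation sketch of this sampling scheme (assuming NumPy; the helper name `rank_sum_via_uniforms` is illustrative, not from the text). It draws the \(U_i\), forms the indicators \(J_i\) together with the independent indicators \(K_i\) defined in the next paragraph, and checks the resulting \(W_R = \sum_i i J_i\) against the standard null moments \(m(N+1)/2\) and \(mn(N+1)/12\) (with \(n = N-m\)), which are quoted here rather than derived.

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_sum_via_uniforms(m, N, rng):
    """Simulate W_R = sum_i i * J_i via the uniform-order-statistic scheme."""
    U = rng.uniform(size=N)                     # U_1, ..., U_N
    threshold = np.sort(U)[m - 1]               # the order statistic U_(m)
    J = (U <= threshold).astype(int)            # J_i = 1(U_i <= U_(m)); exactly m ones
    K = (U <= m / N).astype(int)                # K_i = 1(U_i <= m/N); independent indicators
    i = np.arange(1, N + 1)
    W = np.sum(i * J)                           # W_R
    W_tilde = np.sum((i - (N + 1) / 2) * K) + m * (N + 1) / 2   # the approximation below
    return W, W_tilde

m, N = 20, 50
draws = np.array([rank_sum_via_uniforms(m, N, rng) for _ in range(5000)])
W, W_tilde = draws[:, 0], draws[:, 1]

print(W.mean(), m * (N + 1) / 2)             # empirical mean vs. E W_R
print(W.var(), m * (N - m) * (N + 1) / 12)   # empirical variance vs. V W_R = mn(N+1)/12
print(np.corrcoef(W, W_tilde)[0, 1])         # J_i and K_i share the U_i, so this is high
```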
Now, to construct an approximation \(\tilde W_R\) to \(W_R\), we consider replacing the dependent random variables \(J_i\) with independent analogues \(K_i\) defined as \(K_i = \mathbf{1}(U_i \leq m/N)\) for \(i=1,\dots,N\), where the \(U_1,\dots,U_N\) are the same \(\mathcal{U}(0,1)\) realizations with which the \(J_i\) were defined. Then we set \[
\tilde W_R = \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)K_i + \frac{1}{2}m(N+1).
\] Now, noting that \(\mathbb{E}K_i = m/N\) and \(\mathbb{V}K_i = \frac{m}{N}\big(1 - \frac{m}{N}\big)\), we have \[\begin{align}
\mathbb{E}\tilde W_R &=\sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)\frac{m}{N} + \frac{1}{2}m(N+1) \\
&= \frac{1}{2}m(N+1)
\end{align}\] and \[\begin{align}
\mathbb{V}\tilde W_R &= \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)^2 \frac{m}{N}\Big(1 - \frac{m}{N}\Big) =\frac{N-1}{N}\mathbb{V}W_R,
\end{align}\] where the second equality will be useful later and requires some work to show (a sketch is given below, after the verification of Equation 30.3). Now we may write \[\begin{align}
\frac{\tilde W_R - \mathbb{E}\tilde W_R}{\sqrt{\mathbb{V}\tilde W_R}} = \frac{\sum_{i=1}^N a_i \xi_i}{\sqrt{\sum_{i=1}^N a_i^2}},
\end{align}\] where \[
a_i = i - \frac{N+1}{2} \quad \text{ and } \quad \xi_i = \frac{K_i - m/N}{\sqrt{\frac{m}{N}\big(1-\frac{m}{N}\big)}}
\] for \(i=1,\dots,N\). Note that \(\xi_1,\dots,\xi_N\) are independent random variables with zero mean and unit variance, so according to the corollary to the Lindeberg central limit theorem stated in Corollary 32.1, the convergence in Equation 30.2 holds provided \[
\frac{\max_{1\leq i \leq N} |a_i|}{\sqrt{\sum_{i=1}^Na_i^2}} \to 0
\tag{30.3}\] as \(n,m \to \infty\). We have \[
\max_{1 \leq i \leq N} |a_i| = \Big( \Big|1 - \frac{N+1}{2}\Big| \vee \Big|N - \frac{N+1}{2}\Big|\Big) = \frac{N-1}{2}
\] and we can (with some work) show \[
\sum_{i=1}^N a_i^2 = \frac{N(N^2 - 1)}{12}.
\] so that \[
\frac{\max_{1\leq i \leq N} |a_i|}{\sqrt{\sum_{i=1}^N a_i^2}} = \frac{(N-1)/2}{\sqrt{N(N^2-1)/12}} = \sqrt{\frac{3(N-1)}{N(N+1)}} \longrightarrow 0
\] as \(n,m \to \infty\). Thus Equation 30.3 holds and the convergence in Equation 30.2 is established.
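For completeness, here is a sketch of the two computations deferred above (the "requires some work" and "with some work" claims), using only the elementary sums \(\sum_{i=1}^N i = N(N+1)/2\) and \(\sum_{i=1}^N i^2 = N(N+1)(2N+1)/6\) together with the standard null variance \(\mathbb{V}W_R = mn(N+1)/12\), \(n = N - m\), which is quoted rather than derived here: \[\begin{align}
\sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)^2 &= \sum_{i=1}^N i^2 - (N+1)\sum_{i=1}^N i + \frac{N(N+1)^2}{4}\\
&= \frac{N(N+1)(2N+1)}{6} - \frac{N(N+1)^2}{2} + \frac{N(N+1)^2}{4} = \frac{N(N^2-1)}{12},
\end{align}\] so that \[
\mathbb{V}\tilde W_R = \frac{N(N^2-1)}{12}\cdot\frac{m}{N}\Big(1 - \frac{m}{N}\Big) = \frac{mn(N+1)}{12}\cdot\frac{N-1}{N} = \frac{N-1}{N}\mathbb{V}W_R.
\]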
Now we consider the quality of the approximation of \(\tilde W_R\) to \(W_R\). We may write \[
\frac{W_R - \mathbb{E}W_R}{\sqrt{\mathbb{V}W_R}} = \Big(\frac{\tilde W_R - \mathbb{E}\tilde W_R}{\sqrt{\mathbb{V}\tilde W_R}} + \frac{W_R - \tilde W_R}{\sqrt{\mathbb{V}\tilde W_R}} \Big)\sqrt{\frac{\mathbb{V}\tilde W_R}{\mathbb{V}W_R}},
\] since \(\mathbb{E}\tilde W_R = \mathbb{E}W_R\). Now, since \(\mathbb{V}\tilde W_R = N^{-1}(N-1)\mathbb{V}W_R\), a sufficient condition for the convergence in Equation 30.2 to imply the convergence in Equation 30.1 is the condition \[
\frac{\mathbb{E}(W_R - \tilde W_R)^2}{\mathbb{V}\tilde W_R} \to 0
\tag{30.4}\] as \(n,m \to \infty\).
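To spell out why Equation 30.4 is sufficient, here is a short aside using only Chebyshev's inequality (recall \(\mathbb{E}W_R = \mathbb{E}\tilde W_R\)) and Slutsky's theorem: \[
\Pr\Big(\frac{|W_R - \tilde W_R|}{\sqrt{\mathbb{V}\tilde W_R}} > \varepsilon\Big) \leq \frac{\mathbb{E}(W_R - \tilde W_R)^2}{\varepsilon^2\,\mathbb{V}\tilde W_R} \to 0 \quad \text{for every } \varepsilon > 0,
\] so the second summand in the decomposition above converges to zero in probability, while \(\sqrt{\mathbb{V}\tilde W_R/\mathbb{V}W_R} = \sqrt{(N-1)/N} \to 1\); Slutsky's theorem then transfers the convergence in Equation 30.2 to the convergence in Equation 30.1.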
First, it will be useful to write \[\begin{align}
W_R - \tilde W_R &= \sum_{i=1}^N i J_i - \sum_{i=1}^N\Big(i - \frac{N+1}{2}\Big)K_i - \frac{m(N+1)}{2} \\
&= \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)(J_i - K_i),
\end{align}\] where we have used the fact that \(\sum_{i=1}^N J_i = m\).
Now, since \(\mathbb{E}W_R = \mathbb{E}\tilde W_R\) we have \(\mathbb{E}(W_R - \tilde W_R)^2 = \mathbb{V}(W_R - \tilde W_R)\), so \[
\mathbb{E}(W_R - \tilde W_R)^2 = \mathbb{V}(\mathbb{E}[W_R - \tilde W_R | U_{(1)},\dots,U_{(N)}]) + \mathbb{E}(\mathbb{V}[W_R - \tilde W_R | U_{(1)},\dots,U_{(N)}]),
\tag{30.5}\] where \[\begin{align}
\mathbb{E}[W_R - \tilde W_R | U_{(1)},\dots,U_{(N)}] &= \mathbb{E}\Big[\sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)(J_i - K_i)| U_{(1)},\dots,U_{(N)}\Big]\\
&= \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)\mathbb{E}[(J_1 - K_1)| U_{(1)},\dots,U_{(N)}]\\
&= 0,
\end{align}\] (the second equality holds because, by exchangeability, \(\mathbb{E}[(J_i - K_i)| U_{(1)},\dots,U_{(N)}]\) does not depend on \(i\), and the third because \(\sum_{i=1}^N \big(i - \frac{N+1}{2}\big) = 0\)) and \[
\mathbb{V}[W_R - \tilde W_R | U_{(1)},\dots,U_{(N)}] = \mathbb{V}\Big[ \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)(J_i - K_i) | U_{(1)},\dots,U_{(N)}\Big].
\] To find the above variance, it helps to note that after conditioning on \(U_{(1)},\dots,U_{(N)}\) the values \(U_1,\dots,U_N\) on which \(K_1,\dots,K_N\) and \(J_1,\dots,J_N\) are based are a random permutation of \(U_{(1)},\dots,U_{(N)}\). We can therefore use Lemma 32.2 to write \[\begin{align}
\mathbb{V}[W_R - \tilde W_R | U_{(1)},\dots,U_{(N)}] &= \frac{1}{N-1} \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)^2\sum_{i=1}^N((J_i - K_i) - (\bar J - \bar K))^2 \\
&\leq \frac{1}{N-1} \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)^2\sum_{i=1}^N(J_i - K_i)^2,
\end{align}\] where \(\bar J = N^{-1}\sum_{i=1}^N J_i\) and \(\bar K =N^{-1}\sum_{i=1}^N K_i\), since the mean minimizes the least-squares criterion. Plugging the above result into Equation 30.5, we have \[
\mathbb{E}(W_R - \tilde W_R)^2 \leq \frac{1}{N-1} \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)^2 \mathbb{E}\sum_{i=1}^N (J_i - K_i)^2.
\] In order to take the expectation, let \(K_{(1)},\dots,K_{(N)}\) and \(J_{(1)},\dots,J_{(N)}\) denote \(K_1,\dots,K_N\) and \(J_1,\dots,J_N\) when re-ordered according to the order of \(U_1,\dots,U_N\), so that \(K_{(i)} = \mathbf{1}(U_{(i)}\leq m/N)\) and \(J_{(i)} = \mathbf{1}(U_{(i)}\leq U_{(m)})\). Now, letting \(D = \sum_{i=1}^N\mathbf{1}(U_i \leq m/N)\), we may write \[
K_{(i)} = \left\{\begin{array}{ll}
1,& i = 1,\dots,D\\
0,& i = D+1,\dots,N
\end{array}\right. \quad \text{ and } \quad J_{(i)} = \left\{\begin{array}{ll}
1,& i = 1,\dots,m\\
0,& i = m+1,\dots,N,
\end{array}\right.
\] so if \(D < m\) we have \[
\sum_{i=1}^N(J_{(i)} - K_{(i)})^2 =\sum_{i=1}^D(J_{(i)} - K_{(i)})^2+\sum_{i=D+1}^m(J_{(i)} - K_{(i)})^2+\sum_{i=m+1}^N(J_{(i)} - K_{(i)})^2 = m - D
\] and if \(D > m\) we have \[
\sum_{i=1}^N(J_{(i)} - K_{(i)})^2 =\sum_{i=1}^m(J_{(i)} - K_{(i)})^2+\sum_{i=m+1}^D(J_{(i)} - K_{(i)})^2+\sum_{i=D+1}^N(J_{(i)} - K_{(i)})^2 = D - m,
\] and if \(D = m\) we have \(\sum_{i=1}^N(J_{(i)} - K_{(i)})^2 = 0\). Putting these cases together and using the fact \(\sum_{i=1}^N (J_i - K_i)^2 = \sum_{i=1}^N (J_{(i)} - K_{(i)})^2\) gives \[
\sum_{i=1}^N (J_i - K_i)^2 = |D - m|,
\] where \(D \sim \mathcal{B}(N,m/N)\), so, using Jensen’s inequality, we have \[
\mathbb{E}\sum_{i=1}^N (J_i - K_i)^2 = \mathbb{E}|D - m| \leq \sqrt{\mathbb{E}(D - m)^2}=\sqrt{N\frac{m}{N}\Big(1-\frac{m}{N}\Big)}.
\] From here we write \[\begin{align}
\frac{\mathbb{E}(W_R - \tilde W_R)^2}{\mathbb{V}\tilde W_R} &\leq \frac{(N-1)^{-1} \sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)^2 \sqrt{N\frac{m}{N}\big(1-\frac{m}{N}\big)}}{\sum_{i=1}^N \Big(i - \frac{N+1}{2}\Big)^2 \frac{m}{N}\Big(1 - \frac{m}{N}\Big) }\\
&= \frac{N}{N-1}\sqrt{\frac{N}{m(N-m)}}\\
&\leq \left\{\begin{array}{ll}
N(N-1)^{-1}\sqrt{2/m},& m \leq N/2\\
N(N-1)^{-1}\sqrt{2/n},& m > N/2
\end{array}\right.\\
&\to 0
\end{align}\] as \(n,m\to \infty\), where the final inequality uses \(N - m \geq N/2\) when \(m \leq N/2\) and \(m \geq N/2\) when \(m > N/2\), with \(n = N - m\). This completes the proof.
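As an informal numerical illustration of the result just proved (not part of the argument), the following sketch simulates the standardized rank sum under \(H_0\) and compares a few lower-tail probabilities with those of the standard normal distribution. NumPy is assumed, the sample sizes are arbitrary, and the null moments \(m(N+1)/2\) and \(mn(N+1)/12\) are again quoted rather than derived.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
m, n = 40, 60
N = m + n
reps = 20000

# Under H_0 the ranks of the first sample are m draws without replacement from 1..N,
# so W_R can be simulated directly as the sum of such a draw.
W = np.array([rng.choice(np.arange(1, N + 1), size=m, replace=False).sum()
              for _ in range(reps)])

mean_W = m * (N + 1) / 2           # E W_R
var_W = m * n * (N + 1) / 12       # V W_R (standard null formula)
Z = (W - mean_W) / np.sqrt(var_W)

# Compare empirical lower-tail probabilities with Phi(-z) for the standard normal.
for z in (1.0, 1.645, 1.96):
    emp = (Z <= -z).mean()
    ref = 0.5 * (1 + math.erf(-z / math.sqrt(2)))
    print(f"z = {z:.3f}: empirical {emp:.4f}, normal {ref:.4f}")
```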