32  Miscellaneous results

Here we collect some miscellaneous definitions and results, some of which are adapted from Athreya and Lahiri (2006).

Lemma 32.1 For a function \(f:\mathcal{X}\times\mathcal{Y}\to \mathbb{R}\) we have \[ \sup_{y \in \mathcal{Y}} \inf_{x \in \mathcal{X}}f(x,y) \leq \inf_{x \in \mathcal{X}} \sup_{y \in \mathcal{Y}} f(x,y). \]

Imagine a matrix of real numbers and consider proving the assertion that the maximum of the row minima cannot exceed the minimum of the column maxima. Take the minimum entry of any row; the column in which this entry lies has a maximum greater than or equal to it. Moreover, every other column contains, in that same row, an entry greater than or equal to this value (since the value is the minimum of its row), so the maxima of the other columns are also greater than or equal to it. Thus each row minimum is a lower bound for every column maximum, and hence the maximum of the row minima is a lower bound for the minimum of the column maxima.

Analogously, looking at \(\mathcal{Y}\) as “the rows” and \(\mathcal{X}\) as “the columns”, for any \(y^*\in \mathcal{Y}\), we have \[ \inf_{x \in \mathcal{X}} f(x,y^*) \leq \inf_{x \in \mathcal{X}} \sup_{y \in \mathcal{Y}}f(x,y). \] Now, for every \(\epsilon > 0\) there exists a \(y^*\) such that \[ \sup_{y \in \mathcal{Y}} \inf_{x \in \mathcal{X}} f(x,y) - \epsilon < \inf_{x \in \mathcal{X}} f(x,y^*), \] which gives \[ \sup_{y \in \mathcal{Y}} \inf_{x \in \mathcal{X}} f(x,y) < \inf_{x \in \mathcal{X}} \sup_{y \in \mathcal{Y}}f(x,y) + \epsilon \] for all \(\epsilon > 0\); letting \(\epsilon \downarrow 0\) proves the result.
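The matrix picture above is easy to test numerically. The following sketch (an illustration, not part of the lemma; the variable names are arbitrary) draws random matrices with NumPy and checks that the maximum of the row minima never exceeds the minimum of the column maxima:

```python
import numpy as np

rng = np.random.default_rng(0)

for _ in range(1000):
    # Rows index y (the sup variable), columns index x (the inf variable).
    f = rng.normal(size=(5, 7))
    max_of_row_minima = f.min(axis=1).max()   # sup_y inf_x f(x, y)
    min_of_col_maxima = f.max(axis=0).min()   # inf_x sup_y f(x, y)
    assert max_of_row_minima <= min_of_col_maxima
```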

Definition 32.1 (\(\ell_p\)-norm of a random variable) The \(\ell_p\)-norm of a random variable \(X\) is defined as \[ |X|_p \equiv (\mathbb{E}|X|^p)^{1/p} \] for \(p \in(0,\infty)\).

Proposition 32.1 (Minkowski’s inequality) For any real-valued random variables \(X\) and \(Y\) and any \(p \in (1,\infty)\) we have \[ |X - Y|_p \leq |X|_p + |Y|_p. \]
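As a quick illustration of Definition 32.1 and Proposition 32.1 (a Monte Carlo sketch under assumed distributions, not part of the text), the \(\ell_p\)-norms can be estimated by sample averages and the inequality checked on simulated draws:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100_000, 3.0

def lp_norm(z, p):
    """Monte Carlo estimate of |Z|_p = (E|Z|^p)^(1/p)."""
    return (np.abs(z) ** p).mean() ** (1.0 / p)

# Two simulated random variables (independence is not required).
x = rng.normal(size=n)
y = rng.exponential(size=n) - 1.0

print(lp_norm(x - y, p), "<=", lp_norm(x, p) + lp_norm(y, p))
```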

Proposition 32.2 (Jensen’s inequality) If \(g: \mathbb{R}\to \mathbb{R}\) is a convex function, then for any random variable \(X\) we have \[ g(\mathbb{E}X) \leq \mathbb{E}g(X) \] provided \(\mathbb{E}|X| < \infty\) and \(\mathbb{E}|g(X)| < \infty\).
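For instance, taking \(g(x) = x^2\) recovers \((\mathbb{E}X)^2 \leq \mathbb{E}X^2\). A minimal numeric sketch, assuming exponentially distributed draws:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(size=100_000)  # any X with E|X| and E|g(X)| finite

g = np.square  # a convex function
print(g(x.mean()), "<=", g(x).mean())  # g(E X) <= E g(X)
```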

Proposition 32.3 (Kolmogorov’s strong law of large numbers) For \(X_1,X_2,\dots\) independent and identically distributed and \(\bar X_n = n^{-1}\sum_{i=1}^n X_i\), we have \[ \bar X_n \overset{\text{a.s.}}{\longrightarrow}c \] for some \(c \in \mathbb{R}\) if and only if \(\mathbb{E}|X_1| < \infty\), in which case \(c =\mathbb{E}X_1\).
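A simulation sketch of the “if” direction, assuming i.i.d. exponential draws with \(\mathbb{E}X_1 = 1\):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(size=1_000_000)  # i.i.d. with E|X_1| < infinity, E X_1 = 1

ns = np.array([10, 1_000, 100_000, 1_000_000])
print(np.cumsum(x)[ns - 1] / ns)  # sample means; they approach E X_1 = 1
```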

Theorem 32.1 (Multinomial theorem) For positive integers \(n\) and \(m\) we have \[ (a_1 + \dots + a_m)^n = \sum \Big(\frac{n!}{n_1!\cdots n_m!}\Big) a_1^{n_1}\cdots a_m^{n_m}, \] where the sum is taken over all \((n_1,\dots,n_m) \in \{0,\dots,n\}^m\) such that \(n_1 + \dots + n_m = n\).
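A brute-force numerical check of the identity (the helper multinomial_expansion is hypothetical, written only for this illustration):

```python
import itertools
import math

def multinomial_expansion(a, n):
    """Hypothetical helper: the right-hand side, summing over all
    (n_1, ..., n_m) in {0, ..., n}^m with n_1 + ... + n_m = n."""
    total = 0.0
    for ns in itertools.product(range(n + 1), repeat=len(a)):
        if sum(ns) != n:
            continue
        coef = math.factorial(n)
        for k in ns:
            coef //= math.factorial(k)  # n! / (n_1! ... n_m!), exact integer
        term = coef
        for a_j, n_j in zip(a, ns):
            term *= a_j ** n_j
        total += term
    return total

a, n = [1.5, -2.0, 0.75], 4
print(multinomial_expansion(a, n), "==", sum(a) ** n)  # sides agree (up to rounding)
```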

The following statement of the Lindeberg central limit theorem is adapted from Theorem 11.1.1 of Athreya and Lahiri (2006).

Theorem 32.2 (Lindeberg central limit theorem) For each \(n \geq 1\) let \(U_1,\dots,U_n\) be a collection of independent random variables with zero mean and finite variances and define \(V_1,\dots,V_n\) such that \[ V_i = \Big(\sum_{j=1}^n \mathbb{V}U_j\Big)^{-1/2}U_{i} \] for \(i = 1,\dots,n\). Then \[ \sum_{i=1}^n V_i \overset{\text{d}}{\longrightarrow}\mathcal{N}(0,1) \] as \(n \to \infty\) provided \[ \sum_{i=1}^n \mathbb{E}|V_i|^2 \mathbf{1}( | V_i| > \epsilon) \to 0 \tag{32.1}\] as \(n \to \infty\) for every \(\epsilon > 0\).
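Before turning to the proof, here is a simulation sketch of the theorem with one illustrative choice of summands (scaled uniform variables with variances \(\sigma_i^2 = i/n\); these are bounded and the total variance diverges, so Equation 32.1 holds). The Kolmogorov–Smirnov test from SciPy is used as an informal check:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, reps = 500, 20_000

# Independent, zero-mean U_i with heterogeneous variances sigma_i^2 = i / n:
# uniforms on (-1, 1) scaled to have standard deviation sd[i].
sd = np.sqrt(np.arange(1, n + 1) / n)
u = rng.uniform(-1.0, 1.0, size=(reps, n)) * (sd * np.sqrt(3.0))

s = u.sum(axis=1) / np.sqrt((sd ** 2).sum())  # sum_i V_i, one value per replication
print(stats.kstest(s, "norm"))  # should be consistent with N(0, 1)
```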

The proof follows pages 345–347 of Athreya and Lahiri (2006), but drops the triangular array notation.

For each \(n \geq 1\), set \(\sigma^2_i = \mathbb{V}U_i\) for \(i=1,\dots,n\) and suppose (without loss of generality) that \(\sum_{i=1}^n \sigma^2_i =1\), so that \(V_i = U_i\). Given that the Lindeberg condition in Equation 32.1 is satisfied, we may choose a sequence \(\epsilon_n \to 0\) such that \[ \sum_{i=1}^n \mathbb{E}|V_i|^2 \mathbf{1}(|V_i|>\epsilon_n) \to 0. \tag{32.2}\] From here we will show that the characteristic function of \(\sum_{i=1}^n V_i\) converges to that of the standard Normal distribution, which is given by \(\psi_Z(t) = \exp(-t^2/2)\). Letting \(\psi_i\) denote the characteristic function of \(V_i\) for each \(i=1,\dots,n\), and noting that \(\exp(-t^2/2) = \prod_{i=1}^n \exp(-t^2\sigma_i^2/2)\), we have \[\begin{align} \Big|\mathbb{E}&\exp \left( \iota t \sum_{i=1}^n V_i \right) - \exp\left(-\frac{t^2}{2}\right)\Big| \\ & \leq \left| \prod_{i=1}^n \psi_i(t) - \prod_{i=1}^n \left(1 - \frac{t^2\sigma_i^2}{2}\right)\right| \\ & \quad \quad + \left|\prod_{i=1}^n \left(1 - \frac{t^2\sigma_i^2}{2}\right) - \prod_{i=1}^n\exp\left(-\frac{t^2\sigma_i^2}{2}\right) \right| \\ & \leq \sum_{i=1}^n\left| \psi_i(t) - \left(1 - \frac{t^2\sigma_i^2}{2}\right)\right| \\ & \quad \quad + \sum_{i=1}^n\left|\exp\left(-\frac{t^2\sigma_i^2}{2}\right) - \left(1 - \frac{t^2\sigma_i^2}{2} \right) \right|\\ & = A_n + B_n, \end{align}\] say, for any \(t \in \mathbb{R}\), where the first inequality is the triangle inequality and the second comes from Lemma 11.1.3 of Athreya and Lahiri (2006). We show that \(A_n\) and \(B_n\) go to zero as \(n \to \infty\). Since \(|\exp(\iota x) - (1 + \iota x + (\iota x)^2/2)| \leq \min\{|x|^3/3!,|x|^2\}\) for all \(x\in\mathbb{R}\), we have, for all \(t\in \mathbb{R}\), \[\begin{align} A_n & := \sum_{i=1}^n\left| \psi_i(t) - \left(1 - \frac{t^2\sigma_i^2}{2}\right)\right| \\ & = \sum_{i=1}^n\left| \mathbb{E}\exp(\iota t V_i) - \left(1 + \mathbb{E}\iota t V_i + \frac{(\iota t)^2}{2!}\mathbb{E}V_i^2 \right)\right| \\ & \leq \sum_{i=1}^n \mathbb{E}\min \left\{ \frac{|t V_i|^3}{3!} , |t V_i|^2 \right\} \\ & \leq \sum_{i=1}^n \mathbb{E}|t V_i|^3\mathbf{1}(|V_i| \leq \epsilon_n) + \sum_{i=1}^n \mathbb{E}|t V_i|^2 \mathbf{1}(|V_i|>\epsilon_n) \\ & \leq |t|^3 \epsilon_n \sum_{i=1}^n \mathbb{E}V_i^2 + t^2\sum_{i=1}^n\mathbb{E}|V_i|^2\mathbf{1}(|V_i| > \epsilon_n)\\ & \to 0 \text{ as $n \to \infty$,} \end{align}\] since \(\sum_{i=1}^n \mathbb{E}V_i^2=1\), \(\epsilon_n \to 0\), and Equation 32.2 holds. Now, since \(|e^x - 1 - x | \leq x^2e^{|x|}\) for all \(x \in \mathbb{R}\), taking \(x = -t^2\sigma_i^2/2\) we may write \[\begin{align} B_n &:= \sum_{i=1}^n\left|1 - \frac{t^2\sigma_i^2}{2} - \exp\left(-\frac{t^2\sigma_i^2}{2}\right) \right| \\ &\leq \sum_{i=1}^n \left(\frac{t^2\sigma_i^2}{2}\right)^2 \exp\left( \frac{t^2\sigma_i^2}{2}\right) \\ & \leq \frac{t^4}{4}\left(\max_{1\leq i \leq n}\sigma_i^2\right)\exp\left[\frac{t^2}{2}\left(\max_{1\leq i \leq n}\sigma_i^2\right)\right]\sum_{i=1}^n \sigma_i^2 \\ & \leq t^4\left(\max_{1\leq i \leq n}\sigma_i^2\right)\exp\left[t^2\left(\max_{1\leq i \leq n}\sigma_i^2\right)\right]. \end{align}\] Lastly, we have \[\begin{align*} \max_{1 \leq i \leq n}\sigma_i^2 & = \max_{1 \leq i \leq n} \mathbb{E}V_i^2 \\ & = \max_{1\leq i \leq n} \mathbb{E}\left[ |V_i|^2\mathbf{1}(|V_i|\leq\epsilon_n) + |V_i|^2\mathbf{1}(|V_i| > \epsilon_n)\right] \\ & \leq \epsilon_n^2 + \sum_{i=1}^n \mathbb{E}|V_i|^2\mathbf{1}(|V_i| > \epsilon_n)\\ & \to 0 \text{ as $n \to \infty$,} \end{align*}\] by Equation 32.2, so that \(B_n \to 0\) as well. This completes the proof.
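The two elementary inequalities invoked in the proof, \(|\exp(\iota x) - (1 + \iota x + (\iota x)^2/2)| \leq \min\{|x|^3/3!, |x|^2\}\) and \(|e^x - 1 - x| \leq x^2 e^{|x|}\), can themselves be checked numerically on a grid (a sketch, with small tolerances for floating-point error):

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 200_001)

# |exp(i x) - (1 + i x + (i x)^2 / 2)| <= min(|x|^3 / 3!, |x|^2), used for A_n.
lhs = np.abs(np.exp(1j * x) - (1.0 + 1j * x - x ** 2 / 2.0))
assert np.all(lhs <= np.minimum(np.abs(x) ** 3 / 6.0, x ** 2) + 1e-12)

# |e^x - 1 - x| <= x^2 e^{|x|}, used for B_n.
assert np.all(np.abs(np.exp(x) - 1.0 - x) <= x ** 2 * np.exp(np.abs(x)) + 1e-9)
```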

Corollary 32.1 (Corollary to the Lindeberg central limit theorem) For each \(n \geq 1\), let \(\xi_1,\dots,\xi_n\) be independent and identically distributed random variables with zero mean and unit variance and let \(a_1,\dots,a_n \in \mathbb{R}\) be a collection of real numbers. Then \[ \Big(\sum_{i=1}^n a_i ^2\Big)^{-1/2}\sum_{i = 1}^n a_i \xi_i \overset{\text{d}}{\longrightarrow}\mathcal{N}(0,1) \] as \(n \to \infty\) provided \[ \Big(\sum_{j=1}^n a_j ^2\Big)^{-1/2}\max_{1 \leq i \leq n} |a_i| \to 0 \tag{32.3}\] as \(n \to \infty\).

For each \(n \geq 1\), let \(U_i = a_i\xi_i\), so that \(\mathbb{V}U_i = a_i^2\) for \(i =1,\dots,n\). Accordingly, set \(V_i = (\sum_{j=1}^n a_j^2)^{-1/2}a_i \xi_i\). Now we show that the collections of variables \(V_1,\dots,V_n\), \(n \geq 1\), satisfy the Lindeberg condition in Equation 32.1. For any \(\epsilon > 0\) we have \[\begin{align} \sum_{i=1}^n &\mathbb{E}|V_i|^2 \mathbf{1}(|V_i| > \epsilon) \\ &= \sum_{i=1}^n \frac{a_i^2 }{\sum_{j=1}^n a_j^2}\mathbb{E}|\xi_i|^2 \mathbf{1}\Big(|a_i \xi_i| > \epsilon (\textstyle \sum_{j=1}^na_j^2)^{1/2}\Big)\\ &\leq \sum_{i=1}^n \frac{a_i^2 }{\sum_{j=1}^na_j^2}\mathbb{E}|\xi_1|^2 \mathbf{1}\Big(|\xi_1|\max_{1\leq i \leq n}|a_i| > \epsilon \textstyle (\sum_{j=1}^n a_j^2)^{1/2}\Big)\\ &= \mathbb{E}|\xi_1|^2 \mathbf{1}\Big( |\xi_1| > \epsilon \frac{(\sum_{j=1}^na_j^2)^{1/2}}{\max_{1\leq i \leq n}|a_i|}\Big)\\ &\to 0 \end{align}\] as \(n \to \infty\) by the dominated convergence theorem, since \(\mathbb{E}|\xi_1|^2 < \infty\) and the condition in Equation 32.3 sends the threshold inside the indicator to infinity.
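A simulation sketch of the corollary, with the illustrative weights \(a_i = \sqrt{i}\) (these satisfy Equation 32.3, since \(\max_i a_i^2 / \sum_j a_j^2 = 2/(n+1) \to 0\)) and i.i.d. Rademacher \(\xi_i\):

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 1_000, 10_000

a = np.sqrt(np.arange(1, n + 1))               # weights satisfying Equation 32.3
xi = rng.choice([-1.0, 1.0], size=(reps, n))   # i.i.d., zero mean, unit variance

t = (a * xi).sum(axis=1) / np.linalg.norm(a)   # the normalized sum
print(t.mean(), t.var())  # close to 0 and 1; t is approximately N(0, 1)
```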

The following result is taken from page 365 of Lehmann (1975).

Lemma 32.2 (Mean and variance of linear combination of randomly permuted constants) Let \(c_1,\dots,c_N\) and \(a(1),\dots,a(N)\) be constants and let \(T_1,\dots,T_N\) be a random permutation of the integers \(1,\dots,N\). Then \[\begin{align} \mathbb{E}\Big(\sum_{i=1}^N c_i a(T_i)\Big) &= \bar a \sum_{i=1}^N c_i\\ \mathbb{V}\Big(\sum_{i=1}^N c_i a(T_i)\Big) &= (N-1)^{-1}\sum_{i=1}^N(c_i - \bar c)^2\sum_{i=1}^N(a(i) - \bar a)^2, \end{align}\] where \(\bar a = N^{-1}\sum_{i=1}^N a(i)\) and \(\bar c = N^{-1}\sum_{i=1}^N c_i\).
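For small \(N\) the lemma can be verified exactly by enumerating all \(N!\) equally likely permutations; a sketch with arbitrary constants:

```python
import itertools
import numpy as np

c = np.array([1.0, 4.0, -2.0, 0.5])   # constants c_1, ..., c_N
a = np.array([0.3, -1.0, 2.0, 5.0])   # constants a(1), ..., a(N)
N = len(c)

# Distribution of sum_i c_i a(T_i) over all N! equally likely permutations.
vals = np.array([c @ a[list(p)] for p in itertools.permutations(range(N))])

print(vals.mean(), a.mean() * c.sum())  # exact mean matches the lemma
print(vals.var(),
      ((c - c.mean()) ** 2).sum() * ((a - a.mean()) ** 2).sum() / (N - 1))
```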