21 Edgeworth expansions for the bootstrap
Edgeworth expansions show that the bootstrap applied to the un-studentized pivot \(Y_n = \sqrt{n}(\bar X_n - \mu)\) estimates its sampling distribution with the same order of accuracy as the limiting Normal approximation as \(n \to \infty\).
Theorem 21.1 (First-order correctness of bootstrap for un-studentized pivot) Under the conditions of Theorem 18.1, and with \(Y^*_n\) constructed as in Definition 14.1, we have
- \(\sup_{x\in\mathbb{R}}|\mathbb{P}(Y_n \leq x) - \Phi(x/\sigma)| = O(n^{-1/2})\)
- \(\sup_{x\in\mathbb{R}}|\mathbb{P}_*(Y_n^* \leq x) - \mathbb{P}(Y_n \leq x)| = O(n^{-1/2})\) almost surely
as \(n \to \infty\).
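The content of Theorem 21.1 can be illustrated by simulation. The following is a minimal sketch (the function name and the choice of an exponential population are illustrative, not from the text), assuming numpy, that draws bootstrap replicates of the un-studentized pivot \(Y_n^* = \sqrt{n}(\bar X_n^* - \bar X_n)\):

```python
import numpy as np

def boot_unstudentized(x, B=4000, seed=None):
    """Bootstrap replicates of Y*_n = sqrt(n) * (Xbar* - Xbar)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    xbar = x.mean()
    # B resamples of size n, drawn with replacement from the data
    samples = rng.choice(x, size=(B, n), replace=True)
    return np.sqrt(n) * (samples.mean(axis=1) - xbar)

# illustrative usage: a skewed (exponential) population with sigma = 1
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200)
ystar = boot_unstudentized(x, B=4000, seed=1)
# the spread of ystar should be close to hat sigma_n (and hence to sigma)
```

Comparing the empirical cdf of `ystar` with \(\Phi(x/\hat\sigma_n)\) gives a visual sense of the \(O(n^{-1/2})\) agreement asserted in the theorem.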
Edgeworth expansions also show that the bootstrap applied to the studentized pivot \(T_n\) estimates its sampling distribution with greater accuracy than the limiting Normal distribution as \(n \to \infty\). This property, called second-order correctness, is stated in Theorem 21.2 below; first we prove Theorem 21.1.
To establish 1., fix any \(x \in \mathbb{R}\) and write \(\mathbb{P}(Y_n \leq x) = \mathbb{P}(Z_n \leq x/\sigma)\). Then the Edgeworth expansion in Theorem 18.1 gives \[ \mathbb{P}(Z_n \leq x/\sigma) = \Phi(x/\sigma) - \frac{1}{6\sqrt{n}}\frac{\mu_3}{\sigma^3}\Big((x/\sigma)^2 - 1\Big)\phi(x/\sigma) + O(n^{-1}), \] from which we have \[ \sup_{x\in\mathbb{R}}|\mathbb{P}(Y_n \leq x) - \Phi(x/\sigma)| \leq \frac{1}{6\sqrt{n}}\frac{|\mu_3|}{\sigma^3}\sup_{x \in \mathbb{R}}\big|(x^2 - 1)\phi(x)\big| + O(n^{-1}), \] where the right hand side is of order \(O(n^{-1/2})\).
To prove 2., set \(Z_n^* = \sqrt{n}(\bar X_n^* - \bar X_n)/\hat \sigma_n\), where \(\hat \sigma_n^2 = S_n^2(n-1)/n\). Then we have \(\mathbb{P}_*(Y^*_n \leq x) = \mathbb{P}_*(Z_n^* \leq x /\hat \sigma_n)\). It can be shown that the bootstrap cdf \(\mathbb{P}_*(Z_n^* \leq x /\hat \sigma_n)\) admits an expansion such that \[ \mathbb{P}_*(Z_n^* \leq x /\hat \sigma_n) = \Phi(x/\hat \sigma_n) - \frac{1}{6\sqrt{n}}\frac{\hat \mu_{n3}}{\hat \sigma_n^3}\Big((x/\hat \sigma_n)^2 - 1\Big)\phi(x/\hat \sigma_n) + O(n^{-1}) \] almost surely as \(n \to \infty\), where \[ \hat \mu_{n3} = \mathbb{E}_*(X_1^* - \bar X_n)^3 = \frac{1}{n}\sum_{i=1}^n(X_i - \bar X_n)^3. \] This expansion is not implied by Theorem 18.1 because, conditional on the data, the bootstrap random variable \(Z^*_n\) does not satisfy Cramér's condition (see Hall (2013)). From here we may write \[\begin{align} \sup_{x \in \mathbb{R}}&|\mathbb{P}_*(Y^*_n \leq x) - \mathbb{P}(Y_n \leq x)| \leq \sup_{x \in \mathbb{R}}|\Phi(x/\sigma) - \Phi(x/\hat \sigma_n)| \\ &\quad ~+ \frac{1}{6\sqrt{n}}\sup_{x \in \mathbb{R}}\Big| \frac{\mu_3}{\sigma^3}\Big((x/\sigma)^2- 1\Big)\phi(x/\sigma) - \frac{\hat \mu_{n3}}{\hat \sigma_n^3}\Big((x/\hat \sigma_n)^2- 1\Big)\phi(x/\hat \sigma_n) \Big| + O(n^{-1}). \end{align}\] Since \(\hat \mu_{n3} \to \mu_3\) and \(\hat \sigma_n \to \sigma\) almost surely as \(n \to \infty\), the second term on the right hand side is of order \(o(n^{-1/2})\). The first term on the right hand side is of order \(O(n^{-1/2})\) almost surely, giving the result.
See page 536 of Athreya and Lahiri (2006) for a similar proof.
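The plug-in quantities \(\hat\mu_{n3}\) and \(\hat\sigma_n\) appearing in the proof are easy to compute, and the one-term Edgeworth approximation can be evaluated directly. A sketch (function names are illustrative), assuming numpy:

```python
import math
import numpy as np

def plugin_skew(x):
    """hat mu_{n3} / hat sigma_n^3, using the n-denominator central moments."""
    xc = x - x.mean()
    return np.mean(xc**3) / np.mean(xc**2) ** 1.5

def edgeworth_cdf(x, n, skew):
    """One-term Edgeworth approximation to P(Z_n <= x):
    Phi(x) - skew / (6 sqrt(n)) * (x^2 - 1) * phi(x)."""
    Phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return Phi - skew / (6.0 * math.sqrt(n)) * (x * x - 1.0) * phi

# illustrative check on right-skewed data (population skewness 2)
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=5000)
skew_hat = plugin_skew(x)
```

With `skew = 0` the correction vanishes and the approximation reduces to \(\Phi\); for right-skewed data the correction shifts mass exactly as in the expansion above.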
Theorem 21.2 (Second-order correctness of bootstrap for studentized pivot) Under the conditions of Theorem 18.1, and with \(T^*_n\) constructed as in Definition 17.1, we have
- \(\sup_{x\in\mathbb{R}}|\mathbb{P}(T_n \leq x) - \Phi(x)| = O(n^{-1/2})\)
- \(\sup_{x\in\mathbb{R}}|\mathbb{P}_*(T_n^* \leq x) - \mathbb{P}(T_n \leq x)| = O(n^{-1})\) almost surely
as \(n \to \infty\).
Note the difference in the orders: the limiting Normal distribution approximates the sampling distribution of \(T_n\) with an error of magnitude \(O(n^{-1/2})\), whereas the bootstrap achieves an error of magnitude \(O(n^{-1})\) almost surely. This clearly shows the superiority of the bootstrap over the asymptotic approximation in the case of the studentized pivot \(T_n\).
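A sketch of the bootstrap for the studentized pivot, assuming the form \(T_n^* = \sqrt{n}(\bar X_n^* - \bar X_n)/S_n^*\) with \(S_n^*\) the standard deviation of each resample (the exact construction is given in Definition 17.1; the names below are illustrative):

```python
import numpy as np

def boot_studentized(x, B=4000, seed=None):
    """Bootstrap replicates of T*_n = sqrt(n)(Xbar* - Xbar) / S*_n,
    studentizing each resample by its own standard deviation."""
    rng = np.random.default_rng(seed)
    n = len(x)
    xbar = x.mean()
    samples = rng.choice(x, size=(B, n), replace=True)
    sstar = samples.std(axis=1, ddof=1)
    return np.sqrt(n) * (samples.mean(axis=1) - xbar) / sstar

# illustrative usage on skewed (exponential) data
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=200)
tstar = boot_studentized(x, B=4000, seed=1)
```

The quantiles of `tstar` pick up the skewness correction in the expansion of \(\mathbb{P}(T_n \leq x)\), which is what drives the \(O(n^{-1})\) rate in part 2.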
Compare the result in Theorem 21.2 to the first equation on p. 240 of Hall (2013) following Theorem 5.1.
By the Edgeworth expansion in Theorem 18.2 we have \[ \mathbb{P}( T_n \leq x) = \Phi(x) +\frac{1}{6\sqrt{n}}\frac{\mu_3}{\sigma^3}(2x^2 + 1)\phi(x) + O(n^{-1}) \] for any \(x \in \mathbb{R}\). It can be shown that the conditional cdf of \(T_n^*\) given \(X_1,\dots,X_n\) admits a similar expansion such that \[ \mathbb{P}_*( T_n^* \leq x) = \Phi(x) +\frac{1}{6\sqrt{n}}\frac{\hat \mu_{n3}}{\hat \sigma_n^3}(2x^2 + 1)\phi(x) + O(n^{-1}) \] almost surely as \(n \to \infty\) for any \(x\in \mathbb{R}\). From here we have \[ \sup_{x\in\mathbb{R}}|\mathbb{P}_*(T_n^* \leq x) - \mathbb{P}(T_n \leq x)| \leq \frac{1}{6\sqrt{n}}\Big|\frac{\hat \mu_{n3}}{\hat \sigma_n^3} - \frac{\mu_3}{\sigma^3}\Big| \sup_{x \in \mathbb{R}}\Big|(2x^2 + 1)\phi(x)\Big| + O(n^{-1}) \] almost surely as \(n \to \infty\). Now, since \(\hat \mu_{n3} \to \mu_3\) and \(\hat \sigma_n \to \sigma\) at the rate \(n^{-1/2}\) almost surely, we see that the right hand side is of order \(O(n^{-1})\) almost surely.
See page 536 of Athreya and Lahiri (2006) for a similar proof.
From the sketch of the proof of Theorem 21.2, we see that the bootstrap applied to the studentized pivotal quantity \(T_n\) is essentially able to supply the first-order Edgeworth expansion of its distribution. For this reason inferences based on the bootstrap distribution of \(T^*_n\) are able to outperform those based on the asymptotic limiting distribution of \(T_n\) as well as those based on the bootstrap distribution of the un-studentized pivot \(Y^*_n\).
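As an illustration of such inferences, a percentile-\(t\) (bootstrap-\(t\)) confidence interval for the mean inverts the bootstrap quantiles of \(T_n^*\). A minimal sketch under assumed names (not a construction from the text; see Hall (2013) for the coverage theory):

```python
import numpy as np

def boot_t_interval(x, alpha=0.05, B=4000, seed=None):
    """Percentile-t interval for the mean, built from bootstrap
    quantiles of T*_n = sqrt(n)(Xbar* - Xbar) / S*_n."""
    rng = np.random.default_rng(seed)
    n = len(x)
    xbar, s = x.mean(), x.std(ddof=1)
    samples = rng.choice(x, size=(B, n), replace=True)
    tstar = (np.sqrt(n) * (samples.mean(axis=1) - xbar)
             / samples.std(axis=1, ddof=1))
    q_lo, q_hi = np.quantile(tstar, [alpha / 2.0, 1.0 - alpha / 2.0])
    # the quantiles are reversed when the pivot is inverted
    return xbar - q_hi * s / np.sqrt(n), xbar - q_lo * s / np.sqrt(n)

# illustrative usage: skewed population with true mean 1
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=300)
lo, hi = boot_t_interval(x, seed=1)
```

Because the bootstrap supplies the first-order Edgeworth term, the asymmetry of this interval reflects the skewness of the data, unlike the symmetric Normal-theory interval \(\bar X_n \pm z_{1-\alpha/2}\, S_n/\sqrt{n}\).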