25 The percentile bootstrap

Another approach to constructing confidence intervals for a parameter is the percentile bootstrap, which is a special case of a kind of bootstrap confidence interval called a bias-corrected and accelerated (\(BC_a\)) interval. The rationale behind the construction of these intervals is very different from that of the intervals we have discussed so far: rather than attempting to estimate the sampling distribution of a pivotal quantity, the \(BC_a\) approach constructs upper and lower confidence bounds somewhat more directly. For details, see DiCiccio and Efron (1996).

We will illustrate a special case of the \(BC_a\) intervals known as the percentile interval. Let \(X_1,\dots,X_n\) be independent, identically distributed random variables and let \(\hat \theta_n = \hat \theta_n(X_1,\dots,X_n)\) be an estimator of some parameter \(\theta\) such that \[ \sqrt{n}(\hat \theta_n - \theta) \overset{\text{d}}{\longrightarrow}\mathcal{N}(0,\vartheta) \] as \(n \to \infty\), for some variance \(\vartheta > 0\).

Definition 25.1 (Bootstrap percentile interval) Conditional on \(X_1,\dots,X_n\), introduce independent, identically distributed random variables \(X_1^*,\dots,X_n^*\) having cdf \(\hat F_n\), where \(\hat F_n\) is the empirical distribution of \(X_1,\dots,X_n\). Then set \(\hat \theta_n^* = \hat \theta_n(X_1^*,\dots,X_n^*)\) and let \[ \hat G_{\hat \theta_n}(x) = \mathbb{P}_*(\hat\theta_n^* \leq x) \] for all \(x \in \mathbb{R}\) be the cdf of \(\hat \theta_n^*\) conditional on \(X_1,\dots,X_n\). Then the \((1-\alpha)100\%\) percentile interval for \(\theta\) is defined as \[ \Big[\hat G_{\hat \theta_n}^{-1}(\alpha/2),\hat G_{\hat \theta_n}^{-1}(1-\alpha/2)\Big]. \tag{25.1}\]

To implement the percentile bootstrap, one may choose a large \(B\) and, for each \(b = 1,\dots,B\), do the following:

1. Draw \(X_1^{*(b)},\dots,X_n^{*(b)}\) with replacement from \(X_1,\dots,X_n\).
2. Compute \(\hat \theta^{*(b)}_n= \hat \theta_n(X_1^{*(b)},\dots,X_n^{*(b)})\).

Next, sort the Monte Carlo realizations of \(\hat \theta^*_n\) and relabel them so that \(\hat \theta^{*(1)}_n \leq \dots \leq \hat \theta^{*(B)}_n\). A Monte Carlo approximation to the bootstrap percentile interval in Equation 25.1 is then given by \[ \Big[\hat \theta^{*(\lceil (\alpha/2) B\rceil)}_n,\hat \theta^{*(\lceil (1-\alpha/2) B\rceil)}_n\Big], \] where the superscripts now index the sorted values rather than the order in which the resamples were drawn.
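The Monte Carlo recipe above can be sketched in a few lines of Python with NumPy. This is only a minimal illustration, not a reference implementation: the function name `percentile_bootstrap_ci`, its arguments, and the default \(B = 2000\) are assumptions made here. The small rounding step before taking the ceiling is likewise an added safeguard against floating-point error in \((\alpha/2)B\).

```python
import math
import numpy as np

def percentile_bootstrap_ci(x, estimator=np.mean, alpha=0.05, B=2000, rng=None):
    """Monte Carlo approximation to the (1 - alpha) percentile interval.

    `estimator` maps a 1-d array to a number (hypothetical interface).
    `rng` may be None, an integer seed, or a numpy Generator.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = x.shape[0]

    # Step 1: B resamples of size n, drawn with replacement (one per row).
    idx = rng.integers(0, n, size=(B, n))

    # Step 2: evaluate the estimator on each resample, then sort.
    theta_star = np.sort(np.array([estimator(x[row]) for row in idx]))

    # Ranks ceil((alpha/2) B) and ceil((1 - alpha/2) B), 1-indexed as in the text.
    # Rounding first guards against values such as 50.000000000001.
    lo_rank = math.ceil(round(alpha / 2 * B, 10))
    hi_rank = math.ceil(round((1 - alpha / 2) * B, 10))

    # Clamp to the valid range 1..B, then convert to 0-based indices.
    lo_rank = min(max(lo_rank, 1), B)
    hi_rank = min(max(hi_rank, 1), B)
    return theta_star[lo_rank - 1], theta_star[hi_rank - 1]
```

For example, `percentile_bootstrap_ci(x, estimator=np.median, rng=1)` would return an approximate 95% percentile interval based on the sample median of `x`. Whether the interval is trustworthy for a given estimator depends on conditions like the asymptotic Normality assumed above.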

The percentile interval is the simplest special case of the \(BC_a\) intervals described in DiCiccio and Efron (1996). Under mild conditions it is first-order correct, and it can be adjusted to achieve second-order correctness in some settings. Your humble statistics professor, however, prefers the approach of bootstrapping pivotal quantities to the \(BC_a\) approach, so he will not enter into any more details about the latter.

Figure 20.1 compares the performance of the percentile bootstrap confidence interval for a mean to that of the interval obtained from bootstrapping the pivotal quantity \(T_n = \sqrt{n}(\bar X_n - \mu)/S_n\). The plot shows that the percentile bootstrap does not, under the settings considered, make any improvement upon the confidence interval based on the limiting Normal distribution of the quantity \(T_n\).
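A comparison of this kind can be explored with a small simulation. The sketch below is not the code behind the figure, and its settings are assumptions chosen only for illustration: Exponential(1) data with true mean \(\mu = 1\), a small \(n\), and modest numbers of replications. It estimates the coverage of three intervals for a mean: the percentile interval, the interval based on the limiting Normal distribution of \(T_n\), and the interval obtained by bootstrapping \(T_n\). For brevity it uses `np.quantile`, which interpolates between order statistics rather than applying the ceiling rule above.

```python
from statistics import NormalDist
import numpy as np

def coverage_sketch(n=20, reps=300, B=500, alpha=0.05, seed=0):
    """Estimate coverage of three intervals for the mean of Exponential(1) data."""
    rng = np.random.default_rng(seed)
    mu = 1.0  # true mean under the assumed Exponential(1) model
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard Normal quantile
    probs = [alpha / 2, 1 - alpha / 2]
    hits = {"percentile": 0, "normal": 0, "bootstrap_t": 0}

    for _ in range(reps):
        x = rng.exponential(scale=1.0, size=n)
        xbar, s = x.mean(), x.std(ddof=1)

        # B resamples with replacement, one per row.
        xs = x[rng.integers(0, n, size=(B, n))]
        means = xs.mean(axis=1)
        # A resample standard deviation of zero is ignored here; it is
        # vanishingly rare for continuous data.
        sds = xs.std(axis=1, ddof=1)

        # Percentile interval: quantiles of the bootstrap means.
        lo, hi = np.quantile(means, probs)
        hits["percentile"] += int(lo <= mu <= hi)

        # Interval from the limiting Normal distribution of T_n.
        half = z * s / np.sqrt(n)
        hits["normal"] += int(xbar - half <= mu <= xbar + half)

        # Bootstrap of the pivot: T* = sqrt(n) (mean* - xbar) / S*.
        t_star = np.sqrt(n) * (means - xbar) / sds
        q_lo, q_hi = np.quantile(t_star, probs)
        lo_t = xbar - q_hi * s / np.sqrt(n)
        hi_t = xbar - q_lo * s / np.sqrt(n)
        hits["bootstrap_t"] += int(lo_t <= mu <= hi_t)

    return {name: count / reps for name, count in hits.items()}
```

Calling `coverage_sketch()` returns a dictionary of estimated coverage proportions. With so few replications these estimates are noisy, so `reps` would need to be much larger before drawing conclusions.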