28  Bootstrap confidence bands in nonparametric regression

We now consider using the bootstrap to produce confidence bands for the function \(m\) in the nonparametric regression model in Definition 27.1. We consider, as in the previous section, the constant-variance case as well as the non-constant-variance case.

First consider the case \(\sigma_1^2 = \dots = \sigma_n^2 = \sigma^2\), under which the relevant pivotal quantity for constructing a confidence band is \[ T_n = \sup_{x\in[0,1]} \left|\frac{\hat m_n(x) - \mathbb{E}\hat m_n(x)}{\hat \sigma_n \sqrt{\sum_{i=1}^n W^2_{ni}(x)}}\right|, \] as discussed in the previous section. Define the cumulative distribution function of \(T_n\) as \[ G_{T_n}(x) = \mathbb{P}(T_n \leq x), \quad x \in \mathbb{R}, \] and the corresponding quantile function as \(G_{T_n}^{-1}\). Now, if \(G_{T_n}\) were known, an asymptotic \((1-\alpha)100\%\) confidence band for the function \(m(x)\) over \(x \in [0,1]\) could be constructed as the collection of intervals \[ \textstyle I_{T_n}(x) = \Big[\hat m_n(x) \pm G_{T_n}^{-1}(1-\alpha) \hat \sigma_n \sqrt{\sum_{i=1}^n W^2_{ni}(x)}\Big], \quad x \in [0,1]. \] The coverage is only asymptotically \(1-\alpha\) rather than exactly \(1-\alpha\) because \(T_n\) is centered at \(\mathbb{E}\hat m_n(x)\) instead of at \(m(x)\); we proceed under the assumption that the difference between these (the bias) becomes asymptotically negligible.
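
To fix ideas, the following sketch computes the ingredients of \(T_n\) on a grid, assuming Nadaraya–Watson weights with a Gaussian kernel (the same estimator used in the code at the end of this section); the helper nw_weights and the toy data are illustrative only, not part of the method.

# Nadaraya-Watson weights W_ni(x) over a grid of x values (rows) and
# design points (columns), assuming a Gaussian kernel with bandwidth h
nw_weights <- function(x, X, h){
  W <- dnorm(outer(x, X, FUN = "-"), 0, h)
  W / rowSums(W) # normalize so each row sums to 1
}

# toy data for illustration
set.seed(1)
n <- 50
X <- runif(n)
Y <- sin(2 * pi * X) + rnorm(n, 0, 0.2)
x <- seq(0, 1, length = 101) # grid over [0,1]

W <- nw_weights(x, X, h = 0.1) # W[j,i] = W_ni(x_j)
m_hat <- drop(W %*% Y) # hat m_n(x) on the grid
scale_x <- sqrt(rowSums(W^2)) # sqrt(sum_i W_ni(x)^2) appearing in T_n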

We define a bootstrap estimator of \(G_{T_n}\) as follows:

Definition 28.1 (Residual bootstrap estimator of \(G_{T_n}\)) Conditional on \(Y_1,\dots,Y_n\), introduce independent random variables \(\varepsilon_1^*,\dots,\varepsilon_n^*\) distributed according to the empirical distribution of the residuals \(\hat\varepsilon_i = Y_i - \hat m_n(x_i)\), \(i=1,\dots,n\). Then set \(Y_i^* = \hat m_n(x_i) + \varepsilon_i^*\) for \(i=1,\dots,n\) and for each \(x \in [0,1]\) set \(\hat m_n^*(x) = \sum_{i=1}^nW_{ni}(x)Y_i^*\). Then construct \(\hat \sigma_n^*\) as in Equation 27.3 based on \((x_1,Y_1^*),\dots,(x_n,Y_n^*)\) and set \[ T_n^* = \sup_{x\in[0,1]} \left|\frac{\hat m_n^*(x) - \mathbb{E}_* \hat m_n^*(x)}{\hat \sigma_n^* \sqrt{\sum_{i=1}^n W^2_{ni}(x)}}\right|, \] noting that \(\hat m_n^*(x) - \mathbb{E}_* \hat m_n^*(x) = \sum_{i=1}^nW_{ni}(x)\varepsilon_i^*\) for each \(x \in[0,1]\). Then the residual bootstrap estimator of \(G_{T_n}\) is defined as \[ \hat G_{T_n}(x) = \mathbb{P}_*(T^*_n \leq x), \quad x \in \mathbb{R}. \]
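
A single draw of \(T_n^*\) can be sketched as follows, continuing the toy example above; for \(\hat \sigma_n^*\) we take the difference-based estimator appearing in the full implementation below as the estimator of Equation 27.3.

# one residual bootstrap draw of T_n^* (Definition 28.1), reusing
# nw_weights, n, X, Y, W, and scale_x from the sketch above
S <- nw_weights(X, X, h = 0.1) # smoother matrix at the design points
Yhat <- drop(S %*% Y) # fitted values hat m_n(x_i)
ehat <- Y - Yhat # residuals
ordX <- order(X)

e_star <- sample(ehat, replace = TRUE) # epsilon_i^* from the empirical distribution
Y_star <- Yhat + e_star # Y_i^*
sigma_star <- sqrt(sum(diff(Y_star[ordX])^2) / (2 * (n - 1))) # hat sigma_n^*
Tn_star <- max(abs(W %*% e_star) / (sigma_star * scale_x)) # grid approximation of T_n^*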

Having obtained the residual bootstrap estimator \(\hat G_{T_n}\) and the corresponding quantile function \(\hat G_{T_n}^{-1}\), the residual bootstrap confidence band for \(m(x)\) over \(x \in [0,1]\) is constructed as the set of intervals \[ \textstyle \hat I_{T_n}(x) = \Big[\hat m_n(x) \pm \hat G_{T_n}^{-1}(1 - \alpha) \hat \sigma_n \sqrt{\sum_{i=1}^n W^2_{ni}(x)}\Big], \quad x \in [0,1]. \tag{28.1}\] To obtain a Monte Carlo approximation to the set of intervals \(\hat I_{T_n}(x)\), \(x \in [0,1]\), one may generate independent realizations \(T_n^{*(1)},\dots,T_n^{*(B)}\) for some large \(B\), where for each \(b = 1,\dots,B\) we set \[ T_n^{*(b)} = \max_{0 \leq j \leq N} \left|\frac{\hat m_n^*(j/N) - \mathbb{E}_* \hat m_n^*(j/N)}{\hat \sigma_n^* \sqrt{\sum_{i=1}^n W^2_{ni}(j/N)}}\right|, \] with \(\hat m_n^*\) and \(\hat \sigma_n^*\) computed from the \(b\)th bootstrap sample and with the maximum taken over a fine grid, \(N\) large, in order to approximate the supremum. Then, after sorting the Monte Carlo realizations of \(T_n^*\) such that \(T_n^{*(1)}\leq\dots \leq T_n^{*(B)}\), we construct the bootstrap confidence band for \(m\) by replacing \(\hat G_{T_n}^{-1}(1-\alpha)\) in Equation 28.1 with \(T_n^{*(\lceil (1 - \alpha)B \rceil)}\).
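
In the toy example this amounts to a few lines, with the maximum over the grid of 101 points playing the role of the maximum over \(j/N\), \(j = 0,\dots,N\).

# Monte Carlo approximation of hat G_Tn^{-1}(1-alpha) and the band in Equation 28.1
B <- 500; alpha <- 0.05
Tn_boot <- replicate(B, {
  e_star <- sample(ehat, replace = TRUE)
  Y_star <- Yhat + e_star
  sigma_star <- sqrt(sum(diff(Y_star[ordX])^2) / (2 * (n - 1)))
  max(abs(W %*% e_star) / (sigma_star * scale_x))
})
q_star <- sort(Tn_boot)[ceiling((1 - alpha) * B)] # T_n^{*(ceil((1-alpha)B))}
sigma_hat <- sqrt(sum(diff(Y[ordX])^2) / (2 * (n - 1))) # hat sigma_n
lo <- m_hat - q_star * sigma_hat * scale_x # lower band over the grid
up <- m_hat + q_star * sigma_hat * scale_x # upper band over the grid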

In the case in which \(\sigma_1^2,\dots,\sigma_n^2\) may not all be equal, we consider the pivotal quantity \[ H_n = \sup_{x\in[0,1]} \left|\frac{\hat m_n(x) - \mathbb{E}\hat m_n(x)}{\sqrt{\sum_{i=1}^n W^2_{ni}(x)\hat \varepsilon_i^2}}\right|. \] Letting \(G_{H_n}\) represent the cumulative distribution function of \(H_n\) and \(G_{H_n}^{-1}\) its quantile function, an approximate \((1 - \alpha)100\%\) confidence band is given by the set of intervals \[ I_{H_n}(x) = \Big[\hat m_n(x) \pm G_{H_n}^{-1}(1-\alpha) \sqrt{\sum_{i=1}^n W^2_{ni}(x)\hat \varepsilon_i^2}\Big], \quad x \in [0,1]. \] As in the constant-variance case, we will estimate \(G_{H_n}^{-1}(1-\alpha)\) with a bootstrap estimator.
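
Relative to the constant-variance case, only the studentizer changes; on the grid of the toy example above it is a single line.

# heteroscedasticity-robust studentizer sqrt(sum_i W_ni(x)^2 ehat_i^2)
se_het <- sqrt(drop(W^2 %*% ehat^2))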

Definition 28.2 (Wild bootstrap estimator of \(G_{H_n}\)) Conditional on \(Y_1,\dots,Y_n\), define the residuals \(\hat\varepsilon_i = Y_i - \hat m_n(x_i)\), \(i=1,\dots,n\), and introduce independent random variables \(\varepsilon_1^{*W},\dots,\varepsilon_n^{*W}\) distributed such that (i) \(\mathbb{E}_*[ \varepsilon_i^{*W}] = 0\), (ii) \(\mathbb{E}_*[ (\varepsilon_i^{*W})^2] = \hat \varepsilon_i^2\), and (iii) \(\mathbb{E}_*[ (\varepsilon_i^{*W})^3] = \hat \varepsilon_i^3\) for \(i=1,\dots,n\). Then set \(Y_i^{*W} = \hat m_n(x_i) + \varepsilon_i^{*W}\) for \(i=1,\dots,n\) and for each \(x \in [0,1]\) set \(\hat m_n^{*W}(x) = \sum_{i=1}^nW_{ni}(x)Y_i^{*W}\). Lastly, define \(\hat \varepsilon_i^{*W} = Y_i^{*W} - \hat m_n^{*W}(x_i)\) for \(i=1,\dots,n\) and set \[ H_n^{*W} = \sup_{x\in[0,1]} \left|\frac{\hat m_n^{*W}(x) - \mathbb{E}_* \hat m_n^{*W}(x)}{\sqrt{\sum_{i=1}^n W^2_{ni}(x)(\hat \varepsilon_i^{*W})^2}}\right|, \] noting that \(\hat m_n^{*W}(x) - \mathbb{E}_* \hat m_n^{*W}(x) = \sum_{i=1}^nW_{ni}(x)\varepsilon_i^{*W}\) for each \(x \in[0,1]\). Then the wild bootstrap estimator of \(G_{H_n}\) is defined as \[ \hat G_{H_n}(x) = \mathbb{P}_*(H^{*W}_n \leq x), \quad x \in \mathbb{R}. \]
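
One draw of \(H_n^{*W}\) can be sketched as follows, continuing the toy example and generating the multipliers with the same shifted and scaled Beta variables as the code below; since \(U_i = 4V_i - 1\) with \(V_i \sim \text{Beta}(1/2, 3/2)\) satisfies \(\mathbb{E}U_i = 0\), \(\mathbb{E}U_i^2 = 1\), and \(\mathbb{E}U_i^3 = 1\), the errors \(\varepsilon_i^{*W} = \hat\varepsilon_i U_i\) satisfy conditions (i)-(iii).

# one wild bootstrap draw of H_n^{*W} (Definition 28.2), reusing the toy objects above
U <- 4 * rbeta(n, shape1 = 1/2, shape2 = 3/2) - 1 # E U = 0, E U^2 = 1, E U^3 = 1
e_starW <- ehat * U # epsilon_i^{*W}
Y_starW <- Yhat + e_starW # Y_i^{*W}
ehat_starW <- Y_starW - drop(S %*% Y_starW) # bootstrap residuals hat epsilon_i^{*W}
Hn_starW <- max(abs(W %*% e_starW) / sqrt(drop(W^2 %*% ehat_starW^2)))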

The wild bootstrap \((1-\alpha)100\%\) confidence band for \(m\) over \(x \in [0,1]\) is then constructed as the collection of intervals \[ \textstyle \hat I_{H_n}(x) = \Big[\hat m_n(x) \pm \hat G_{H_n}^{-1}(1-\alpha) \sqrt{\sum_{i=1}^n W^2_{ni}(x)\hat \varepsilon_i^2}\Big], \quad x \in [0,1]. \tag{28.2}\] As in the constant-variance case, one can replace \(\hat G_{H_n}^{-1}(1-\alpha)\) by a Monte Carlo approximation \(H_n^{*W(\lceil (1 - \alpha) B\rceil)}\), the \(\lceil (1 - \alpha) B\rceil\)th order statistic of \(B\) Monte Carlo draws of \(H_n^{*W}\).

Figure 28.1 shows the residual bootstrap confidence band from Equation 28.1 and the wild bootstrap confidence band from Equation 28.2 on a single simulated data set. The bands are based on the Nadaraya–Watson estimator with the Gaussian kernel. The method of Das, Gregory, and Lahiri (2019) was used to generate the wild bootstrap error terms.

Code
# function for computing residual or wild bootstrap confidence bands
bootCBnw <- function(X,Y,h,alpha,xlims,het = FALSE,N=500,B=500){

  n <- length(Y)
  x <- seq(xlims[1],xlims[2],length=N)
  
  # obtain function estimates at x
  Wx <- dnorm(outer(x,X,FUN="-"),0,h) # Gaussian kernel evaluations, N x n
  Wx <- Wx / apply(Wx,1,sum) # normalize so each row sums to 1
  mx_hat <- drop(Wx %*% Y)
  
  # obtain residuals via the smoother matrix at the design points
  S <- dnorm(outer(X,X,FUN="-"),0,h)
  S <- S / apply(S,1,sum)
  Yhat <- drop(S %*% Y)
  ehat <- Y - Yhat

  if(!het){
    
    # constant-variance case
    ordX <- order(X)
    # difference-based estimate of sigma (Equation 27.3)
    sigma_hat <- sqrt(sum(diff(Y[ordX])^2) / (2 * (n - 1)))
    # rows of Mx give W_ni(x) / sqrt(sum_i W_ni(x)^2) at each grid point
    Mx <- Wx / sqrt(apply(Wx^2,1,sum))
    
    # residual bootstrap
    Tn_boot <- numeric(B)
    for(b in 1:B){
      
      e_boot <- ehat[sample(n,n,replace=TRUE)] # resample residuals with replacement
      Y_boot <- Yhat + e_boot
      sigma_hat_boot <- sqrt(sum(diff(Y_boot[ordX])^2) / (2 * (n - 1)))
      Tn_boot[b] <- max(abs(Mx %*% e_boot) / sigma_hat_boot)
      
    }
   
    Ga <- sort(Tn_boot)[ceiling((1-alpha)*B)] # bootstrap (1-alpha) quantile
    me <- Ga * sigma_hat * sqrt(apply(Wx^2,1,sum))
    lo <- mx_hat - me
    up <- mx_hat + me
    
  } else {
    
    # non-constant-variance case: wild bootstrap
    Hn_boot <- numeric(B)
    for(b in 1:B){
      
      # multipliers with E U = 0, E U^2 = 1, E U^3 = 1, so the errors
      # ehat * U_boot satisfy conditions (i)-(iii) of Definition 28.2
      U_boot <- 4*rbeta(n, shape1 = 1/2, shape2 = 3/2) - 1
      e_boot <- ehat * U_boot
      Y_boot <- Yhat + e_boot
      Yhat_boot <- drop(S %*% Y_boot)
      ehat_boot <- Y_boot - Yhat_boot
      
      Hn_boot[b] <- max(abs(Wx %*% e_boot / sqrt(Wx^2 %*% ehat_boot^2)))
      
    }
   
    Ga <- sort(Hn_boot)[ceiling((1-alpha)*B)] # bootstrap (1-alpha) quantile
    me <- Ga * sqrt(Wx^2 %*% ehat^2)
    lo <- mx_hat - me
    up <- mx_hat + me
    
  }
  
  output <- list(lo = lo,
                 up = up,
                 mx_hat = mx_hat,
                 x = x)
  
  return(output)
  
  
}

# generate some data
m <- function(x){sin(3* pi * x / 2)/(1 + 18 * x^2*(sign(x) + 1))}
n <- 200
X <- runif(n,-1,1)
sigma <- (1.5 - X)^2/4
error <- rnorm(n,0,sigma)
Y <- m(X) + error

# now obtain the confidence bands
N <- 300
B <- 500
h <- 0.08
alpha <- 0.05
rbootCB <- bootCBnw(X,Y,h,alpha,xlims = c(-1,1),het = FALSE,N = N,B = B)
wbootCBhet <- bootCBnw(X,Y,h,alpha,xlims = c(-1,1),het = TRUE,N = N,B = B)

# make plots
par(mfrow=c(1,2),mar=c(2.1,2.1,1.1,1.1),oma = c(0,2,2,0))

plot(Y~X,col="gray")
lines(rbootCB$mx_hat ~ rbootCB$x)
lines(m(rbootCB$x) ~ rbootCB$x, lty = 2)
polygon(x = c(rbootCB$x,rbootCB$x[N:1]),
        y = c(rbootCB$lo,rbootCB$up[N:1]), 
        col = rgb(.545,0,0,.4),
        border = NA)

plot(Y~X,col="gray")
lines(wbootCBhet$mx_hat ~ wbootCBhet$x)
lines(m(wbootCBhet$x) ~ wbootCBhet$x, lty = 2)
polygon(x = c(wbootCBhet$x,wbootCBhet$x[N:1]),
        y = c(wbootCBhet$lo,wbootCBhet$up[N:1]), 
        col = rgb(0,0,.545,.4),
        border = NA)
Figure 28.1: Asymptotic 95% residual (left) and wild bootstrap (right) confidence bands from Equation 28.1 and Equation 28.2 based on the Nadaraya–Watson estimator (with the Gaussian kernel) for a nonparametric regression function assuming constant variance and allowing non-constant variance, respectively.