3  Additive model

Definition 3.1 (Additivity assumption) A function \(m:[0,1]^p \to \mathbb{R}\) is said to satisfy the additivity assumption if \[ m(\mathbf{x}) = m_1(x_1) + \dots + m_p(x_p) \] for all \(\mathbf{x}= (x_1,\dots,x_p) \in [0,1]^p\) for some functions \(m_j:[0,1]\to\mathbb{R}\), \(j=1,\dots,p\), which we call the additive components of \(m\).

Stone (1985) argued that many functions \(m:[0,1]^p \to \mathbb{R}\) likely to arise in multiple regression can be well approximated by additive functions.

Definition 3.2 (Additive model) The additive model for observed data \((\mathbf{x}_1,Y_1),\dots,(\mathbf{x}_n,Y_n) \in [0,1]^p \times \mathbb{R}\) is \[ Y_i = \mu + m_1(x_{i1}) + \dots + m_p(x_{ip}) + \varepsilon_i, \tag{3.1}\] for \(i = 1,\dots,n\), where \(\mu\) is an unknown mean, \(m_1,\dots,m_p\) are unknown functions on \([0,1]\), and \(\varepsilon_1,\dots,\varepsilon_n\) are independent random variables with mean \(0\) and variance \(\sigma^2\).
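To make the data-generating process in Equation 3.1 concrete, here is a minimal simulation sketch in Python; the component functions, the sample size, and all parameter values are illustrative choices and not part of the definition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 2
mu, sigma = 1.0, 0.5

# Illustrative additive components on [0, 1] (placeholder choices).
def m1(x):
    return np.sin(2 * np.pi * x)

def m2(x):
    return (x - 0.5) ** 2

# Fixed design points in [0, 1]^p and responses generated as in Equation 3.1.
X = rng.uniform(0.0, 1.0, size=(n, p))
eps = rng.normal(0.0, sigma, size=n)
Y = mu + m1(X[:, 0]) + m2(X[:, 1]) + eps
```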

We must immediately consider the question of identifiability in the additive model: Are the additive components \(m_1,\dots,m_p\) uniquely defined by the additivity assumption? We find, from the following example, that they are not.

Example 3.1 (Unidentifiability of additive components) Consider the additive model in the case \(p=2\) with additive components \(m_1\) and \(m_2\). For any constant \(c\), set \(m_1^*(x_1) = m_1(x_1) - c\) and \(m_2^*(x_2) = m_2(x_2) + c\). Then \[ m_1(x_1) + m_2(x_2) = m_1^*(x_1) + m_2^*(x_2) \] for all \(\mathbf{x}= (x_1,x_2) \in [0,1]^2\), so \(m_1^*\) and \(m_2^*\) are equally valid additive components. This demonstrates that the additive components are not uniquely determined in the additive model.

Before we can estimate the additive components \(m_1,\dots,m_p\) in the additive model, we must remove from each one the unidentifiable constant which can shift the function up or down, as illustrated in Example 3.1. To this end we impose upon each \(m_j\) the identifiability condition \[ \frac{1}{n}\sum_{i=1}^n m_j(x_{ij}) = 0 \tag{3.2}\] for \(j = 1,\dots,p\).
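As a sketch of how the constraint in Equation 3.2 can be imposed in practice, the following Python snippet centers each additive component at its own design points and absorbs the removed constants into the mean \(\mu\); the component functions and numerical values are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)

# Illustrative additive components evaluated at the design points.
m1_vals = np.sin(2 * np.pi * x1)
m2_vals = (x2 - 0.5) ** 2

# Center each component at its design points so that
# (1/n) * sum_i m_j(x_ij) = 0, as in Equation 3.2, and absorb
# the removed constants into the mean.
mu = 1.0
c1, c2 = m1_vals.mean(), m2_vals.mean()
m1_centered, m2_centered = m1_vals - c1, m2_vals - c2
mu_adjusted = mu + c1 + c2

# The regression surface is unchanged by the re-centering.
assert np.allclose(mu + m1_vals + m2_vals,
                   mu_adjusted + m1_centered + m2_centered)
```

This is the same constant ambiguity as in Example 3.1: centering pins down the level of each component and pushes all constant shifts into \(\mu\).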

Remark 3.1. If the design points were iid random variables \(X_{1j},\dots,X_{nj}\) (instead of being fixed) for each \(j = 1,\dots,p\), the condition in Equation 3.2 would instead be stated as \(\mathbb{E}\, m_j(X_{1j}) = 0\) for all \(j =1,\dots,p\).

Under the identifiability condition in Equation 3.2, the additive model gives \(\mathbb{E}\bar Y_n = \mu\). Therefore, from now on we assume \(Y_1,\dots,Y_n\) have been centered to have mean \(0\) and set \(\mu=0\) in the additive model in Equation 3.1. From here we focus solely on the estimation of the additive components \(m_1,\dots,m_p\).
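This centering step can be sketched in Python as follows, again under illustrative components chosen so that Equation 3.2 holds at the design points (assumptions for the example, not from the text): \(\mu\) is estimated by \(\bar Y_n\), and the component estimators are then applied to the centered responses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)

# Illustrative components, centered at the design points so that
# Equation 3.2 holds exactly.
m1 = np.sin(2 * np.pi * x1)
m1 -= m1.mean()
m2 = (x2 - 0.5) ** 2
m2 -= m2.mean()

mu = 1.0
Y = mu + m1 + m2 + rng.normal(0.0, 0.5, size=n)

Y_bar = Y.mean()        # E[Y_bar] = mu under Equation 3.2
Y_centered = Y - Y_bar  # set mu = 0 and estimate m_1, ..., m_p from these
```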