3  Additive model

Definition 3.1 (Additivity assumption) A function \(m:[0,1]^p \to \mathbb{R}\) is said to satisfy the additivity assumption if \[ m(\mathbf{x}) = m_1(x_1) + \dots + m_p(x_p) \] for all \(\mathbf{x}= (x_1,\dots,x_p) \in [0,1]^p\) for some functions \(m_j:[0,1]\to\mathbb{R}\), \(j=1,\dots,p\), which we call the additive components of \(m\).

Stone (1985) argued that many functions \(m:[0,1]^p \to \mathbb{R}\) likely to arise in multiple regression can be well approximated by additive functions.

Definition 3.2 (Additive model) The additive model for observed data \((\mathbf{x}_1,Y_1),\dots,(\mathbf{x}_n,Y_n) \in [0,1]^p \times \mathbb{R}\) is \[ Y_i = \mu + m_1(x_{i1}) + \dots + m_p(x_{ip}) + \varepsilon_i, \tag{3.1}\] for \(i = 1,\dots,n\), where \(\mu\) is an unknown mean, \(m_1,\dots,m_p\) are unknown functions on \([0,1]\), and \(\varepsilon_1,\dots,\varepsilon_n\) are independent random variables with mean \(0\) and variance \(\sigma^2\).
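To make the data-generating process in Equation 3.1 concrete, here is a minimal simulation sketch in Python; the component functions, the sample size, and all parameter values are illustrative choices and not part of the definition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 2
mu, sigma = 1.0, 0.5

# Illustrative additive components on [0, 1] (placeholder choices).
def m1(x):
    return np.sin(2 * np.pi * x)

def m2(x):
    return (x - 0.5) ** 2

# Fixed design points in [0, 1]^p and responses generated as in Equation 3.1.
X = rng.uniform(0.0, 1.0, size=(n, p))
eps = rng.normal(0.0, sigma, size=n)
Y = mu + m1(X[:, 0]) + m2(X[:, 1]) + eps
```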

We must immediately consider the question of identifiability in the additive model: Are the additive components \(m_1,\dots,m_p\) uniquely defined by the additivity assumption? We find, from the following example, that they are not.

Example 3.1 (Unidentifiability of additive components) Consider the additive model in the case \(p=2\) with additive components \(m_1\) and \(m_2\). For any constant \(c\), set \(m_1^*(x_1) = m_1(x_1) - c\) and \(m_2^*(x_2) = m_2(x_2) + c\). Then \[ m_1(x_1) + m_2(x_2) = m_1^*(x_1) + m_2^*(x_2) \] for all \(\mathbf{x}= (x_1,x_2) \in [0,1]^2\), so \(m_1^*\) and \(m_2^*\) are equally valid additive components. This demonstrates that the additive components are not uniquely determined in the additive model.

Before we can estimate the additive components \(m_1,\dots,m_p\) in the additive model, we must remove from each one the unidentifiable constant which can shift the function up or down, as illustrated in Example 3.1. To this end we impose upon each \(m_j\) the identifiability condition \[ \frac{1}{n}\sum_{i=1}^n m_j(x_{ij}) = 0 \tag{3.2}\] for \(j = 1,\dots,p\).
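As a sketch of how the constraint in Equation 3.2 can be imposed in practice, the following Python snippet centers each additive component at its own design points and absorbs the removed constants into the mean \(\mu\); the component functions and numerical values are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)

# Illustrative additive components evaluated at the design points.
m1_vals = np.sin(2 * np.pi * x1)
m2_vals = (x2 - 0.5) ** 2

# Center each component at its design points so that
# (1/n) * sum_i m_j(x_ij) = 0, as in Equation 3.2, and absorb
# the removed constants into the mean.
mu = 1.0
c1, c2 = m1_vals.mean(), m2_vals.mean()
m1_centered, m2_centered = m1_vals - c1, m2_vals - c2
mu_adjusted = mu + c1 + c2

# The regression surface is unchanged by the re-centering.
assert np.allclose(mu + m1_vals + m2_vals,
                   mu_adjusted + m1_centered + m2_centered)
```

This is the same constant ambiguity as in Example 3.1: centering pins down the level of each component and pushes all constant shifts into \(\mu\).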

Remark 3.1. If the design points were iid random variables \(X_{1j},\dots,X_{nj}\) (instead of being fixed) for each \(j = 1,\dots,p\), the condition in Equation 3.2 would instead be stated as \(\mathbb{E}\, m_j(X_{1j}) = 0\) for all \(j =1,\dots,p\).

Under the identifiability condition in Equation 3.2, the additive model gives \(\mathbb{E}\bar Y_n = \mu\). Therefore, from now on we assume \(Y_1,\dots,Y_n\) have been centered to have mean \(0\) and set \(\mu=0\) in the additive model in Equation 3.1. From here we focus solely on the estimation of the additive components \(m_1,\dots,m_p\).
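This centering step can be sketched in Python as follows, again under illustrative components chosen so that Equation 3.2 holds at the design points (assumptions for the example, not from the text): \(\mu\) is estimated by \(\bar Y_n\), and the component estimators are then applied to the centered responses.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.uniform(size=n), rng.uniform(size=n)

# Illustrative components, centered at the design points so that
# Equation 3.2 holds exactly.
m1 = np.sin(2 * np.pi * x1)
m1 -= m1.mean()
m2 = (x2 - 0.5) ** 2
m2 -= m2.mean()

mu = 1.0
Y = mu + m1 + m2 + rng.normal(0.0, 0.5, size=n)

Y_bar = Y.mean()        # E[Y_bar] = mu under Equation 3.2
Y_centered = Y - Y_bar  # set mu = 0 and estimate m_1, ..., m_p from these
```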