# Ch 7: Multiple Regression 2

In Ch 6, we saw the linear regression model with multiple predictor variables. To address the limitations of testing regression coefficients there, we introduce a new concept, the **extra sum of squares**.

## Outline

- Extra sum of squares
  - definition
  - application 1 : Tests
  - application 2 : coefficient of partial determination
- Standardized multiple regression model
  - correlation transformation and standardized regression model
  - multicollinearity issue

## 1. Extra sum of squares

### definition

__Recall__

- \(SSTO\) can be divided into \(SSE\) and \(SSR\).
- When we add predictor variables to the regression model, \(SSE\) decreases, while \(SSR\) increases.

Using the above, we define the **extra sum of squares**

\(SSR(X_2 \mid X_1) = SSE(X_1) - SSE(X_1, X_2)\),

the marginal explanatory ability gained by adding \(X_2\) to the model already containing \(X_1\). Equivalently, since \(SSTO\) is fixed, \(SSR(X_2 \mid X_1) = SSR(X_1, X_2) - SSR(X_1)\) gives the same value.
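As a quick numerical check, the sketch below (assuming NumPy and simulated data; the helper `sse` is ours, not from the text) verifies that the two forms of \(SSR(X_2 \mid X_1)\) agree:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 + 0.5 * X2 + rng.normal(size=n)

def sse(*cols):
    """SSE from OLS of Y on an intercept plus the given predictor columns."""
    X = np.column_stack([np.ones(n), *cols])
    resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    return resid @ resid

ssto = np.sum((Y - Y.mean()) ** 2)
ssr_x2_given_x1 = sse(X1) - sse(X1, X2)            # SSE(X1) - SSE(X1, X2)
ssr_alt = (ssto - sse(X1, X2)) - (ssto - sse(X1))  # SSR(X1, X2) - SSR(X1)
print(ssr_x2_given_x1, ssr_alt)
```

The two quantities match up to floating-point rounding, as the identity predicts.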

### application 1 : Tests

**Test whether a single \(\beta_k=0\)**

Full model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i\)

Reduced model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i^\ast\)

test statistic: \(\frac{(SSE(R)-SSE(F))/(df_r - df_f)}{SSE(F)/df_f}\), where \(df_r = n-3\) and \(df_f = n-4\) are the error degrees of freedom of the reduced and full models

\(= \frac{(SSE(X_1,X_2)-SSE(X_1,X_2,X_3))/(n-3 - (n-4))}{SSE(X_1,X_2,X_3)/(n-4)} = \frac{SSR(X_3|X_1,X_2)/1}{SSE(X_1,X_2,X_3)/(n-4)} = \frac{MSR(X_3|X_1,X_2)}{MSE(X_1,X_2,X_3)} \sim F(1,n-4)\)
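The computation can be sketched as follows (assuming NumPy/SciPy and simulated data where the true \(\beta_3 = 0\); the helper `sse_df` is ours):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 40
X1, X2, X3 = rng.normal(size=(3, n))
Y = 1.0 + 2.0 * X1 + 0.5 * X2 + rng.normal(size=n)  # true beta_3 = 0

def sse_df(*cols):
    """SSE and error df from OLS of Y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(n), *cols])
    resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    return resid @ resid, n - X.shape[1]

sse_r, df_r = sse_df(X1, X2)        # reduced model: beta_3 = 0
sse_f, df_f = sse_df(X1, X2, X3)    # full model
F = ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)
p = stats.f.sf(F, df_r - df_f, df_f)  # upper-tail probability of F(1, n-4)
print(F, p)
```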

**Test whether several \(\beta_k=0\)**

Full model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i\)

Reduced model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \epsilon_i^\ast\)

test statistic: \(\frac{(SSE(X_1)-SSE(X_1,X_2,X_3))/(n-2 - (n-4))}{SSE(X_1,X_2,X_3)/(n-4)}\)

\(= \frac{SSR(X_2, X_3|X_1)/2}{SSE(X_1,X_2,X_3)/(n-4)} \newline = \frac{MSR(X_2,X_3|X_1)}{MSE(X_1,X_2,X_3)} \sim F(2,n-4)\)

**Test whether the slopes of \(x_k\) and \(x_l\) are the same**

Full model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i\)

Reduced model) \(Y_i = \beta_0 + \beta_c (X_{i1} + X_{i2}) + \beta_3 X_{i3} + \epsilon_i^\ast\), where we treat \(X_{i1} + X_{i2}\) as a single new variable.

test statistic: \(\frac{(SSE(X_1+X_2,\,X_3)-SSE(X_1,X_2,X_3))/1}{SSE(X_1,X_2,X_3)/(n-4)} \sim F(1,n-4)\), where \(SSE(X_1+X_2,\,X_3)\) is the \(SSE\) of the reduced model fitted with the combined variable
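The equal-slopes test amounts to refitting with the combined column \(X_1+X_2\); a sketch under the same assumptions (NumPy/SciPy, simulated data with equal true slopes, helper `sse_df` ours):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 40
X1, X2, X3 = rng.normal(size=(3, n))
Y = 1.0 + 1.5 * X1 + 1.5 * X2 - 0.7 * X3 + rng.normal(size=n)  # beta_1 = beta_2

def sse_df(*cols):
    """SSE and error df from OLS of Y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(n), *cols])
    resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    return resid @ resid, n - X.shape[1]

sse_r, df_r = sse_df(X1 + X2, X3)   # reduced: one common slope on X1 + X2
sse_f, df_f = sse_df(X1, X2, X3)    # full: separate slopes
F = ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)
p = stats.f.sf(F, df_r - df_f, df_f)
print(F, p)
```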

### application 2 : coefficient of partial determination

We can also calculate an \(R^2\) that accounts for the marginal contribution of \(X_2\), given that \(X_1\) is already in the model: \(R_{2|1}^2 = \frac{SSR(X_2|X_1)}{SSE(X_1)}\). It satisfies \(0 \leq R_{2|1}^2 \leq 1\), and a large value implies a large marginal contribution.
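A minimal sketch of computing \(R_{2|1}^2\) (NumPy, simulated data; the helper `sse` is ours):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2.0 * X1 + 1.0 * X2 + rng.normal(size=n)

def sse(*cols):
    """SSE from OLS of Y on an intercept plus the given predictor columns."""
    X = np.column_stack([np.ones(n), *cols])
    resid = Y - X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    return resid @ resid

# R^2_{2|1} = SSR(X2|X1) / SSE(X1): share of the remaining error explained by X2
r2_partial = (sse(X1) - sse(X1, X2)) / sse(X1)
print(r2_partial)
```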

## 2. Standardized multiple regression model

When the predictor variables are poorly scaled or highly correlated, \(\det(X^tX) \approx 0\); in the correlated case this reflects multicollinearity between the predictor variables. The correlation transformation below is a remedy for the scaling part of the problem.

### correlation transformation and standardized regression model

**correlation transformation** Let \(n\) be the number of observations. We start from the standardized variables \(\frac{Y_i-\bar{Y}}{s_Y}\) and \(\frac{X_{ik}-\bar{X_k}}{s_{X_k}}\).

We then use \(Y_i^\ast\) and \(X_{ik}^\ast\), simple functions of the standardized variables:

\(Y_i^\ast = \frac{1}{\sqrt{n-1}} \frac{Y_i-\bar{Y}}{s_Y} = \frac{Y_i-\bar{Y}}{\sqrt{\sum_i (Y_i -\bar{Y})^2}}\), where \(s_Y^2 = \frac{\sum_i (Y_i -\bar{Y})^2}{n-1}\),

\(X_{ik}^\ast = \frac{1}{\sqrt{n-1}} \frac{X_{ik}-\bar{X_k}}{s_{X_k}} = \frac{X_{ik}-\bar{X_k}}{\sqrt{\sum_i (X_{ik} -\bar{X_k})^2}}\), where \({s_{X_k}}^2 = \frac{\sum_i (X_{ik} -\bar{X_k})^2}{n-1}\).

**standardized regression model** \(Y_i^\ast = \beta_1^\ast X_{i1}^\ast + \cdots + \beta_{p-1}^\ast X_{i,p-1}^\ast + \epsilon_i^\ast\)

__Note__

- \[\beta_0^\ast = 0\]
- \[\beta_0 = \bar{Y} - \beta_1 \bar{X_1} - \cdots - \beta_{p-1} \bar{X_{p-1}}\]
- \[\beta_k = \frac{s_Y}{s_{X_k}} \beta_k^\ast, k=1,2,\cdots, p-1\]

**Property of the correlation matrix of the \(X\) variables** Let \(corr(X_i,X_j) = r_{ij}\).

Then \((X^\ast)^t X^\ast = r_{XX}\), the correlation matrix of the predictors, so the normal equations become \(r_{XX} b^\ast = r_{YX}\) \(\Rightarrow b^\ast = r_{XX}^{-1} r_{YX}\), where \(r_{YX}\) is the vector of correlations between \(Y\) and each \(X_k\).
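The whole pipeline (correlation transformation, solving \(r_{XX} b^\ast = r_{YX}\), and back-transforming to the original scale) can be sketched as follows; this is a NumPy sketch on simulated data, with variable names of our choosing:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 80
X = rng.normal(size=(n, 2))
Y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)

def corr_transform(v):
    """Correlation transformation: center and scale to unit column length."""
    return (v - v.mean()) / np.sqrt(np.sum((v - v.mean()) ** 2))

Xs = np.column_stack([corr_transform(X[:, k]) for k in range(X.shape[1])])
Ys = corr_transform(Y)

# normal equations of the standardized model (no intercept): r_XX b* = r_YX
r_XX = Xs.T @ Xs
r_YX = Xs.T @ Ys
b_star = np.linalg.solve(r_XX, r_YX)

# back-transform: b_k = (s_Y / s_{X_k}) b_k*,  b_0 = Ybar - sum_k b_k Xbar_k
sY = Y.std(ddof=1)
sX = X.std(axis=0, ddof=1)
b = (sY / sX) * b_star
b0 = Y.mean() - b @ X.mean(axis=0)

# agrees with ordinary least squares on the raw scale
b_direct = np.linalg.lstsq(np.column_stack([np.ones(n), X]), Y, rcond=None)[0]
print(b0, b, b_direct)
```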

### multicollinearity issue

Why do we need to avoid multicollinearity among variables?

When we solve the normal equations \(X^tX b = X^tY\), the condition \(\det(X^tX) \approx 0\) makes the system nearly singular. This leads to large round-off error during computation, resulting in severe error in \(b\).

How to know if the predictor variables are correlated among themselves?

- uncorrelated \(X_1\) and \(X_2\) (\(r_{12}^2=0\)): \(SSR(X_1|X_2) = SSR(X_1)\) and \(SSR(X_2|X_1) = SSR(X_2)\)
- perfectly correlated \(X_1\) and \(X_2\) (\(r_{12}^2=1\)): \((X^tX)^{-1}\) does not exist \(\Rightarrow\) infinitely many regression surfaces fit the data equally well
- general effects of multicollinearity (\(0 < r_{12}^2 < 1\)): as \(r_{12}^2 \to 1\), the increase in explanatory ability from adding the second variable becomes insignificant
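To see the numerical effect, the sketch below (NumPy, simulated data) builds a nearly collinear design and compares the condition number of \(X^tX\) with that of a well-conditioned design:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X1 = rng.normal(size=n)
X2 = X1 + 1e-6 * rng.normal(size=n)   # nearly a copy of X1: r_12^2 close to 1

X = np.column_stack([np.ones(n), X1, X2])
cond = np.linalg.cond(X.T @ X)        # enormous: X^t X is nearly singular

X2_indep = rng.normal(size=n)         # an independent predictor instead
X_ok = np.column_stack([np.ones(n), X1, X2_indep])
cond_ok = np.linalg.cond(X_ok.T @ X_ok)
print(cond, cond_ok)
```

A large condition number means tiny round-off errors in \(X^tY\) are amplified into large errors in \(b\), which is exactly the computational symptom described above.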