Ch 7: Multiple Regression 2
In Ch 6, we saw the linear regression model with multiple variables. To address its limitations for testing regression coefficients, we introduce a new concept: the extra sum of squares.
Outline
- Extra sum of squares
- definition
- application 1: Tests
- application 2: coefficient of partial determination
- Standardized multiple regression model
- correlation transformation and standardized regression model
- multicollinearity issue
1. Extra sum of squares
definition
Recall
- \(SSTO\) can be decomposed into \(SSE\) and \(SSR\).
- When we add predictor variables to the regression model, \(SSE\) decreases while \(SSR\) increases.
Using the above idea, we define the extra sum of squares
\(SSR(X_2 \mid X_1) = SSE(X_1) - SSE(X_1, X_2)\),
the marginal reduction in unexplained variation obtained by adding \(X_2\) to a model that already contains \(X_1\). Since \(SSTO = SSR + SSE\) in both models, the equivalent form \(SSR(X_2 \mid X_1) = SSR(X_1, X_2) - SSR(X_1)\) gives the same value.
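As a quick numerical illustration, here is a minimal sketch in Python (NumPy assumed; the simulated data and the helper `sse` are made up for illustration) that computes \(SSR(X_2 \mid X_1)\) both ways and confirms they agree:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
X1, X2 = rng.normal(size=(2, n))
y = 2 + 1.5 * X1 + 0.8 * X2 + rng.normal(scale=0.5, size=n)

def sse(y, *cols):
    """SSE of an OLS fit of y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

ssto = np.sum((y - y.mean()) ** 2)
ssr_via_sse = sse(y, X1) - sse(y, X1, X2)                      # SSE(X1) - SSE(X1, X2)
ssr_via_ssr = (ssto - sse(y, X1, X2)) - (ssto - sse(y, X1))    # SSR(X1, X2) - SSR(X1)
print(ssr_via_sse, ssr_via_ssr)                                # the two values agree
```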
application 1: Tests
Test whether a single \(\beta_k=0\)
\[H_0: \beta_3 = 0\]
Full model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i\)
Reduced model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i^\ast\)
test statistic: \(\frac{(SSE(R)-SSE(F))/(df_r - df_f)}{SSE(F)/df_f}\), where \(df_r\) and \(df_f\) are the error degrees of freedom of the reduced and full models
\(= \frac{(SSE(X_1,X_2)-SSE(X_1,X_2,X_3))/(n-3 - (n-4))}{SSE(X_1,X_2,X_3)/(n-4)} = \frac{SSR(X_3|X_1,X_2)/1}{SSE(X_1,X_2,X_3)/(n-4)} = \frac{MSR(X_3|X_1,X_2)}{MSE(X_1,X_2,X_3)} \sim F(1,n-4)\)
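A minimal sketch of this partial F-test (Python with NumPy/SciPy assumed; the simulated data and the helper `sse_df` are illustrative):

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(1)
n = 60
X1, X2, X3 = rng.normal(size=(3, n))
y = 1 + 0.5 * X1 - 0.3 * X2 + rng.normal(size=n)   # beta_3 = 0 in the simulated truth

def sse_df(y, *cols):
    """SSE and error df of an OLS fit of y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r, len(y) - X.shape[1]

sse_full, df_full = sse_df(y, X1, X2, X3)    # SSE(X1, X2, X3), df = n - 4
sse_red, df_red = sse_df(y, X1, X2)          # SSE(X1, X2),     df = n - 3

F = ((sse_red - sse_full) / (df_red - df_full)) / (sse_full / df_full)
p = f.sf(F, df_red - df_full, df_full)       # reference distribution F(1, n-4)
print(F, p)
```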
Test whether several \(\beta_k=0\)
\[H_0: \beta_2 = \beta_3 = 0\]
Full model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i\)
Reduced model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \epsilon_i^\ast\)
test statistic: \(\frac{(SSE(X_1)-SSE(X_1,X_2,X_3))/(n-2 - (n-4))}{SSE(X_1,X_2,X_3)/(n-4)}\)
\(= \frac{SSR(X_2, X_3|X_1)/2}{SSE(X_1,X_2,X_3)/(n-4)} \newline = \frac{MSR(X_2,X_3|X_1)}{MSE(X_1,X_2,X_3)} \sim F(2,n-4)\)
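The same kind of sketch works for dropping several coefficients at once; the only changes are the reduced model and the numerator degrees of freedom (again with illustrative simulated data):

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(2)
n = 60
X1, X2, X3 = rng.normal(size=(3, n))
y = 1 + 0.5 * X1 + rng.normal(size=n)        # beta_2 = beta_3 = 0 in the simulated truth

def sse_df(y, *cols):
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r, len(y) - X.shape[1]

sse_full, df_full = sse_df(y, X1, X2, X3)    # SSE(X1, X2, X3), df = n - 4
sse_red, df_red = sse_df(y, X1)              # SSE(X1),         df = n - 2

F = ((sse_red - sse_full) / (df_red - df_full)) / (sse_full / df_full)
p = f.sf(F, df_red - df_full, df_full)       # reference distribution F(2, n-4)
print(F, p)
```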
Test whether the slopes of \(X_k\) and \(X_l\) are the same (here \(X_1\) and \(X_2\))
\[H_0: \beta_1 = \beta_2 = \beta_c\]
Full model) \(Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \epsilon_i\)
Reduced model) \(Y_i = \beta_0 + \beta_c (X_{i1} + X_{i2}) + \beta_3 X_{i3} + \epsilon_i^\ast\) Consider \(X_{i1} + X_{i2}\) as a single new variable.
test statistic: \(\frac{(SSE(X_1+X_2,\,X_3)-SSE(X_1,X_2,X_3))/1}{SSE(X_1,X_2,X_3)/(n-4)} \sim F(1,n-4)\), where \(SSE(X_1+X_2,\,X_3)\) denotes the SSE of the reduced model with the combined predictor
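A sketch of the equal-slopes test under the same assumptions (illustrative simulated data; the reduced fit simply uses \(X_1 + X_2\) as one predictor):

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(3)
n = 60
X1, X2, X3 = rng.normal(size=(3, n))
y = 1 + 0.7 * X1 + 0.7 * X2 - 0.4 * X3 + rng.normal(size=n)   # equal slopes in the simulated truth

def sse_df(y, *cols):
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r, len(y) - X.shape[1]

sse_full, df_full = sse_df(y, X1, X2, X3)    # full model, df = n - 4
sse_red, df_red = sse_df(y, X1 + X2, X3)     # X1 + X2 treated as one predictor, df = n - 3

F = ((sse_red - sse_full) / (df_red - df_full)) / (sse_full / df_full)
p = f.sf(F, df_red - df_full, df_full)       # reference distribution F(1, n-4)
print(F, p)
```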
application 2: coefficient of partial determination
We can also calculate an \(R^2\)-type measure for the marginal contribution of \(X_2\) given that \(X_1\) is already in the model, the coefficient of partial determination: \(R_{2|1}^2 = \frac{SSR(X_2 \mid X_1)}{SSE(X_1)}\). It satisfies \(0 \leq R_{2|1}^2 \leq 1\), and a large value implies a large marginal contribution.
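Computed on illustrative data (Python with NumPy assumed; the simulated data and helper `sse` are made up), the coefficient of partial determination is one extra line on top of the SSE fits:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
X1, X2 = rng.normal(size=(2, n))
y = 1 + X1 + 0.5 * X2 + rng.normal(size=n)

def sse(y, *cols):
    """SSE of an OLS fit of y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

# R^2_{2|1} = SSR(X2 | X1) / SSE(X1)
r2_partial = (sse(y, X1) - sse(y, X1, X2)) / sse(y, X1)
print(r2_partial)        # always between 0 and 1
```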
2. Standardized multiple regression model
When the predictor variables are poorly scaled or strongly correlated with one another (multicollinearity), \(\det(X^tX)\) can be close to 0, which makes the least squares computation numerically unstable. There is a remedy for this problem.
correlation transformation and standardized regression model
correlation transformation
Let \(n\) be the number of observations. We start from the standardized variables \(\frac{Y_i-\bar{Y}}{s_Y}\) and \(\frac{X_{ik}-\bar{X_k}}{s_{X_k}}\), where \(s_Y^2 = \frac{\sum_i (Y_i -\bar{Y})^2}{n-1}\) and \({s_{X_k}}^2 = \frac{\sum_i (X_{ik} -\bar{X_k})^2}{n-1}\). The correlation-transformed variables \(Y_i^\ast\) and \(X_{ik}^\ast\) are simple functions of these:
\(Y_i^\ast = \frac{1}{\sqrt{n-1}} \frac{Y_i-\bar{Y}}{s_Y} = \frac{Y_i-\bar{Y}}{\sqrt{\sum_i (Y_i -\bar{Y})^2}}\)
\(X_{ik}^\ast = \frac{1}{\sqrt{n-1}} \frac{X_{ik}-\bar{X_k}}{s_{X_k}} = \frac{X_{ik}-\bar{X_k}}{\sqrt{\sum_i (X_{ik} -\bar{X_k})^2}}\)
standardized regression model \(Y_i^\ast = \beta_1^\ast X_{i1}^\ast + \cdots + \beta_{p-1}^\ast X_{i,p-1}^\ast + \epsilon_i^\ast\)
Note
- \[\beta_0^\ast = 0\]
- \[\beta_0 = \bar{Y} - \beta_1 \bar{X_1} - \cdots - \beta_{p-1} \bar{X_{p-1}}\]
- \[\beta_k = \frac{s_Y}{s_{X_k}} \beta_k^\ast, k=1,2,\cdots, p-1\]
Property of the correlation matrix of the \(X\) variables: let \(corr(X_i,X_j) = r_{ij}\).
\((X^\ast)^t X^\ast = r_{XX}\), the correlation matrix of the predictors, and the normal equations of the standardized model become \(r_{XX} b^\ast = r_{YX}\) \(\Rightarrow b^\ast = r_{XX}^{-1} r_{YX}\), where \(r_{YX}\) is the vector of correlations between \(Y\) and each predictor.
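A sketch verifying these relations numerically (Python with NumPy assumed; the data, scales, and coefficients are made up): the standardized coefficients \(b^\ast\) are obtained from the correlation matrices and then back-transformed via \(\beta_k = \frac{s_Y}{s_{X_k}} \beta_k^\ast\):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X = rng.normal(loc=[10.0, -3.0], scale=[4.0, 0.5], size=(n, 2))   # predictors on very different scales
y = 5 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

def corr_transform(v):
    """Correlation transformation: center, divide by the sample sd, then by sqrt(n-1)."""
    return (v - v.mean(axis=0)) / (np.sqrt(len(v) - 1) * v.std(axis=0, ddof=1))

Xs, ys = corr_transform(X), corr_transform(y)

# The standardized model has no intercept; its normal equations are r_XX b* = r_YX
r_XX = Xs.T @ Xs            # correlation matrix of the predictors
r_YX = Xs.T @ ys            # correlations between Y and each X_k
b_star = np.linalg.solve(r_XX, r_YX)

# Back-transform: beta_k = (s_Y / s_{X_k}) * beta_k*, and beta_0 from the means
b = (y.std(ddof=1) / X.std(axis=0, ddof=1)) * b_star
b0 = y.mean() - b @ X.mean(axis=0)
print(b0, b)                # close to the true (5, 2.0, -1.5)
```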
multicollinearity issue
Why do we need to avoid multicollinearity among variables?
When we solve the normal equations \(X^tX b = X^tY\), the condition \(\det(X^tX) \approx 0\) makes the system nearly singular. This leads to large round-off error during computation, and hence to severe error in \(b\).
How can we tell whether the predictor variables are correlated among themselves?
- uncorrelated \(X_1\) and \(X_2\) (\(r_{12}^2=0\)): \(SSR(X_1 \mid X_2) = SSR(X_1)\) and \(SSR(X_2 \mid X_1) = SSR(X_2)\); each predictor's extra sum of squares equals its marginal one.
- perfectly correlated \(X_1\) and \(X_2\) (\(r_{12}^2=1\)): \((X^tX)^{-1}\) does not exist \(\Rightarrow\) infinitely many regression surfaces fit the data equally well.
- general effects of multicollinearity (\(0 < r_{12}^2 < 1\)): as \(r_{12}^2 \to 1\), the increase in explanatory power from adding the second predictor becomes negligible (see the numerical sketch below).
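A small numerical sketch of these effects (Python with NumPy assumed; the data-generating setup is made up): with nearly collinear predictors, \(\det(X^tX)\) collapses toward 0 and the estimate of \(\beta_1\) varies wildly across repeated samples:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100

def fit_once(collinear):
    X1 = rng.normal(size=n)
    # X2 is either roughly independent of X1 or almost an exact copy of it
    X2 = X1 + rng.normal(scale=0.01 if collinear else 1.0, size=n)
    y = 1 + X1 + X2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), X1, X2])
    b = np.linalg.solve(X.T @ X, X.T @ y)        # normal equations X'X b = X'y
    return np.linalg.det(X.T @ X), b[1]          # det(X'X) and the estimate of beta_1

for collinear in (False, True):
    results = [fit_once(collinear) for _ in range(5)]
    dets = [d for d, _ in results]
    b1s = [b for _, b in results]
    print(f"collinear={collinear}: det(X'X) ~ {np.mean(dets):.3g}, "
          f"b1 estimates = {np.round(b1s, 2)}")
# With nearly collinear predictors the determinant collapses and the b1
# estimates swing from sample to sample, even though each fit looks fine.
```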