Ch 9: Model selection and validation
When we have many candidate predictor variables, we want a model that is
- causing small error, which can be estimated by $SSE$ (or $MSE$),
- simple enough to perform well with new data.
This chapter introduces several criteria for choosing a good linear regression model. Some of them can only be used (or have only been proved) for linear regression, while others apply to general models.
Outline
- Criteria for model selection
  - coefficient of multiple determination, $R^2_p$
  - adjusted coefficient of multiple determination, $R^2_{a,p}$
  - Mallows' $C_p$
  - $AIC_p$ and $SBC_p$
  - Prediction sum of squares, $PRESS_p$
- Model validation
1. Criteria for model selection
The criteria for model selection below favor a model with
- small error (small $SSE_p$),
- a small number of parameters.
coefficient of multiple determination, $R^2_p$
Larger is better. It can be used in general models. Drawback: it always picks the most complex model, since adding a predictor variable never decreases $R^2_p$.
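For reference, the standard definition, where $SSE_p$ is the error sum of squares of the candidate model with $p$ parameters and $SSTO$ is the total sum of squares:

$$R^2_p = 1 - \frac{SSE_p}{SSTO}.$$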
adjusted coefficient of multiple determination, $R^2_{a,p}$
An adjusted version of $R^2_p$ that accounts for the number of parameters, so it does not automatically increase when a predictor variable is added. Larger is better.
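For reference, the usual definition with $n$ cases and $p$ parameters:

$$R^2_{a,p} = 1 - \left(\frac{n-1}{n-p}\right)\frac{SSE_p}{SSTO} = 1 - \frac{MSE_p}{SSTO/(n-1)}.$$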
Mallows' $C_p$
Consider the mean squared error of the fitted value $\hat Y_i$ with respect to the true mean response $\mu_i = E(Y_i)$:

$$E(\hat Y_i - \mu_i)^2 = \operatorname{Var}(\hat Y_i) + \left[E(\hat Y_i) - \mu_i\right]^2.$$

- $\operatorname{Var}(\hat Y_i)$: random error caused by random noise (it cannot be handled by the model itself).
- $E(\hat Y_i) - \mu_i$: bias of the model.

We have a criterion measure, the total mean squared error of the $n$ fitted values relative to $\sigma^2$:

$$\Gamma_p = \frac{1}{\sigma^2}\sum_{i=1}^{n}\left\{\left[E(\hat Y_i) - \mu_i\right]^2 + \operatorname{Var}(\hat Y_i)\right\}.$$

There are two problems:
1. We are not aware of $\sigma^2$ in general.
2. We cannot obtain $E(\hat Y_i) - \mu_i$.

Let us have the full model with all $P-1$ candidate predictor variables.
- To settle 1, we use $MSE(X_1,\dots,X_{P-1})$ (the $MSE$ of the full model) rather than $\sigma^2$.
- To resolve 2, we use the formula
$$\sum_{i=1}^{n}\left[E(\hat Y_i) - \mu_i\right]^2 = E(SSE_p) - (n-p)\sigma^2.$$
(You may prove this using the fact that $\sum_{i=1}^{n}\operatorname{Var}(Y_i - \hat Y_i) = (n-p)\sigma^2$.)

Therefore,
$$\Gamma_p = \frac{E(SSE_p) - (n-p)\sigma^2 + \sum_{i=1}^{n}\operatorname{Var}(\hat Y_i)}{\sigma^2}.$$
Since $\sum_{i=1}^{n}\operatorname{Var}(\hat Y_i) = p\sigma^2$,
$$\Gamma_p = \frac{E(SSE_p)}{\sigma^2} - (n - 2p).$$

Now we substitute $SSE_p$ for $E(SSE_p)$ and $MSE(X_1,\dots,X_{P-1})$ for $\sigma^2$ to obtain the estimator

$$C_p = \frac{SSE_p}{MSE(X_1,\dots,X_{P-1})} - (n - 2p).$$

Note that when there is no bias in the model with $p-1$ predictor variables, $E(C_p) \approx p$.

By the way, $C_P = P$ for the full model, since $SSE_P/MSE(X_1,\dots,X_{P-1}) = n - P$.

Therefore, a small $C_p$ (small total mean squared error) that is also close to $p$ (little bias) indicates a good model.
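A minimal sketch in base R of computing $C_p$ for one candidate subset, assuming a hypothetical data frame `dat` with response `y` and candidate predictors `x1, ..., x4`:

```r
# Hypothetical data: response y and candidate predictors x1..x4 in data frame `dat`
full <- lm(y ~ x1 + x2 + x3 + x4, data = dat)  # full model with all P - 1 = 4 predictors
sub  <- lm(y ~ x1 + x2, data = dat)            # candidate subset model

n        <- nrow(dat)
p        <- length(coef(sub))                  # number of parameters in the subset model
SSE_p    <- sum(residuals(sub)^2)
MSE_full <- sum(residuals(full)^2) / (n - length(coef(full)))

Cp <- SSE_p / MSE_full - (n - 2 * p)           # compare Cp with p
Cp
```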
$AIC_p$ and $SBC_p$
Smaller is better. They are simple criteria.
Note that $SBC_p$ penalizes the number of parameters more heavily than $AIC_p$ once $\ln n > 2$ (i.e., $n \ge 8$), so it tends to favor smaller models.
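For reference, one common form of these criteria for a model with $p$ parameters fitted to $n$ cases:

$$AIC_p = n\ln SSE_p - n\ln n + 2p, \qquad SBC_p = n\ln SSE_p - n\ln n + p\ln n.$$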
Prediction sum of squares, $PRESS_p$
Smaller is better. It can be used in general models. However, if the number of predictor variables is large, the computation is heavy. Therefore, we do not go through all the candidate models, but use a stepwise selection method.
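For reference, the usual definition, where $\hat Y_{i(i)}$ is the fitted value for case $i$ when the model is fitted with case $i$ deleted:

$$PRESS_p = \sum_{i=1}^{n}\left(Y_i - \hat Y_{i(i)}\right)^2.$$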
Stepwise selection method
The stepwise selection method is a greedy approach to solving an optimization problem. At each stage, it calculates the criterion function values of all the states that can be reached in one step, and it chooses the path to the best state at each step.
- It doesn’t guarantee obtaining the global optimum.
- But it is the easiest approach to handling optimization problems (computationally, or for building an algorithm to solve them).
- It significantly reduces the computational load.
- And it gives a quite reasonable solution in a limited time.
Here, we can use the forward, backward, or both-direction stepwise method, which fits only on the order of $P^2$ models rather than all $2^{P-1}$ candidate models.
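A minimal sketch of a both-direction stepwise search in base R using `step()`, which is guided by AIC by default; the data frame `dat` and predictors `x1, ..., x4` are the same hypothetical names as above:

```r
# Hypothetical data: response y and candidate predictors x1..x4 in data frame `dat`
null <- lm(y ~ 1, data = dat)   # intercept-only starting model

# Greedy add/drop steps between the intercept-only and full models, guided by AIC
best <- step(null,
             scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3 + x4),
             direction = "both",
             trace = 0)          # trace = 1 prints each step
summary(best)
```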
Remark: Generally, computing $PRESS_p$ appears to require refitting the model $n$ times (once with each case deleted). In linear regression, however, the deleted residual can be obtained from a single fit as $Y_i - \hat Y_{i(i)} = e_i/(1 - h_{ii})$, where $e_i$ is the ordinary residual and $h_{ii}$ is the $i$-th diagonal element of the hat matrix.
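A minimal sketch of this shortcut in base R for a fitted `lm` object (here the hypothetical `sub` model from the $C_p$ example):

```r
# PRESS from a single fit: the deleted residual equals e_i / (1 - h_ii)
e     <- residuals(sub)         # ordinary residuals
h     <- hatvalues(sub)         # diagonal elements of the hat matrix
PRESS <- sum((e / (1 - h))^2)   # prediction sum of squares
PRESS
```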
2. Model validation
To find a model that performs well in prediction, we check a candidate model against independent data: data the model has not seen, uncorrelated with the training data, but drawn from the same population. The steps are
- Divide the given dataset into two non-overlapping sets, e.g. training data : validation data = 7:3 or 6:4.
- Fit the model to the training data and obtain the parameter estimates $b_0, b_1, \dots, b_{p-1}$.
- Calculate the mean squared prediction error
$$MSPR = \frac{\sum_{i=1}^{n^*}\left(Y_i - \hat Y_i\right)^2}{n^*},$$
where $Y_i$ is the observed response for the $i$-th validation case, $\hat Y_i$ is the predicted value for that case from the model fitted to the training data, and $n^*$ is the number of validation cases (see the sketch below).
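A minimal sketch of this procedure in base R, again with the hypothetical data frame `dat`:

```r
set.seed(1)                                              # reproducible split
n_total <- nrow(dat)
idx     <- sample(n_total, size = round(0.7 * n_total))  # 7:3 split
train   <- dat[idx, ]
valid   <- dat[-idx, ]

fit  <- lm(y ~ x1 + x2, data = train)           # fit the candidate model on training data
pred <- predict(fit, newdata = valid)           # predictions for the validation cases

MSPR <- sum((valid$y - pred)^2) / nrow(valid)   # mean squared prediction error
MSPR
```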
Here is the Jupyter notebook script to run several practice examples using R.