Today
- Matrix warmup
- Multiple Linear Regression
- Matrix setup
Matrix warmup
See handout
Simple linear regression
Recall in simple linear regression:
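Have $n$ observations of a response, $y_i$, and a single explanatory variable, $x_i$, for $i = 1, \ldots, n$.
The response is related to the explanatory variable by:
$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad i = 1, \ldots, n$$
where the $\epsilon_i$ are independent and identically distributed with expected value 0 and variance $\sigma^2$.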
Multiple linear regression
Now we have more than one explanatory variable.
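Have $n$ observations of a response, $y_i$, and a set of explanatory variables, $x_{i1}, x_{i2}, \ldots, x_{i(p-1)}$.
The response is related to the explanatory variables by:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \beta_{p-1} x_{i(p-1)} + \epsilon_i, \quad i = 1, \ldots, n$$
where the $\epsilon_i$ are independent and identically distributed with expected value 0 and variance $\sigma^2$.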
Example: Galápagos Islands
Faraway 2.6
Measurements on 30 Galápagos Islands are made.
First 5 islands:
Island | Species | Area | Elevation | Nearest | Scruz | Adjacent |
---|---|---|---|---|---|---|
Baltra | 58 | 25.09 | 346 | 0.6 | 0.6 | 1.84 |
Bartolome | 31 | 1.24 | 109 | 0.6 | 26.3 | 572.3 |
Caldwell | 3 | 0.21 | 114 | 2.8 | 58.7 | 0.78 |
Champion | 25 | 0.1 | 46 | 1.9 | 47.4 | 0.18 |
Coamano | 2 | 0.05 | 77 | 1.9 | 1.9 | 903.8 |
Variable Descriptions
?gala
gala (R Documentation)
Species diversity on the Galapagos Islands
Format
The dataset contains the following variables:
- Species: the number of plant species found on the island
- Endemics: the number of endemic species
- Area: the area of the island (km²)
- Elevation: the highest elevation of the island (m)
- Nearest: the distance from the nearest island (km)
- Scruz: the distance from Santa Cruz island (km)
- Adjacent: the area of the adjacent island (km²)
A possible model
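$$\text{Species}_i = \beta_0 + \beta_1 \text{Area}_i + \beta_2 \text{Elevation}_i + \beta_3 \text{Nearest}_i + \beta_4 \text{Scruz}_i + \beta_5 \text{Adjacent}_i + \epsilon_i, \quad i = 1, \ldots, 30$$
E.g. $i = 1$, Baltra:
$$58 = \beta_0 + \beta_1 (25.09) + \beta_2 (346) + \beta_3 (0.6) + \beta_4 (0.6) + \beta_5 (1.84) + \epsilon_1$$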
Your turn:
- What does $i$ index?
- What is the value of $n$?
- What is the value of $p$?
General matrix form
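$$y = X\beta + \epsilon$$
where
$$
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
X = \begin{pmatrix}
1 & x_{11} & x_{12} & \ldots & x_{1(p-1)} \\
1 & x_{21} & x_{22} & \ldots & x_{2(p-1)} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \ldots & x_{n(p-1)}
\end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{pmatrix}, \quad
\epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}
$$
so $y$ and $\epsilon$ are $n \times 1$, $X$ is $n \times p$, and $\beta$ is $p \times 1$.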
Galápagos: Matrix form
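As a rough sketch, assuming the model uses all five explanatory variables shown in the table (Area, Elevation, Nearest, Scruz, Adjacent), the response vector and design matrix for the first five islands could be built in R as below. The data frame name `gala5` is made up for this illustration, and the values are typed in from the table of the first five islands rather than loaded from the `faraway` package.

```r
# First five islands, copied from the table of the first five islands
# (Endemics is omitted, as it is in that table).
gala5 <- data.frame(
  Species   = c(58, 31, 3, 25, 2),
  Area      = c(25.09, 1.24, 0.21, 0.10, 0.05),
  Elevation = c(346, 109, 114, 46, 77),
  Nearest   = c(0.6, 0.6, 2.8, 1.9, 1.9),
  Scruz     = c(0.6, 26.3, 58.7, 47.4, 1.9),
  Adjacent  = c(1.84, 572.3, 0.78, 0.18, 903.8),
  row.names = c("Baltra", "Bartolome", "Caldwell", "Champion", "Coamano")
)

# Response vector y: one entry per island
y <- gala5$Species

# Design matrix X: a column of 1s for the intercept,
# then one column per explanatory variable in the assumed model
X <- model.matrix(Species ~ Area + Elevation + Nearest + Scruz + Adjacent,
                  data = gala5)

X        # print the design matrix
dim(X)   # rows = islands, columns = parameters (intercept + 5 slopes)
```

With all 30 islands the same call would give a design matrix with 30 rows; only the number of rows changes, not the number of columns.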
Your Turn
Write out the design matrix, $X$, for the following models, using the data for the first five islands:
where $1\{\cdot\}$ is an indicator variable that takes the value 1 when the condition in its argument is true, and 0 otherwise.
Fitted values and residuals
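If we had an estimate for the $\beta$ vector, $\hat{\beta}$,
then we can define fitted value and residual vectors:
$$\hat{y} = X\hat{\beta}, \qquad \hat{\epsilon} = y - \hat{y} = y - X\hat{\beta}$$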
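To make the fitted value and residual vectors concrete, here is a small R sketch using a deliberately simple, assumed model with Elevation as the only explanatory variable. The numbers in `beta_hat` are invented for illustration; they are not estimates from any fit. The data values are the Species and Elevation columns for the first five islands.

```r
# Response and one explanatory variable for the first five islands
y         <- c(58, 31, 3, 25, 2)       # Species
elevation <- c(346, 109, 114, 46, 77)  # Elevation (m)

# Design matrix for the assumed model
#   Species_i = beta_0 + beta_1 * Elevation_i + epsilon_i
X <- cbind(1, elevation)

# Hypothetical estimate of (beta_0, beta_1): made-up values, NOT a real fit
beta_hat <- c(10, 0.1)

y_hat <- X %*% beta_hat   # fitted values: X times beta_hat
e_hat <- y - y_hat        # residuals: observed minus fitted

# Observed, fitted, and residual values in columns, one row per island
cbind(y, y_hat, e_hat)
```

Whatever value is plugged in for `beta_hat`, the fitted values and residuals always add back up to the observed response.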
Questions to answer this week:
- How will we find $\hat{\beta}$?
- What properties do the estimates have?