Homework #1
I graded the initial data analysis.
- Everyone was looking at the right things!
- But, the writeups could use some improvement
- HW #3 gets you to repeat this process on a different data set
Homework #3
I’ve posted an example with some guidelines as 01-initial-data-analysis-report, but I started from 01-initial-data-analysis-draft.
Key things I’ll be looking for in HW #3:
- < 2 pages (notice my draft is 10 pages, but the report is only 1.5 pages)
- you control what output/code is in the final version
- plots are labelled and sized appropriately
- narrative leads the reader through important findings
Today
- The F-test
- Practice with F-tests
Motivation
t-tests on individual parameters only allow us to ask a limited number of questions.
To ask questions about more than one coefficient we need something more complicated.
F-tests do this by comparing nested models. In practice, the hard part is translating a scientific question into a comparison of two models.
F-test
Let $\Omega$ denote a larger model of interest with $p$ parameters
and $\omega$ a smaller model that represents some simplification of $\Omega$ with $q$ parameters.
Intuition: If both models “fit” as well as each other, we should prefer the simpler model, $\omega$. If $\Omega$ shows substantially better fit than $\omega$, that suggests the simplification is not justified.
How do we measure fit? What is substantially better fit?
F-statistic
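Null hypothesis: $H_0$: the simplification to $\Omega$ implied by the simpler model, $\omega$.
F-statistic:
$$F = \frac{(\text{RSS}_\omega - \text{RSS}_\Omega)/(p - q)}{\text{RSS}_\Omega/(n - p)}$$
where $\text{RSS}_\Omega$ and $\text{RSS}_\omega$ are the residual sums of squares of the larger model $\Omega$ ($p$ parameters) and the smaller model $\omega$ ($q$ parameters), and $n$ is the number of observations.
Under the null hypothesis, the F-statistic has an F-distribution with $p - q$ and $n - p$ degrees of freedom.
Leads to tests of the form: reject $H_0$ for $F > F^{(\alpha)}_{p-q,\,n-p}$, the upper $\alpha$ critical value of that F-distribution.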
Deriving this fact is beyond this class (take Linear Models).
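To make the comparison of nested models concrete, here is a minimal sketch in Python. It is not the course's own code: the helper names `rss` and `nested_f_test` are hypothetical, the data are simulated purely for illustration, and the design matrices are assumed to have full column rank. The statistic is the usual one, the drop in residual sum of squares per parameter removed, divided by the larger model's residual variance estimate.

```python
import numpy as np
from scipy import stats


def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    return float(resid @ resid)


def nested_f_test(X_small, X_large, y):
    """F-test of a smaller model nested in a larger one (hypothetical helper)."""
    n = len(y)
    q = X_small.shape[1]  # number of parameters in the smaller model
    p = X_large.shape[1]  # number of parameters in the larger model
    rss_small = rss(X_small, y)
    rss_large = rss(X_large, y)
    f_stat = ((rss_small - rss_large) / (p - q)) / (rss_large / (n - p))
    # Upper tail probability of the F distribution with (p - q, n - p) df
    p_value = stats.f.sf(f_stat, p - q, n - p)
    return f_stat, p_value


# Simulated data (made up for illustration): y depends on x1 but not on x2.
rng = np.random.default_rng(1)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

X_large = np.column_stack([np.ones(n), x1, x2])  # intercept, x1, x2
X_small = np.ones((n, 1))                        # intercept only

f_stat, p_value = nested_f_test(X_small, X_large, y)
df1 = X_large.shape[1] - X_small.shape[1]
df2 = n - X_large.shape[1]
print(f"F = {f_stat:.2f} on {df1} and {df2} df, p-value = {p_value:.3g}")
```

The smaller design matrix here is built from a subset of the larger one's columns, which is the simplest way to guarantee the two models are nested.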
Example: Overall regression F-test
The overall regression F-test asks if any predictors are related to the response.
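Full model: $y = X\beta + \epsilon$, where $X$ is the $n \times p$ design matrix (an intercept plus $p - 1$ predictors)
Reduced model: $y = \mu + \epsilon$ (intercept only)
Null hypothesis: $H_0: \beta_1 = \beta_2 = \dots = \beta_{p-1} = 0$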
All the parameters (other than the intercept) are zero.
Alternative hypothesis: At least one parameter is non-zero.
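For the overall regression F-test, the reduced model predicts every response by the sample mean, so its residual sum of squares is just the total sum of squares about the mean. A sketch of the resulting statistic, assuming the full model has $p$ parameters (counting the intercept) and $n$ observations; the symbols TSS and RSS are introduced here for illustration:

$$\text{TSS} = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad \text{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

$$F = \frac{(\text{TSS} - \text{RSS})/(p - 1)}{\text{RSS}/(n - p)}$$

Under the null hypothesis this has an F-distribution with $p - 1$ and $n - p$ degrees of freedom.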
Exercise: question #1 on handout
If there is evidence against the null hypothesis:
- The null is not true, or
- the null is true but we got unlucky, or
- the full model isn’t true and the F-test is meaningless.
If there is no evidence against the null hypothesis:
- The null is true, or
- the null is false but we didn’t gather enough evidence to reject it, or
- the full model isn’t true and the F-test is meaningless.
Example: One predictor
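Null hypothesis: $H_0: \beta_i = 0$
Equivalent to the t-test, reject null if $|t_i| > t^{(\alpha/2)}_{n-p}$, where $t_i = \hat{\beta}_i / \text{se}(\hat{\beta}_i)$
In fact, in this case, $F = t_i^2$.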
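The equivalence between the one-parameter F-test and the t-test can be checked numerically. The following is a sketch on simulated data (the variable names and the data are made up for illustration): it computes the t-statistic for one coefficient from the full fit, then the F-statistic from comparing the full model with the model that drops that predictor.

```python
import numpy as np

# Simulated data (made up for illustration)
rng = np.random.default_rng(2)
n = 40
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 0.5 + 1.5 * x1 + 0.3 * x2 + rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x1, x2])  # intercept, x1, x2
X_reduced = X_full[:, :2]                       # same model without x2
p = X_full.shape[1]

# t-statistic for the coefficient on x2 in the full model
beta_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
resid_full = y - X_full @ beta_full
rss_full = resid_full @ resid_full
sigma2_hat = rss_full / (n - p)                      # residual variance estimate
cov_beta = sigma2_hat * np.linalg.inv(X_full.T @ X_full)
t_stat = beta_full[2] / np.sqrt(cov_beta[2, 2])

# F-statistic comparing the reduced model with the full model
beta_reduced, *_ = np.linalg.lstsq(X_reduced, y, rcond=None)
resid_reduced = y - X_reduced @ beta_reduced
rss_reduced = resid_reduced @ resid_reduced
f_stat = ((rss_reduced - rss_full) / 1) / (rss_full / (n - p))

print(f"t^2 = {t_stat**2:.6f}, F = {f_stat:.6f}")
```

The two printed values agree up to floating-point error, whatever the simulated data happen to be.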
Exercise: questions #2 & #3 on handout
Other examples
- More than one parameter
- A subspace of the parameter space
Exercise: questions #4 & #5 on handout
We can’t do F-tests when
- we want to test non-linear hypotheses, e.g. $\beta_1 \beta_2 = 1$ (we might be able to make use of the Delta method, though)
- we want to compare non-nested models (find an example on the handout)
- the models are fit using different data (most often comes up when a variable of interest has some missing values)
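The missing-value case is easy to hit by accident: a model without the incomplete variable can use every row, while the model with it silently uses fewer. One common workaround, sketched below on simulated data, is to restrict both fits to the rows that are complete for the larger model. This is an illustration under made-up data and names, not necessarily the approach this course recommends.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares from a least-squares fit of y on the columns of X."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_hat
    return float(resid @ resid)


# Simulated data (made up for illustration); x2 is missing for five rows.
rng = np.random.default_rng(3)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + x1 + rng.normal(size=n)
x2[:5] = np.nan

# Fitted naively, the two models would see different numbers of rows.
rows_small_model = n
rows_large_model = int((~np.isnan(x2)).sum())
print("rows available:", rows_small_model, "vs", rows_large_model)

# Keep only rows that are complete for every variable in the larger model.
keep = ~np.isnan(x1) & ~np.isnan(x2) & ~np.isnan(y)
n_cc = int(keep.sum())
X_small = np.column_stack([np.ones(n_cc), x1[keep]])
X_large = np.column_stack([np.ones(n_cc), x1[keep], x2[keep]])
y_cc = y[keep]

# Both models now use the same data, so the F-statistic is well defined.
rss_small = rss(X_small, y_cc)
rss_large = rss(X_large, y_cc)
p, q = X_large.shape[1], X_small.shape[1]
f_stat = ((rss_small - rss_large) / (p - q)) / (rss_large / (n_cc - p))
print(f"F = {f_stat:.3f} on {p - q} and {n_cc - p} df")
```

Dropping rows changes which observations the conclusions apply to, so it is worth reporting how many were removed.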