Some details to tidy up Jan 23 2019

Summary of last week

Consider the linear regression model \[ Y = X\beta + \epsilon \] where \(\E{\epsilon} = 0_{n\times 1}\), \(\Var{\epsilon} = \sigma^2 I_n\), and the matrix \(X_{n \times p}\) is fixed with rank \(p\).

The least squares estimates are \[ \hat{\beta} = (X^TX)^{-1}X^TY \]

Furthermore, the least squares estimates are BLUE, and \[ \E{\hat{\beta}} = \beta, \qquad \Var{\hat{\beta}} = \sigma^2 (X^TX)^{-1} \]

We have not used any Normality assumptions to show these properties.
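As a quick numerical illustration of the summary above, here is a minimal sketch using NumPy on simulated data. The design matrix, the "true" `beta`, and `sigma` are arbitrary choices made up for the example, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: intercept plus two predictors.
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])  # arbitrary "true" coefficients
sigma = 1.5                        # arbitrary error standard deviation
Y = X @ beta + rng.normal(scale=sigma, size=n)

# Least squares estimate: solve (X^T X) b = X^T Y instead of inverting.
XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ Y)

# Var(beta_hat) = sigma^2 (X^T X)^{-1}, using the sigma known from the simulation.
var_beta_hat = sigma**2 * np.linalg.inv(XtX)

# Cross-check against NumPy's own least squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
agrees_with_lstsq = np.allclose(beta_hat, beta_lstsq)
print(beta_hat, agrees_with_lstsq)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse just to get \(\hat{\beta}\); the inverse is only computed where the variance formula needs it.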

Today

Go over the estimation of \(\sigma^2\)

Strategy: Write the residual sum of squares \(||e||^2 = \sum_i e_i^2\) as a quadratic form in the uncorrelated errors \(\epsilon_i\).

Write correlated residuals as a combination of uncorrelated errors

Claim:

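\[ ||e||^2 = \epsilon^{T}(I - H)\epsilon \]

Here \(e = Y - X\hat{\beta}\) is the vector of residuals and \(H = X(X^TX)^{-1}X^T\) is the hat matrix.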

Your turn at home:

  1. Show \((I-H)\epsilon = e\). Hint: substitute \(\epsilon = Y - X\beta\), expand and use properties of \(H\).

  2. Show \(||e||^2 = e^Te = \epsilon^T(I-H)\epsilon\). Hint: substitute in \(e = (I-H)\epsilon\) from above and use properties of \((I - H)\).
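The two identities in this exercise can be checked numerically before proving them. The sketch below assumes the usual hat matrix \(H = X(X^TX)^{-1}X^T\) and uses made-up simulated data; it is a sanity check, not a substitute for the algebra.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical small example.
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([0.5, -1.0, 2.0])
eps = rng.normal(size=n)           # errors, known here only because we simulate
Y = X @ beta + eps

H = X @ np.linalg.solve(X.T @ X, X.T)  # hat matrix (assumed definition)
M = np.eye(n) - H                      # I - H
e = Y - H @ Y                          # residuals

check_resid = np.allclose(e, M @ eps)          # e = (I - H) eps
check_norm = np.isclose(e @ e, eps @ M @ eps)  # ||e||^2 = eps^T (I - H) eps
check_sym = np.allclose(M, M.T)                # I - H is symmetric
check_idem = np.allclose(M @ M, M)             # I - H is idempotent
print(check_resid, check_norm, check_sym, check_idem)
```

The last two checks are the properties of \(I - H\) that the second hint relies on.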

Find expected value of \(||e||^2\) in terms of \(\text{trace}(I-H)\)

Show \(\E{\epsilon^T(I-H)\epsilon} = \sigma^2 \text{trace}(I-H)\)

Hint: \[ x^TAx = \sum_{i = 1}^n\sum_{j = 1}^n x_i x_j A_{ij} \] where \[ x = \left(x_1, x_2, \ldots, x_n \right)^T, \quad A = \begin{pmatrix} A_{11}& A_{12}& \ldots & A_{1n} \\ A_{21}& A_{22}& \ldots & A_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ A_{n1}& A_{n2}& \ldots & A_{nn} \end{pmatrix}_{n\times n} \]


\[ \E{\epsilon^T(I-H)\epsilon} = \phantom{\hspace{3in}} \]
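A simulation can make the target of this step concrete. The sketch below first checks the double-sum form of the quadratic form from the hint, then estimates \(\E{\epsilon^T(I-H)\epsilon}\) by Monte Carlo. The errors are deliberately drawn from a uniform distribution, an arbitrary choice made to illustrate that only mean zero, common variance and uncorrelatedness are needed. The design, `sigma`, and the number of replicates are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical design: intercept and a linear trend.
n = 10
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
H = X @ np.linalg.solve(X.T @ X, X.T)
A = np.eye(n) - H

# The hint: x^T A x written out as a double sum.
x = rng.normal(size=n)
double_sum = sum(x[i] * x[j] * A[i, j] for i in range(n) for j in range(n))
double_sum_matches = np.isclose(double_sum, x @ A @ x)

# Uniform(-a, a) has variance a^2 / 3, so a = sigma * sqrt(3) gives variance sigma^2.
sigma = 2.0
a = sigma * np.sqrt(3.0)
reps = 200_000
eps = rng.uniform(-a, a, size=(reps, n))

# Row-wise quadratic forms eps_r^T A eps_r.
quad = np.sum((eps @ A) * eps, axis=1)
mc_mean = quad.mean()
target = sigma**2 * np.trace(A)
print(double_sum_matches, mc_mean, target)
```

The Monte Carlo average should sit close to \(\sigma^2\,\text{trace}(I-H)\), up to simulation noise.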

Find \(\text{trace}(I-H)\)

Show \[\text{trace}(I-H)=n-p\]

Hint: \[ \begin{aligned} \text{trace}(A + B) &= \text{trace}(A) + \text{trace}(B) \\ \text{trace}(AB) &= \text{trace}(BA) \end{aligned} \]

\[ \text{trace}(I-H) = \phantom{\hspace{3in}} \]
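The trace properties in the hint can be seen at work numerically. This is a small sketch with a random full-rank design (an assumption for illustration), again taking \(H = X(X^TX)^{-1}X^T\).

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical design; a Gaussian random matrix has full column rank with probability 1.
n, p = 12, 4
X = rng.normal(size=(n, p))
XtX_inv = np.linalg.inv(X.T @ X)
H = X @ XtX_inv @ X.T

# trace(AB) = trace(BA): moving X to the end leaves (X^T X)^{-1} X^T X = I_p.
trace_H = np.trace(H)
trace_cyclic = np.trace(XtX_inv @ X.T @ X)

# trace(A + B) = trace(A) + trace(B), applied to I + (-H).
trace_I_minus_H = np.trace(np.eye(n) - H)
print(trace_H, trace_cyclic, trace_I_minus_H)
```

Changing `n` and `p` and rerunning shows how the result depends only on the dimensions of \(X\), not on its entries.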

Put it all together

\[ \E{\hat{\sigma}^2} = \phantom{\hspace{3in}} \]
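To see what the pieces imply together, the following sketch simulates many data sets from one fixed design and averages the variance estimate. It assumes the estimator \(\hat{\sigma}^2 = ||e||^2/(n-p)\), which is not written out in these notes, and compares it with dividing by \(n\) instead. All numerical settings are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical fixed design and parameters.
n, p = 15, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 0.0, -2.0])
sigma = 0.7
reps = 100_000

# Each row of Y is one simulated response vector.
Y = X @ beta + rng.normal(scale=sigma, size=(reps, n))

H = X @ np.linalg.solve(X.T @ X, X.T)
resid = Y - Y @ H                  # H is symmetric, so row r is (I - H) y_r
rss = np.sum(resid**2, axis=1)     # ||e||^2 for each replicate

mean_sigma2_hat = (rss / (n - p)).mean()  # assumed estimator: divide by n - p
mean_divide_by_n = (rss / n).mean()       # for comparison: divide by n
print(mean_sigma2_hat, mean_divide_by_n, sigma**2)
```

Dividing by \(n - p\) should average out near `sigma**2`, while dividing by \(n\) is systematically too small.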

Inference on the regression coefficients

Normality assumption

Assume \(\epsilon \sim N(0, \sigma^2 I)\).

Important reminders:

Leads to: \[ Y \sim N(\qquad, \qquad) \]

\[ \hat{\beta} \sim N(\qquad, \quad \qquad) \]
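The sampling distribution of \(\hat{\beta}\) under Normal errors can be explored by simulation. The sketch below compares the empirical mean and covariance of simulated estimates with \(\beta\) and \(\sigma^2 (X^TX)^{-1}\) from the summary of last week. The design and parameter values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical fixed design: intercept and one predictor on [0, 1].
n = 25
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
beta = np.array([2.0, -1.0])
sigma = 0.5
reps = 100_000

# Normal errors: each row of Y is one draw of the response vector.
Y = X @ beta + rng.normal(scale=sigma, size=(reps, n))

XtX_inv = np.linalg.inv(X.T @ X)
# Row r is beta_hat_r^T = y_r^T X (X^T X)^{-1}, since (X^T X)^{-1} is symmetric.
beta_hats = Y @ X @ XtX_inv

emp_mean = beta_hats.mean(axis=0)
emp_cov = np.cov(beta_hats, rowvar=False)
theory_cov = sigma**2 * XtX_inv
print(emp_mean, emp_cov, theory_cov, sep="\n")
```

A histogram of either column of `beta_hats` would also show the bell shape expected from a linear transformation of Normal errors.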

Inference on individual parameters

With the addition of the Normal assumption, it can be shown that

\[ \frac{\hat{\beta}_j - \beta_j}{\text{SE}(\hat{\beta}_j)} \sim t_{n-p} \]

This leads to the usual construction of tests and confidence intervals for single parameters.
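The \(t_{n-p}\) result can be checked by simulation. This sketch assumes the standard error \(\text{SE}(\hat{\beta}_j) = \sqrt{\hat{\sigma}^2\,[(X^TX)^{-1}]_{jj}}\) with \(\hat{\sigma}^2 = ||e||^2/(n-p)\), neither of which is spelled out in these notes. The critical value 2.228 is the 0.975 quantile of \(t_{10}\) taken from a standard table; everything else is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical fixed design with n - p = 10 degrees of freedom.
n, p = 12, 2
X = np.column_stack([np.ones(n), np.linspace(-1.0, 1.0, n)])
beta = np.array([1.0, 3.0])
sigma = 2.0
j = 1                  # which coefficient to study (the slope)
reps = 200_000

Y = X @ beta + rng.normal(scale=sigma, size=(reps, n))
XtX_inv = np.linalg.inv(X.T @ X)
beta_hats = Y @ X @ XtX_inv                 # one row of estimates per replicate
resid = Y - beta_hats @ X.T                 # residuals per replicate
sigma2_hat = np.sum(resid**2, axis=1) / (n - p)

se_j = np.sqrt(sigma2_hat * XtX_inv[j, j])  # assumed SE formula
t_stats = (beta_hats[:, j] - beta[j]) / se_j

df = n - p
emp_var = t_stats.var()
theory_var = df / (df - 2)                  # variance of a t distribution, df > 2

# Coverage of the interval beta_hat_j +/- 2.228 * SE, using the t_10 table value.
coverage = np.mean(np.abs(t_stats) <= 2.228)
print(emp_var, theory_var, coverage)
```

The empirical variance exceeding 1 reflects the heavier tails of the \(t\) distribution relative to the standard Normal, and the coverage should sit near 95%.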

Exercises

See handout.