# Randomization/Permutation tests

Feb 13 2019

## Randomization Test

The F-tests (and t-tests) rely on the Normal error assumption.

In a randomized experiment, the randomization itself provides a basis for inference (no i.i.d. sampling from populations required), which leads to randomization tests.

The same procedure can be used in observational studies, under the assumption that nature ran the experiment for you, i.e. it is as if units were assigned to values of the explanatory variables at random.

Some people call randomization tests applied to observational data permutation tests.

What are the key ingredients in a hypothesis test?

## Randomization Test

(Overall F-test example)

Model: Randomized experiment

Null: Treatments have no effect on response

Test statistic: (Up to us) Let’s use the overall regression F-statistic.

Null distribution: Randomization distribution of the test statistic.

## The randomized experiment model

$$n$$ experimental units $u_1 \quad u_2 \quad \ldots \quad u_n$

$X = \begin{pmatrix} 1 & x_{11} & \ldots & x_{1(p-1)} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \ldots & x_{n(p-1)} \end{pmatrix}$ $y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$

$$y_i$$ is the observed response for the unit that was randomly assigned to the $$i$$th row of the design matrix.

## Example: Growing tomatoes

$$n$$ experimental units

$X = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix}$ $y = \begin{pmatrix} 8 \\ 6 \\ 2 \\ 9 \\ 7 \\ 4 \end{pmatrix}$

## Null distribution

If the null is true, treatments have no effect on response.

$y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$

$$y_i$$ is the observed response for the unit that was randomly assigned to the $$i$$th row of the design matrix.

If the null is true, I would see the same set of $$y_i$$ whatever the assignment, just in a different order determined by the outcome of randomizing units to treatments.

Null distribution: the distribution of the test statistic over all permutations of the $$y_i$$.
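
One way to write this down, as a sketch: let $T(y)$ be the test statistic computed from the response vector $y$ with the design matrix $X$ held fixed, and let $y_\pi$ be $y$ reordered by a permutation $\pi$ of $\{1, \ldots, n\}$. The symbols $T$ and $\pi$ are notation introduced here for illustration, not taken from the slides. Assuming every ordering is equally likely under the randomization, and that large values of $T$ count as evidence against the null (as with the F-statistic), the randomization p-value would be

$$p = \frac{1}{n!} \sum_{\pi} \mathbf{1}\{T(y_\pi) \ge T(y)\}$$

where the sum runs over all $n!$ permutations and the identity permutation (the observed data) is included.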

## An equally likely outcome of the tomato growing experiment

$$n$$ experimental units

$X = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{pmatrix}$ $y = \begin{pmatrix} 6 \\ 8 \\ 2 \\ 9 \\ 7 \\ 4 \end{pmatrix}$

## The null distribution

Observed data gives overall F-statistic: 0.132

Equally likely outcomes under the null hypothesis: $\begin{pmatrix} 8 \\ 6 \\ 2 \\ 9 \\ 7 \\ 4 \end{pmatrix}, \begin{pmatrix} 6 \\ 8 \\ 2 \\ 9 \\ 7 \\ 4 \end{pmatrix}, \begin{pmatrix} 2 \\ 6 \\ 8 \\ 9 \\ 7 \\ 4 \end{pmatrix}, \begin{pmatrix} 6 \\ 2 \\ 8 \\ 9 \\ 7 \\ 4 \end{pmatrix}, \quad \text{ +716 other possibilities}$

Equally likely F-statistics under the null hypothesis:

$0.132, \,0.244, \,5.534, \,0.244, \,\ldots$
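
The four F-statistics above can be reproduced with a few lines of base R. This is a minimal sketch: the names `x1`, `x2`, and `overall_f` are made up here, with `x1` and `x2` standing for the second and third columns of the tomato design matrix.

```r
# Tomato design: columns 2 and 3 of X (the intercept is added by lm)
x1 <- c(1, 1, 1, 0, 0, 0)
x2 <- c(1, 0, 0, 0, 1, 1)

# Overall regression F-statistic for a given ordering of the responses
overall_f <- function(y, x1, x2) {
  unname(summary(lm(y ~ x1 + x2))$fstatistic["value"])
}

# The observed responses and the three reorderings listed above
ys <- list(
  c(8, 6, 2, 9, 7, 4),
  c(6, 8, 2, 9, 7, 4),
  c(2, 6, 8, 9, 7, 4),
  c(6, 2, 8, 9, 7, 4)
)

fs <- sapply(ys, overall_f, x1 = x1, x2 = x2)
round(fs, 3)  # 0.132 0.244 5.534 0.244
```

Only the order of the responses changes between fits; the design matrix stays fixed, which is what makes each reordering an equally likely outcome under the null.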

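## The null distribution

Proportion of the 720 permuted F-statistics at least as large as the observed 0.132 (the randomization p-value): $$624/720 = 0.87$$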
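
The fraction 624/720 can be reproduced by brute force if it is read as the proportion of permuted F-statistics at least as large as the observed one. The sketch below enumerates all 6! = 720 orderings of the tomato responses; `perms` and `overall_f` are helper names invented for this example, and the small tolerance is an assumption added so that orderings tied with the observed statistic are not lost to floating-point rounding.

```r
x1 <- c(1, 1, 1, 0, 0, 0)
x2 <- c(1, 0, 0, 0, 1, 1)
y  <- c(8, 6, 2, 9, 7, 4)

# Overall regression F-statistic for a given ordering of the responses
overall_f <- function(y, x1, x2) {
  unname(summary(lm(y ~ x1 + x2))$fstatistic["value"])
}

# All permutations of a vector, one per row (fine for small n only)
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(i) cbind(v[i], perms(v[-i]))))
}

all_perms <- perms(y)                      # 720 x 6 matrix
fstats <- apply(all_perms, 1, overall_f, x1 = x1, x2 = x2)
obs_fstat <- overall_f(y, x1, x2)

# Count orderings at least as extreme as the observed one (ties included)
n_extreme <- sum(fstats >= obs_fstat - 1e-8)
p_value <- n_extreme / length(fstats)
c(n_extreme, length(fstats))               # 624 720
```

Full enumeration is only practical for very small experiments, since the number of orderings grows as n!.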

## Faraway: Galapagos

In lab:

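    library(faraway)  # provides the gala data set

    # Observed overall F-statistic
    lmod_small <- lm(Species ~ Nearest + Scruz, data = gala)
    lms <- summary(lmod_small)
    obs_fstat <- lms$fstatistic["value"]

    # Approximate the randomization distribution with random permutations
    nperms <- 4000
    fstats <- numeric(nperms)
    for (i in 1:nperms) {
      # Permute the response, keep the explanatory variables fixed
      lmods <- lm(sample(Species) ~ Nearest + Scruz, data = gala)
      fstats[i] <- summary(lmods)$fstatistic["value"]
    }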
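
The lab loop stops once `fstats` is filled. A natural last step is to estimate the p-value as the proportion of permuted statistics at least as large as the observed one. The sketch below applies the same random-permutation recipe to the small tomato example rather than to `gala`, so it runs in base R without the faraway package. The seed, the helper name `overall_f`, and the tolerance for ties are choices made for this illustration.

```r
set.seed(2019)  # arbitrary seed, only for reproducibility

x1 <- c(1, 1, 1, 0, 0, 0)
x2 <- c(1, 0, 0, 0, 1, 1)
y  <- c(8, 6, 2, 9, 7, 4)

# Overall regression F-statistic for a given ordering of the responses
overall_f <- function(y, x1, x2) {
  unname(summary(lm(y ~ x1 + x2))$fstatistic["value"])
}

obs_fstat <- overall_f(y, x1, x2)

# Random permutations instead of full enumeration
nperms <- 4000
fstats <- numeric(nperms)
for (i in seq_len(nperms)) {
  fstats[i] <- overall_f(sample(y), x1, x2)
}

# Monte Carlo estimate of the randomization p-value
p_value <- mean(fstats >= obs_fstat - 1e-8)
p_value
```

With 4000 permutations the estimate should land close to the exact value from full enumeration of the 720 orderings; for the Galapagos data the same final line, `mean(fstats >= obs_fstat)`, would be the analogous step.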

## In practice

• Easiest to justify when you actually have a randomized experiment

• The choice of test statistic matters for both power and interpretation:

For example, if the treatments affect the variance of the response but not the means, a test based on the overall F-statistic may fail to reject the null (treatment has no effect) with high probability, even when sample sizes are large.

Sometimes it’s reasonable to add an assumption on the alternative, e.g. that treatments have an additive effect.
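
The point about the choice of test statistic can be illustrated with a small simulation. Everything in this sketch is made up for illustration: two hypothetical treatment groups `A` and `B` of 100 units each, with equal means but standard deviations 1 and 3, and a second statistic, the absolute log ratio of group variances, chosen here as one example of a statistic that is sensitive to spread.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n <- 100
group <- rep(c("A", "B"), each = n)
# Same mean in both groups; treatment B only inflates the spread
y <- c(rnorm(n, mean = 0, sd = 1), rnorm(n, mean = 0, sd = 3))

# Statistic 1: overall F-statistic (sensitive to differences in means)
f_stat <- function(y, g) {
  unname(summary(lm(y ~ g))$fstatistic["value"])
}

# Statistic 2: absolute log variance ratio (sensitive to differences in spread)
var_stat <- function(y, g) {
  abs(log(var(y[g == "A"]) / var(y[g == "B"])))
}

# Monte Carlo randomization p-value for any statistic stat(y, g)
perm_p <- function(stat, y, g, nperms = 1000) {
  obs <- stat(y, g)
  null <- replicate(nperms, stat(sample(y), g))
  mean(null >= obs)
}

p_f   <- perm_p(f_stat, y, group)
p_var <- perm_p(var_stat, y, group)
c(F = p_f, variance_ratio = p_var)
```

Both tests use the same null hypothesis and the same permutation scheme; only the statistic differs. The variance-based statistic should pick up the treatment effect easily, while the F-statistic has no systematic reason to be large because the group means agree.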