ST552 Lab 2 Jan 13th 2016

- Reproducible reports: Rmarkdown
- Matrix algebra in R
- Simulation of random variables in R

Last week we talked about reproducible code; today we’ll talk about reproducible reports. A reproducible report is a document that records a complete analysis so that it can be reproduced exactly (and automatically) at any point in the future.

For homework, we will be able to record everything we need in a single document. In practice, for more complicated projects, a reproducible analysis will probably be a directory of data, reproducible code files, and report-generating files, ideally under version control.

We’ll use Rmarkdown to generate our reports. Rmarkdown files combine markdown (a kind of plain text markup language) with chunks of R code. When the file is compiled, the R code is evaluated and the results are woven into the markdown. The resulting markdown can then be turned into a PDF (via LaTeX), a Word document, or an HTML file (for hosting on the web, for instance, like this lab!).

Let’s learn by example. Matt will walk you through this part:

- Open a new Rmarkdown file in RStudio (New File -> R Markdown).
- In the dialog that opens put in your own title and select Word document.

The file that opens has a template to help you figure out how Rmarkdown works. Hit the ‘Knit Word’ button and watch what happens. Compare the Word document that opens to the contents of the .Rmd file. In particular, notice the R code chunks

````
```{r}
# R CODE HERE
```
````

in the .Rmd file are run, “echoed” and the results (output or plots) are included in the document.

Try adding the following line to the document:

``The average speed is `r mean(cars$speed)`.``
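To check the number that will be substituted for the inline expression, you can evaluate the same expression in the console first. A minimal sketch; the name `avg_speed` is only for illustration, and `cars` is the data set that ships with R:

```r
# Evaluate the expression used in the inline code.
# `cars` comes from the built-in datasets package, so nothing needs loading.
avg_speed <- mean(cars$speed)
avg_speed
```

The knitted document should show this same value in place of the inline code.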

Then try adding this display equation, written in LaTeX between `$$` delimiters:

```
$$
\bar{x} = \frac{1}{n}\sum_{i = 1}^n x_i
$$
```

I’d actually recommend knitting to PDF, but you will need to install LaTeX on the computer you use.

Head to http://www.statmethods.net/advstats/matrix.html to see a list of all the matrix functions you’ll need to complete your homework this week.

To **practice**, create the following matrices with as little typing as possible:

\[
I_{10 \times 10}
\]

\[
D = \left[ \begin{matrix}
1 & 0 & 0 & \ldots & 0\\
0 & 2 & 0 & \ldots & 0\\
0 & 0 & 3 & \ldots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \ldots & 10
\end{matrix}\right]
\]

\[
O = \pmb{1}_{10 \times 10} \quad (\text{a } 10 \times 10 \text{ matrix full of ones})
\]

\[
X = \left[ \begin{matrix}
1 & 1\\
1 & 2 \\
1 & 3 \\
\vdots & \vdots \\
1 & 10
\end{matrix}\right]
\]

Then calculate: \[ X^T, \quad D^{-1}, \text{and } X^TX \]
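As a warm-up, here is a sketch of the relevant base R functions on small 3 × 3 versions of these matrices, which leaves the 10 × 10 case as practice. The names `I3`, `D3`, `O3` and `X3` are made up for this illustration:

```r
# Small (3 x 3) versions of the practice matrices; the size is illustrative.
n <- 3

I3 <- diag(n)                          # identity matrix
D3 <- diag(1:n)                        # diagonal matrix with 1, 2, ..., n
O3 <- matrix(1, nrow = n, ncol = n)    # matrix full of ones
X3 <- cbind(1, 1:n)                    # column of ones next to 1, 2, ..., n

t(X3)          # transpose
solve(D3)      # inverse
t(X3) %*% X3   # matrix product X^T X; crossprod(X3) gives the same result
```

Note that `*` multiplies element by element; matrix multiplication needs `%*%`.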

You probably already know this, but for completeness, to simulate a realization of \(n\) independent Normal random variables with mean 0 and standard deviation 1 in R:

```
n <- 10 # for example
rnorm(n)
```

```
## [1] 1.178538998 0.472493981 -0.522909774 0.206437481 0.003171414
## [6] -0.493388456 -1.067536904 -1.343552370 -0.915417901 -0.821686809
```

`rnorm` has arguments `mean` and `sd` if you need a different mean and standard deviation. Want dependence? Start with uncorrelated observations and transform them (check the first answer), or use the function `rmvnorm` in the `mvtnorm` package.
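Both ideas can be sketched in base R. The seed, sample size, mean, standard deviation and covariance matrix `Sigma` below are arbitrary choices for illustration. The transformation shown uses a Cholesky factor, which is one common approach and not necessarily the one the linked answer describes:

```r
set.seed(552)  # arbitrary seed so the draws are reproducible

n <- 1000

# Independent Normal draws with mean 5 and standard deviation 2
y <- rnorm(n, mean = 5, sd = 2)

# Target covariance matrix for the dependent draws (illustrative values)
Sigma <- matrix(c(1.0, 0.8,
                  0.8, 1.0), nrow = 2, byrow = TRUE)

Z <- matrix(rnorm(n * 2), nrow = n, ncol = 2)  # uncorrelated standard Normals
R <- chol(Sigma)   # upper triangular factor with t(R) %*% R equal to Sigma
W <- Z %*% R       # each row of W now has covariance Sigma

cor(W)  # sample correlation; the off-diagonal entries should be near 0.8
```

Setting the seed fits the theme of the lab: without it, the report would show different numbers every time it is knit.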