- Find R. Either download it and install it on your own computer or find it on the Lab's computer. R software is available for free from http://streaming.stat.iastate.edu/CRAN/, for instance.
- The next hurdle is to get your data into R. So we need a data set. Let's just use the one from your first homework assignment. I copied the data and the basic story behind the data from Carnegie Mellon's StatLib DASL web site:
http://lib.stat.cmu.edu/DASL/Datafiles/DRPScores.html
Original source: Schmitt, Maribeth C., The Effects of an Elaborated Directed Reading Activity on the Metacomprehension Skills of Third Graders, Ph.D. dissertation, Purdue University, 1987. The data is at the very bottom of this page.
- Open something like Excel and copy and paste the data into it. You might well have to use the Text to Columns function under Data in order to get the data appropriately into the right columns.
- Save the file as tab delimited text called something like HW01.txt somewhere on the computer.
- Make sure you know where HW01.txt is on your computer.
- Open R
- Change directories in R to the directory where HW01.txt lives. You can do this by going under File and Change dir... or by typing code at the prompt like:

setwd("C:/Users/Elizabeth Housworth/Desktop")

In R, single and double quotes are interchangeable. Forward slashes work in paths on every operating system, including Windows; if you use backslashes, each one has to be doubled inside the quotes. And clearly you should replace my directory names with your own.
- Get your data into R and assign it to a variable by typing something like the following at the prompt:

data <- read.table(file='HW01.txt', header=TRUE)

Tricks: TRUE has to be in all capitals. The file has a header line if you left the line with Treatment and Response at the top of your file. If you took that out, then header should be FALSE, which is the default value. We will learn about more complicated objects in R like data frames later.
- Just type

data

to see the data and make sure that it is all there like it should be.
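As a rough, self-contained sketch of the steps above, the following writes the first few rows of the data to a temporary file (a stand-in for HW01.txt) and reads it back. The names tmp and drp are made up for illustration, and str(), head(), and table() are just some of the ways to check that the data arrived intact.

```r
# A stand-in for HW01.txt: a temporary tab-delimited file with a header line.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Treatment\tResponse",
             "Treated\t24",
             "Treated\t43",
             "Control\t42",
             "Control\t43"), tmp)

# header=TRUE tells R that the first line holds the column names.
drp <- read.table(file = tmp, header = TRUE)

str(drp)              # column names and types
head(drp)             # first few rows
table(drp$Treatment)  # number of rows in each group
```

With the full file you would expect 44 rows: 21 Treated and 23 Control.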

- The next hurdle is to write code to analyze your data and then to figure out whatever it is that R is telling you about the analysis.
- Type
help(t.test)

to read about the t.test function in R.
- Since the responses for both groups are in a single column, with the group labels in another, we are going to use the formula method for conducting the test. Try typing
t.test(data[,2] ~ data[,1], data)

data[,2] is the second column of data and data[,1] is the first column, and the formula tells R to break up the second column into groups by the factor in the first column.
- Read the output and the help page again. What kind of t-test was performed? Is the p-value reported for a one-sided or a 2-sided test? Were the variances of the two data sets assumed to be equal or not? The confidence interval is for a difference in the means. Which group came first in the difference and which group was subtracted from the first?
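The result of t.test can also be stored in a variable and its pieces inspected one at a time, which may help with the questions above. This is only a sketch: it rebuilds the data inline so that it runs on its own, the names drp and result are made up, and the column names Response and Treatment are used in place of data[,2] and data[,1].

```r
# Rebuild the data set inline (same values as the data at the bottom of this page).
treated <- c(24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53,
             56, 59, 52, 62, 54, 57, 33, 46, 43, 57)
control <- c(42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85,
             46, 10, 17, 60, 53, 42, 37, 42, 55, 28, 48)
drp <- data.frame(Treatment = rep(c("Treated", "Control"), c(21, 23)),
                  Response  = c(treated, control))

# Same kind of call as above, written with column names.
result <- t.test(Response ~ Treatment, data = drp)

result$method       # which kind of t-test was run
result$alternative  # "two.sided", "less", or "greater"
result$estimate     # the two group means, in the order R used
result$conf.int     # confidence interval for the difference in means
result$p.value      # the p-value
```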
- Is a 2-sided test correct for this problem? Think about the set up and the question. Perform a 1-sided t-test by typing either
t.test(data[,2] ~ data[,1], data, alternative="greater")

or
t.test(data[,2] ~ data[,1], data, alternative="less")

Which one gives you the test and confidence interval you want? You can figure out the order R puts the factors in by typing
as.factor(data[,1])
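One way to see the order of the groups and, if you like, to change it is sketched below. It rebuilds the data inline so that it runs on its own; drp, Group, and one_sided are illustrative names, not part of the homework file. By default R sorts factor levels alphabetically, and the difference in the test is taken as the first level minus the second.

```r
# Rebuild the data set inline (same values as the data at the bottom of this page).
treated <- c(24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53,
             56, 59, 52, 62, 54, 57, 33, 46, 43, 57)
control <- c(42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85,
             46, 10, 17, 60, 53, 42, 37, 42, 55, 28, 48)
drp <- data.frame(Treatment = rep(c("Treated", "Control"), c(21, 23)),
                  Response  = c(treated, control))

# Default level order is alphabetical.
levels(as.factor(drp$Treatment))

# Put "Treated" first so the difference becomes Treated minus Control.
drp$Group <- factor(drp$Treatment, levels = c("Treated", "Control"))
levels(drp$Group)

# With this ordering, alternative="greater" asks whether the Treated mean
# is larger than the Control mean.
one_sided <- t.test(Response ~ Group, data = drp, alternative = "greater")
one_sided$p.value
```

Reordering the levels is optional. Leaving the default order and choosing the matching alternative gives the same test.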

- How would you get R to assume equal variances? Read the t.test help page to find out and try it out.

- There are a lot of basic statistical functions that you might want to use on your data. The manual for R is available at
http://cran.r-project.org/doc/manuals/R-intro.html
For most of these functions, you would want to separate the treatment group from the control group. You can do that here with the commands
treatment <- subset(data[,2], data[,1]=="Treated")

and
control <- subset(data[,2], data[,1]=="Control")

From this you can do things like find the means and standard deviations by typing things like
mean(treatment)
sd(treatment)
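As an alternative sketch, the same summaries can be computed for both groups at once, without splitting the data by hand, using tapply. The data is rebuilt inline so the example runs on its own; drp and the group_* names are made up for illustration.

```r
# Rebuild the data set inline (same values as the data at the bottom of this page).
treated <- c(24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53,
             56, 59, 52, 62, 54, 57, 33, 46, 43, 57)
control <- c(42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85,
             46, 10, 17, 60, 53, 42, 37, 42, 55, 28, 48)
drp <- data.frame(Treatment = rep(c("Treated", "Control"), c(21, 23)),
                  Response  = c(treated, control))

# Apply a function to Response within each level of Treatment.
group_means <- tapply(drp$Response, drp$Treatment, mean)
group_sds   <- tapply(drp$Response, drp$Treatment, sd)
group_sizes <- tapply(drp$Response, drp$Treatment, length)

group_means
group_sds
group_sizes
```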

- You can also plot the two datasets side by side in various ways. A simple side by side box plot can be obtained by typing
boxplot(data[,2]~data[,1], data)

To play with the graph, look at the help page for the boxplot by typing
help(boxplot)

You can play with little things like changing the labels using "names" by typing
boxplot(data[,2]~data[,1], data, names=c("Control", "Treatment"))

You can add titles after the fact by just typing
title(main="Effect of Activities on Reading Outcomes", sub="Treatment versus Control")

Read about all of the things you can change by reading the help pages:
help(boxplot)
help(bxp)
help(par)

Try to change the color of something - like the boxes or the labels.
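As one possible starting point, the sketch below colors the boxes and their outlines with the col and border arguments of boxplot and writes the picture to a PDF file. The data is rebuilt inline so the example runs on its own; drp, plot_file, and the particular colors are arbitrary choices for illustration. Drop the pdf() and dev.off() lines to draw on the screen instead.

```r
# Rebuild the data set inline (same values as the data at the bottom of this page).
treated <- c(24, 43, 58, 71, 43, 49, 61, 44, 67, 49, 53,
             56, 59, 52, 62, 54, 57, 33, 46, 43, 57)
control <- c(42, 43, 55, 26, 62, 37, 33, 41, 19, 54, 20, 85,
             46, 10, 17, 60, 53, 42, 37, 42, 55, 28, 48)
drp <- data.frame(Treatment = rep(c("Treated", "Control"), c(21, 23)),
                  Response  = c(treated, control))

# Draw to a temporary PDF file so the example also runs without a screen.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

boxplot(Response ~ Treatment, data = drp,
        names  = c("Control", "Treatment"),
        col    = c("lightgray", "lightblue"),  # fill color of each box
        border = "darkblue",                   # color of the box outlines
        ylab   = "Response")
title(main = "Effect of Activities on Reading Outcomes",
      sub  = "Treatment versus Control")

invisible(dev.off())  # close the file so it is written to disk
```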

Treatment Response
Treated 24
Treated 43
Treated 58
Treated 71
Treated 43
Treated 49
Treated 61
Treated 44
Treated 67
Treated 49
Treated 53
Treated 56
Treated 59
Treated 52
Treated 62
Treated 54
Treated 57
Treated 33
Treated 46
Treated 43
Treated 57
Control 42
Control 43
Control 55
Control 26
Control 62
Control 37
Control 33
Control 41
Control 19
Control 54
Control 20
Control 85
Control 46
Control 10
Control 17
Control 60
Control 53
Control 42
Control 37
Control 42
Control 55
Control 28
Control 48