Kickstarting R - T tests

Caveats first. Using appropriate statistics is not always easy. Please do not blame me when a reviewer caustically refers to the 56 uncorrected t-tests that you performed on your data as the work of a moron. The techniques explained here will probably be adequate for univariate experiments and confirmatory tests of group equality. This is not a statistics book.

## Tests of between-group means

Again using the `infert` data set and the `brkdn()` function, let's look at the means of age for cases and non-cases.

```
> brkdn(age~case,infert)
                 0        1
Mean      31.49091 31.53012
Variance  27.60510 27.86189
n        165.00000 83.00000
attr(,"class")
[1] "dstat"
```
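`brkdn()` is not part of base R. If it is not available in your session, a rough equivalent of the summary above can be sketched with `split()` and `sapply()`. The name `age_summary` is only illustrative, and the layout approximates the `brkdn()` output rather than reproducing it.

```r
# Base R sketch of a per-group breakdown (not the brkdn() implementation).
data(infert)

# Split age by case status, then compute mean, variance and n for each group.
age_summary <- sapply(
  split(infert$age, infert$case),
  function(a) c(Mean = mean(a), Variance = var(a), n = length(a))
)

print(age_summary)
```

The result is a matrix with one column per group (`0` and `1`) and one row per statistic.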

It looks as though the two groups have been age-matched. Try a t-test to see if there is a difference.

```
> t.test(subset(infert$age,infert$case == 0),
+ subset(infert$age,infert$case == 1))

        Welch Two Sample t-test

data:  subset(infert$age, infert$case == 0) and subset(infert$age, infert$case == 1)
t = -0.0553, df = 163.766, p-value = 0.956
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.439600  1.361177
sample estimates:
mean of x mean of y
 31.49091  31.53012
```
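The same comparison can be written more compactly with the formula interface of `t.test()`, which avoids the two `subset()` calls. This is a minimal sketch; the variable name `age_by_case` is just a label chosen here.

```r
# Welch two-sample t-test of age by case status, using the formula interface.
data(infert)

# 'case' is coded 0/1, so it defines exactly two groups.
age_by_case <- t.test(age ~ case, data = infert)

print(age_by_case)
```

The test statistic, degrees of freedom and p-value should agree with the two-vector call; only the `data:` line and the labels of the group means differ.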

Bronwyn wants to know if those women in the sample who completed high school were significantly younger than those who did not.

```
> brkdn(age~education,infert)
           0-5yrs   6-11yrs   12+ yrs
Mean     35.25000  32.85000  29.72414
Variance 40.02273  28.66639  19.19280
n        12.00000 120.00000 116.00000
attr(,"class")
[1] "dstat"
```

She may have a case here, but first let's do something about that painful typing in of every subsetting operation. Have a look at the function `group.t.test()`. By calling this function as follows:

`> group.t.test(infert$age,infert$education,"12+ yrs")`

we can specify a grouping factor of high school completion versus everything else. This function also allows us to test two specified groups against one another.
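The definition of `group.t.test()` is not reproduced on this page. As a rough idea of what such a helper might look like, the sketch below collapses a grouping factor into "the chosen level" versus "everything else" and hands the result to `t.test()`. The name `group_t_test_sketch` and its argument names are assumptions made for illustration, not the actual function, and the sketch covers only the one-level-versus-the-rest case.

```r
# Hypothetical helper: one level of a factor against all other levels.
# This is an illustrative sketch, not the group.t.test() source.
data(infert)

group_t_test_sketch <- function(x, group, level, ...) {
  other <- paste("not", level)
  # Recode the grouping variable into two levels, "everything else" first.
  two_groups <- factor(
    ifelse(group == level, level, other),
    levels = c(other, level)
  )
  # Any extra arguments in ... are passed straight on to t.test().
  t.test(x ~ two_groups, ...)
}

res <- group_t_test_sketch(infert$age, infert$education, "12+ yrs")
print(res)
```

Fixing the level order explicitly keeps the sign of the difference the same regardless of how the locale sorts the labels.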

Before we leave `t.test()`, note that the ellipsis (`...`) at the end of the argument list of `group.t.test()` means that any additional arguments are passed on to `t.test()`. For example, you might want the 99% confidence interval displayed rather than the default 95% one, which `t.test()` controls through its `conf.level` argument.

```
        Welch Two Sample t-test

data:  age by as.factor(ifelse(infert$education != "12+ yrs", "<12 yrs", "12+ yrs"))
t = 5.3423, df = 243.997, p-value = 2.103e-07
alternative hypothesis: true difference in means is not equal to 0
99 percent confidence interval:
 1.718973 4.969115
sample estimates:
mean in group <12 yrs mean in group 12+ yrs
             33.06818              29.72414
```

Looks like Bronwyn was right.
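For readers without the helper function, a 99% interval for the same comparison can be requested from `t.test()` directly through its `conf.level` argument. This is a minimal sketch; the variable names `hs` and `res99` are chosen here for illustration.

```r
# 99% confidence interval for the age difference by high school completion.
data(infert)

# Two-level grouping: fewer than 12 years of education versus 12 or more.
hs <- factor(
  ifelse(infert$education == "12+ yrs", "12+ yrs", "<12 yrs"),
  levels = c("<12 yrs", "12+ yrs")
)

res99 <- t.test(infert$age ~ hs, conf.level = 0.99)
print(res99$conf.int)
```

Because the whole 99% interval lies above zero, the women who did not complete high school are, on average, older in this sample.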

For a much more detailed treatment of ANOVAs and other methods, get the VR package or "Notes on the use of R..." on the Contributed documentation page.