University of Minnesota, Twin Cities School of Statistics Stat 3011 Rweb Textbook (Wild and Seber)

- General Instructions
- One-Sample Tests for Means
- Quantile-Quantile (QQ) Plots
- Two-Sample Tests for Means
- Paired Comparisons (see under one-sample tests)
- Two Independent Samples (`t.test`)

- More Than Two Sample Tests for Means

To do a two-tailed test of the null hypothesis that specifies
the population mean to be `mu`, using a random sample that is in an
R dataset named `fred`, you just say `t.test(fred, mu=mu)`.
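For instance, with a small made-up sample (the numbers below are illustrative, not from the textbook):

```r
# A small made-up sample (illustrative numbers, not from the textbook)
fred <- c(0.48, 0.51, 0.50, 0.49, 0.53, 0.52, 0.47, 0.50)

# Two-tailed one-sample t-test of the null hypothesis mu = 0.492
out <- t.test(fred, mu = 0.492)
out$statistic   # the t statistic
out$p.value     # the two-tailed P-value
```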

For an example we will use the data for Example 10.1.1 in Wild and Seber,
which is in the file
nitrate.txt.
The null hypothesis to be tested is `mu = 0.492`, which is said
to be the "desired concentration" in the example.

Since the output is rather complicated, we copy it below with the part about the test of significance highlighted:

```
Rweb:> t.test(concentration, mu=0.492)

        One Sample t-test

data:  concentration
t = 2.391, df = 9, p-value = 0.04049
alternative hypothesis: true mean is not equal to 0.492
95 percent confidence interval:
 0.4926521 0.5155479
sample estimates:
mean of x
   0.5041
```

From the printout we get the result of the test: *P* = 0.040 (two-tailed).

Always report the *P*-value and whether the test was one-tailed or two-tailed. This follows the Right Way (with a capital R and capital W) of reporting results of hypothesis tests. It allows the knowledgeable reader to make her own decision about what the test says and to make the appropriate correction if she thinks the author should have done a two-tailed test instead of one-tailed (or vice versa).

For a different alternative hypothesis use the `alternative` optional
argument to the `t.test` function. This is documented on the R on-line
help for `t.test`. The possible values are `"two.sided"` (the default),
`"greater"`, and `"less"`.

To do a two-tailed test (which goes with the two-sided alternative), you
do not need to specify the `alternative` optional argument,
because two-sided is the default. To do a one-tailed test, you need to
specify the appropriate alternative.

For an example we will use the data for Example 10.1.2 in Wild and Seber,
which is in the file
moon1.txt.
The null hypothesis to be tested is `mu = 1.0`, which says the
apparent diameters were the same in both situations (so the *difference*
would be zero, but the *ratio*, which is what the experimenters used,
would be one).

The alternative hypothesis to be tested is `mu > 1.0`, which
says the "moon illusion" actually exists.

Since the output is rather complicated, we copy it below with the part about the test of significance highlighted:

```
Rweb:> t.test(ratio, mu=1.0, alternative="greater")

        One Sample t-test

data:  ratio
t = 4.0727, df = 9, p-value = 0.001394
alternative hypothesis: true mean is greater than 1
95 percent confidence interval:
 1.265055       NA
sample estimates:
mean of x
    1.482
```

From the printout we get the result of the test: *P* = 0.0014 (one-tailed).

**Quick Quiz:** Suppose you think the authors should have done
a two-tailed test. What should the *P*-value be? Answer without
rerunning `t.test`.

**Note:** In order to use the `t.test` function,
you must have the *whole data set*. There is no version of the
function that performs the test from just the sample mean and standard
deviation.

So if you are given just the sample mean, sample standard deviation, and
sample size and told to perform a test of significance, you will
*have to do it by hand* like we did in Chapter 9. You can't
use the computer, except as a fancy calculator.
We call this *computer-aided hand calculation*.

**Question:** The sample of 10 ratios of diameters in
Example 10.1.2 in Wild and Seber had sample mean 1.482 and
sample s. d. 0.37425. Do the same test as in the
preceding section.

**Answer:**
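A sketch of the computer-aided hand calculation, using only the summary statistics given in the question:

```r
# Summary statistics from Example 10.1.2 in Wild and Seber
xbar <- 1.482    # sample mean
s    <- 0.37425  # sample standard deviation
n    <- 10       # sample size
mu0  <- 1.0      # null hypothesis value

# t statistic and upper-tailed P-value (alternative mu > 1.0)
tstat <- (xbar - mu0) / (s / sqrt(n))
pval  <- pt(tstat, df = n - 1, lower.tail = FALSE)
tstat   # about 4.073
pval    # about 0.0014
```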

We do indeed get the same result as before:
*P* = 0.0014 (one-tailed).

**Quick Quiz:** How would you change the last command to do

- a two-tailed test?
- a lower-tailed test?

For paired data, analyze the differences.

For an example we will use the data for Example 10.1.2 (cont.) on
p. 419 in Wild and Seber,
which is in the file
moon.txt,
which has two variables `elevated` and `level`.
In the other analysis of these data, the
*ratio* of these two variables was used as a single variable to
analyze. Here we do the usual "paired comparison" thing.

The null hypothesis to be tested is `mu = 0`, where `mu`
is the *mean difference*, which says
the two viewing situations produce the same results.

The alternative hypothesis to be tested is `mu > 0`, which
says the "moon illusion" actually exists.

Either of the following does exactly the same thing. The fancy way is
to supply the optional argument `paired=TRUE`. The simple
way is just to supply the differences `elevated - level`.
There is no change whatsoever in the numbers. Only the surrounding words
are a bit different.
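A sketch of the two equivalent calls, on made-up paired measurements (not the moon data):

```r
# Hypothetical paired measurements (made-up numbers, not the moon data)
elevated <- c(1.7, 1.6, 1.5, 1.4, 1.8)
level    <- c(1.1, 1.2, 1.0, 1.1, 1.3)

# The fancy way: the optional argument paired = TRUE
fancy  <- t.test(elevated, level, paired = TRUE, alternative = "greater")

# The simple way: treat the differences as a one-sample problem
simple <- t.test(elevated - level, mu = 0, alternative = "greater")

# The numbers are identical; only the surrounding words differ
fancy$statistic
simple$statistic
```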

The most useful plot available in R for getting some idea of how close
the distribution of a sample is to normal is
a *quantile-quantile (QQ) plot*. Similar, though not exactly
identical, plots are available in other statistical computer software.
The "normal plot" discussed by Wild and Seber is very similar though
slightly different.

For an example we will use the data for Example 10.1.1 in Wild and Seber, which is in the file nitrate.txt.

If the data are normally distributed, they lie near a straight line. Of course, they do not lie exactly on a straight line because of sampling variability (the sample is not the population).

To see how much variation should be expected, compare with QQ plots of simulated data known to be normal.
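A sketch of such a comparison, drawing several QQ plots of simulated samples the same size as the nitrate sample (n = 10):

```r
# Four QQ plots of simulated samples known to be normal, each the
# same size (n = 10) as the nitrate sample, to calibrate the eye
par(mfrow = c(2, 2))
for (i in 1:4) {
  sim <- rnorm(10)   # a sample that really is normal
  qqnorm(sim)
}
```

The QQ plot of the real data, `qqnorm(concentration)`, can then be compared with these panels.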

Since in this case, the plot for the real data does not look any less linear than the plots for the simulated data, we conclude that there is no evidence of non-normality in the data (which doesn't prove it is normal -- absence of evidence is not evidence of absence).

As an example of a heavy-tailed distribution, consider Student's
*t* distribution with five degrees of freedom.

Hmmmm. If you try this several times, you will sometimes see heavy tails,
sometimes not. It is *very hard* to detect non-normality with
sample size as small as 10.

Change `n` to 50 and retry. The heavy-tailedness is usually
fairly clear at `n` = 50. But if you want to see what
an ideal QQ plot of a heavy-tailed distribution looks like,
try `n` = 1000.
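A sketch of the large-sample version:

```r
# QQ plot of a sample from Student's t distribution with 5 degrees
# of freedom; at n = 1000 the heavy tails are unmistakable
n <- 1000
heavy <- rt(n, df = 5)
qqnorm(heavy)
```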

As an example of a skewed distribution, consider the exponential distribution used as a counterexample for the central limit theorem.

Hmmmm. If you try this several times, you will sometimes see one heavy tail
and one light tail, and sometimes not. It is *very hard* to detect
non-normality with sample size as small as 10.

Change `n` to 50 and retry. The skewness is clear
at `n` = 50, because this is a *very* skewed distribution.
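A sketch at that sample size:

```r
# QQ plot of an exponential sample; at n = 50 the skewness shows up
# as a curved plot rather than a straight line
n <- 50
skewed <- rexp(n)
qqnorm(skewed)
```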

To do a two-tailed test of the null hypothesis that specifies
the difference of two population means to be zero,
using two *independent* random samples from the two populations
that are in R datasets named `fred` and `sally`,
you just say `t.test(fred, sally)`.

For an example we will use the data for Example 10.2.1 in Wild and Seber, which is in the file urinary.txt. The null hypothesis to be tested is zero difference between the population means (which is the default, hence need not be specified).

Looking at the data file
urinary.txt,
we see that the data are not in the form `t.test` wants (two vectors),
so we first have to do the job of forming the two vectors.
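The details depend on how the data file is laid out; assuming one numeric variable and one group indicator (the variable names below are made up and may differ from those in urinary.txt), the split looks like this:

```r
# Made-up stand-in for the urinary data: measurements plus a group
# label (the actual variable names in urinary.txt may differ)
conc <- c(3.2, 4.1, 3.8, 3.0, 2.1, 2.6, 2.4, 2.9)
type <- c("het", "het", "het", "het", "hom", "hom", "hom", "hom")

# Form the two vectors t.test wants
het <- conc[type == "het"]
hom <- conc[type == "hom"]

t.test(het, hom)   # Welch two-sample t-test, two-sided by default
```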

Since the output is rather complicated, we copy it below with the part about the test of significance highlighted:

```
Rweb:> t.test(het, hom)

        Welch Two Sample t-test

data:  het and hom
t = 3.1572, df = 23.862, p-value = 0.004278
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.3523851 1.6839786
sample estimates:
mean of x mean of y
 3.518182  2.500000
```

From the printout we get the result of the test: *P* = 0.0043 (two-tailed).

This section shows how to do a test of the null hypothesis that specifies the means of several (more than two) populations to be equal. There is no notion of one-tailed or two-tailed tests here. The alternative hypothesis is that not all the means are equal (at least two of the means are unequal).

The data are *independent* random samples from the populations.
For the convenience of R, the dataset should consist of two vectors:
one, say it is called `fred`, consists of all the measurements
for all the samples, and the other, say it is called `sally`,
is a categorical variable that indicates which sample the corresponding
measurement belongs to.

The function that does this test is called `aov` and is
documented on the R on-line help.
To use the function you just say `aov(fred ~ sally)`.
Note that there is just one argument (no commas inside the parens).
The thingy inside the parens with the wiggle is an R formula specifying
a "model" to fit -- here a one-way ANOVA model.

For an example we will use the data for Example 10.3.1 in Wild and Seber, which is in the file reading.txt.

Looking at the data file
reading.txt,
we see that the data are in the form `aov` wants (two vectors, one
containing the numerical measurements and the other the treatment indicator).

```
Rweb:> out <- aov(increase ~ method)
Rweb:> summary(out)
Df Sum Sq Mean Sq F value Pr(>F)
method 3 27.062 9.021 4.445 0.007977 **
Residuals 46 93.351 2.029
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
```

From the printout we get the result of the test: *P* = 0.008.
Actually the "Signif. codes" are rather stupid. We should be able to figure
out for ourselves how big the *P*-value is and what it means.
The R command `options(show.signif.stars=FALSE)` turns them off
and so produces cleaner output.

There is no point in plotting the raw data for one-way ANOVA. Under the alternative model, the samples are supposed to be from different populations so it is no big deal when we find out they actually are different.

More precisely, the ANOVA model assumes the populations are

- normally distributed, with
- *different* means, but
- *the same* standard deviation.

If the result of an analysis of variance has been saved in a variable
`out`, then `residuals(out)` gives the residuals, as in our example
(Example 10.3.1 in Wild and Seber, data file reading.txt).

Another widely used diagnostic plot has residuals on the vertical axis and category number on the horizontal axis.

For our example (Example 10.3.1 in Wild and Seber, data file reading.txt),
in this plot we are looking for each group (sample) to have mean zero
(but that is automatically true just because we have subtracted the group
mean from each observation) and

- the same standard deviation and
- normal distribution.
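A sketch of this plot with made-up data standing in for the reading data (the variable names `increase` and `method` follow the example; the numbers are invented):

```r
# Made-up one-way layout standing in for the reading data
increase <- c(2.1, 3.4, 2.8, 5.0, 4.2, 4.8, 1.2, 1.9, 1.5)
method   <- factor(rep(c("A", "B", "C"), each = 3))
out <- aov(increase ~ method)

# Residuals versus category number
plot(as.numeric(method), residuals(out),
     xlab = "group number", ylab = "residuals")
abline(h = 0)   # each group's residuals scatter around zero
```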

A plot very similar to the last has residuals on the vertical axis
and group means (also called in this context *effects*
or *fitted values* or *predicted values*) on
the horizontal axis.
If the result of an analysis of variance has been saved in a variable
`out`, then `residuals(out)` gives the residuals
and `fitted(out)` gives the fitted values.

For our example (Example 10.3.1 in Wild and Seber, data file reading.txt),
we are looking for exactly the same things here as in the plot of residuals
versus category number. The only reason this plot is sometimes slightly
better is that sometimes the standard deviation increases in proportion to
the mean (fitted value), and this is easier to see when we plot the groups
over their means.
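A sketch of this plot, again with made-up data standing in for the reading data:

```r
# Made-up one-way layout standing in for the reading data
increase <- c(2.1, 3.4, 2.8, 5.0, 4.2, 4.8, 1.2, 1.9, 1.5)
method   <- factor(rep(c("A", "B", "C"), each = 3))
out <- aov(increase ~ method)

# Residuals versus fitted values (the fitted value for each
# observation is just its group mean)
plot(fitted(out), residuals(out),
     xlab = "fitted values (group means)", ylab = "residuals")
abline(h = 0)
```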