Stat 3011 (Geyer) In-Class Examples (Chapter 10)

General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions (that's already done for you). You do not have to select a dataset (that's already done for you).

One-Sample Tests for Means

Student's t test (Example 10.1.1 in Wild and Seber)

To do a two-tailed test of the null hypothesis that specifies the population mean to be mu using a random sample that is in an R dataset named fred you just say t.test(fred, mu=mu).

For an example we will use the data for Example 10.1.1 in Wild and Seber, which is in the file nitrate.txt. The null hypothesis to be tested is mu = 0.492, which is said to be the "desired concentration" in the example.

Since the output is rather complicated, we copy it below with the part about the test of significance highlighted:

Rweb:> t.test(concentration, mu=0.492)

One Sample t-test

data:  concentration
t = 2.391, df = 9, p-value = 0.04049
alternative hypothesis: true mean is not equal to 0.492
95 percent confidence interval:
0.4926521 0.5155479
sample estimates:
mean of x
0.5041
From the printout we get the result of the test: P = 0.0405 (two-tailed).

This follows the Right Way (with a capital R and capital W) of reporting results of hypothesis tests.

Always report the P-value and whether the test was one-tailed or two-tailed.
This allows the knowledgeable reader to make her own decision about what the test says and make the appropriate correction if she thinks the author should have done a two-tailed test instead of one-tailed (or vice versa).

Changing the Alternative Hypothesis (Example 10.1.2 in Wild and Seber)

For a different alternative hypothesis use the alternative optional argument to the t.test function. This is documented on the R on-line help for t.test. The possible values are "two.sided" (the default), "greater" or "less".

To do a two-tailed test (which goes with the two-sided alternative), you do not need to specify the alternative optional argument, because two-sided is the default. To do a one-tailed test, you need to specify the appropriate alternative.

For an example we will use the data for Example 10.1.2 in Wild and Seber, which is in the file moon1.txt. The null hypothesis to be tested is mu = 1.0, which says the apparent diameters were the same in both situations (so the difference would be zero, but the ratio, which is what the experimenters used, would be one).

The alternative hypothesis to be tested is mu > 1.0, which says the "moon illusion" actually exists.

Since the output is rather complicated, we copy it below with the part about the test of significance highlighted:

Rweb:> t.test(ratio, mu=1.0, alternative="greater")

One Sample t-test

data:  ratio
t = 4.0727, df = 9, p-value = 0.001394
alternative hypothesis: true mean is greater than 1
95 percent confidence interval:
1.265055       NA
sample estimates:
mean of x
1.482
From the printout we get the result of the test: P = 0.0014 (one-tailed).

Quick Quiz: Suppose you think the authors should have done a two-tailed test. What should the P-value be? Answer without rerunning t.test.

Computer-Aided Hand Calculation

Note: In order to use the t.test function, you must have the whole data set. There is no version of the function that does the test from just the sample mean, sample standard deviation, and sample size.

So if you are given just the sample mean, sample standard deviation, and sample size and told to perform a test of significance, you will have to do it by hand like we did in Chapter 9. You can't use the computer, except as a fancy calculator. We call this computer-aided hand calculation.

Question: The sample of 10 ratios of diameters in Example 10.1.2 in Wild and Seber had sample mean 1.482 and sample s. d. 0.37425. Do the same test as in the preceding section.
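In R the computer-aided hand calculation might look like the following sketch, which uses only the numbers given in the question (the variable names are ours, not from the example).

```r
xbar <- 1.482   # sample mean (given)
s <- 0.37425    # sample standard deviation (given)
n <- 10         # sample size (given)
mu0 <- 1.0      # value of mu specified by the null hypothesis

# t test statistic and upper-tailed P-value, as in Chapter 9
tstat <- (xbar - mu0) / (s / sqrt(n))
pval <- pt(tstat, df = n - 1, lower.tail = FALSE)

tstat   # about 4.07
pval    # about 0.0014
```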

We do indeed get the same result as before: P = 0.0014 (one-tailed).

Quick Quiz: How would you change the last command to do

• a two-tailed test?
• a lower-tailed test?

Paired Comparisons (Section 10.1.2 in Wild and Seber)

As explained in Wild and Seber (Section 10.1.2) there is hardly any difference between a one-sample test (or confidence interval) and a "paired comparison" test (or confidence interval).
For paired data, analyze the differences

For an example we will use the data for Example 10.1.2 (cont.) on p. 419 in Wild and Seber, which is in the file moon.txt; the file has two variables, elevated and level. In the other analysis of these data, the ratio of these two variables was used as a single variable to analyze. Here we do the usual "paired comparison" thing.

The null hypothesis to be tested is mu = 0, where mu is the mean difference, which says the two viewing situations produce the same results.

The alternative hypothesis to be tested is mu > 0, which says the "moon illusion" actually exists.

Either of the following does exactly the same thing. The fancy way is to supply the optional argument paired=TRUE. The simple way is just to supply the differences elevated - level. There is no change whatsoever in the numbers. Only the surrounding words are a bit different.
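A sketch of the two commands follows. The variables elevated and level really come from moon.txt; simulated stand-in data are used here only so the sketch runs by itself.

```r
# stand-in data (simulated); in the real example the variables
# elevated and level come from the file moon.txt
set.seed(42)
elevated <- rnorm(10, mean = 1.5, sd = 0.3)
level    <- rnorm(10, mean = 1.0, sd = 0.3)

# fancy way: two-sample test with the paired=TRUE argument
fancy <- t.test(elevated, level, paired = TRUE, alternative = "greater")

# simple way: one-sample test on the differences
simple <- t.test(elevated - level, mu = 0, alternative = "greater")

fancy
simple   # identical numbers, slightly different surrounding words
```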

Quantile-Quantile (QQ) Plots (Example 10.1.1 in Wild and Seber)

The most useful plot available in R for getting some idea of how close the distribution of a sample is to normal is the quantile-quantile (QQ) plot. Similar, though not exactly identical, plots are available in other statistical computer software. The "normal plot" discussed by Wild and Seber is very similar though slightly different.

The Plot

For an example we will use the data for Example 10.1.1 in Wild and Seber, which is in the file nitrate.txt.
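The plot is made by the qqnorm function; qqline adds a reference line through the quartiles. A sketch (with a simulated stand-in for the concentration variable from nitrate.txt, so the sketch runs by itself):

```r
# stand-in data (simulated); in the real example concentration
# is the variable in the file nitrate.txt
set.seed(1)
concentration <- rnorm(10, mean = 0.5, sd = 0.02)

qqnorm(concentration)   # sample quantiles against normal quantiles
qqline(concentration)   # reference line through the quartiles
```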

If the data are normally distributed, they lie near a straight line. Of course, they do not lie exactly on a straight line because of sampling variability (the sample is not the population).

What QQ Plots Should Look Like

To see how much variation should be expected, compare with QQ plots of simulated data known to be normal.
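A sketch of one way to generate such comparison plots (rnorm simulates data known to be normal, here with the same sample size, 10, as the real data):

```r
# four QQ plots of simulated data known to be normal,
# with the same sample size as the real data
par(mfrow = c(2, 2))
for (i in 1:4) {
  qqnorm(rnorm(10))
}
```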

Since in this case, the plot for the real data does not look any less linear than the plots for the simulated data, we conclude that there is no evidence of non-normality in the data (which doesn't prove it is normal -- absence of evidence is not evidence of absence).

What QQ Plots Shouldn't Look Like

Heavy Tails

As an example of a heavy-tailed distribution, consider Student's t distribution with five degrees of freedom.
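A sketch of the plot (rt simulates from Student's t distribution):

```r
# QQ plot of a sample of size 10 from Student's t distribution
# with 5 degrees of freedom (a heavy-tailed distribution)
n <- 10
qqnorm(rt(n, df = 5))
```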

Hmmmm. If you try this several times, you will sometimes see heavy tails, sometimes not. It is very hard to detect non-normality with sample size as small as 10.

Change n to 50 and retry. The heavy tailedness is usually fairly clear at n = 50. But if you want to see what an ideal QQ plot of a heavy tailed distribution looks like, try n = 1000.

Skewness

As an example of a skewed distribution, consider the exponential distribution used as a counterexample for the central limit theorem.
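A sketch of the plot (rexp simulates from the exponential distribution):

```r
# QQ plot of a sample of size 10 from the exponential
# distribution (a strongly right-skewed distribution)
n <- 10
qqnorm(rexp(n))
```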

Hmmmm. If you try this several times, you will sometimes see one heavy tail and one light tail, and sometimes not. It is very hard to detect non-normality with sample size as small as 10.

Change n to 50 and retry. The skewness is clear at n = 50, because this is a very skewed distribution.

Two-Sample Tests for Means (Section 10.2 in Wild and Seber)

Two Independent Samples (Example 10.2.1 in Wild and Seber)

To do a two-tailed test of the null hypothesis that specifies the difference of two population means to be zero using two independent random samples from the two populations that are in R datasets named fred and sally you just say t.test(fred, sally).

For an example we will use the data for Example 10.2.1 in Wild and Seber, which is in the file urinary.txt. The null hypothesis to be tested is zero difference between the population means (which is the default, hence need not be specified).

Looking at the data file urinary.txt, we see that the data are not in the form t.test wants (two vectors) so we first have to do the job of forming the two vectors.
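A sketch of one way to form the two vectors, using hypothetical variable names (the real names are whatever urinary.txt defines) and simulated stand-in data so the sketch runs by itself:

```r
# stand-in data (simulated); in the real example the measurement vector
# and the group indicator come from the file urinary.txt -- the names
# used here (measurement, group) are hypothetical
set.seed(2)
measurement <- c(rnorm(22, mean = 3.5, sd = 0.9),
                 rnorm(11, mean = 2.5, sd = 0.9))
group <- rep(c("het", "hom"), c(22, 11))

# form the two vectors by subscripting with a logical condition
het <- measurement[group == "het"]
hom <- measurement[group == "hom"]

t.test(het, hom)
```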

Since the output is rather complicated, we copy it below with the part about the test of significance highlighted:

Rweb:> t.test(het, hom)

Welch Two Sample t-test

data:  het and hom
t = 3.1572, df = 23.862, p-value = 0.004278
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
0.3523851 1.6839786
sample estimates:
mean of x mean of y
3.518182  2.500000
From the printout we get the result of the test: P = 0.0043 (two-tailed).

More Than Two Sample Tests for Means (Section 10.3 in Wild and Seber)

One-Way ANOVA

This section shows how to do a test of the null hypothesis that specifies the means of several (more than two) populations to be equal. There is no notion of one-tailed or two-tailed tests here. The alternative hypothesis is that not all the means are equal (at least two of the means are unequal).

The data are independent random samples from the populations. For the convenience of R, the dataset should consist of two vectors: one, say it is called fred, consists of all the measurements for all the samples, and the other, say it is called sally, is a categorical variable that indicates which sample the corresponding measurement belongs to.

The function that does this test is called aov and is documented on the R on-line help. To use the function you just say aov(fred ~ sally). Note that there is just one argument (no commas inside the parens). The thingy inside the parens with the wiggle is an R formula specifying a "model" to fit -- here a one-way anova model.

For an example we will use the data for Example 10.3.1 in Wild and Seber, which is in the file reading.txt.

Looking at the data file reading.txt, we see that the data are in the form aov wants (two vectors, one containing the numerical measurements and the other the treatment indicator).

Since the output is rather complicated, we copy it below with the part about the test of significance highlighted:

Rweb:> out <- aov(increase ~ method)
Rweb:> summary(out)
Df Sum Sq Mean Sq F value   Pr(>F)
method       3 27.062   9.021  4.445 0.007977 **
Residuals   46 93.351   2.029
---
Signif. codes:  0  `***'  0.001  `**'  0.01  `*'  0.05  `.'  0.1  ` '  1
From the printout we get the result of the test: P = 0.008. It is clear there is some difference among the means.

Actually the "Signif codes" are rather stupid. We should be able to figure out for ourselves how big the P-value is and what it means. The R command

options(show.signif.stars=FALSE)
turns them off; rerunning summary(out) then produces a cleaner output.
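A sketch of the whole sequence (with simulated stand-in data; in the real example increase and method come from the file reading.txt):

```r
# stand-in data (simulated); in the real example increase and method
# come from the file reading.txt
set.seed(3)
method <- factor(rep(c("A", "B", "C", "D"), length.out = 50))
increase <- rnorm(50, mean = as.numeric(method), sd = 1.4)

options(show.signif.stars = FALSE)  # turn the significance stars off
out <- aov(increase ~ method)
summary(out)                        # same ANOVA table, without the stars
```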

Diagnostic Plots for One-Way ANOVA

There is no point in plotting the raw data for one-way ANOVA. Under the alternative model, the samples are supposed to be from different populations so it is no big deal when we find out they actually are different.

More precisely, the ANOVA model assumes the populations are

• normally distributed, with
• different means, but
• the same standard deviation.
Somehow we want to check these assumptions. If we subtract off the population means from the data values, then they are all normal with the same mean (zero) and the same standard deviation. Of course, we don't know the population means, so we can't subtract them off. The next best thing is to subtract off the sample means. The results are called the residuals.

QQ Plot of the Residuals

If the result of an analysis of variance has been saved in a variable out, then residuals(out) gives the residuals. For our example (Example 10.3.1 in Wild and Seber, data file reading.txt), passing the residuals to qqnorm does the QQ plot. It is interpreted just like any other QQ plot (as described in the section on QQ plots).
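A sketch (again with simulated stand-in data in place of the reading.txt variables):

```r
# stand-in data (simulated); in the real example increase and method
# come from the file reading.txt
set.seed(4)
method <- factor(rep(c("A", "B", "C", "D"), length.out = 50))
increase <- rnorm(50, mean = as.numeric(method), sd = 1.4)

out <- aov(increase ~ method)
qqnorm(residuals(out))   # QQ plot of the residuals
qqline(residuals(out))   # reference line through the quartiles
```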

Plot of Residuals versus Category

Another widely used diagnostic plot has residuals on the vertical axis and category number on the horizontal axis.

For our example (Example 10.3.1 in Wild and Seber, data file reading.txt), plotting residuals(out) against the category numbers makes the plot.
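A sketch (again with simulated stand-in data in place of the reading.txt variables):

```r
# stand-in data (simulated); in the real example increase and method
# come from the file reading.txt
set.seed(5)
method <- factor(rep(c("A", "B", "C", "D"), length.out = 50))
increase <- rnorm(50, mean = as.numeric(method), sd = 1.4)

out <- aov(increase ~ method)
# as.numeric on a factor gives the category numbers 1, 2, 3, ...
plot(as.numeric(method), residuals(out),
     xlab = "category number", ylab = "residuals")
```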

In this plot we are looking for each group (sample) to have mean zero (but that is automatically true just because we have subtracted the group mean from each observation) and

• the same standard deviation and
• normal distribution.
These we check as best we can by looking at the plot. Do the groups have approximately the same spread? That checks for equal standard deviations. Do the groups have obvious outliers? That checks, at least a bit, for normality. The QQ plot is a much more sensitive detector of non-normality.

Plot of Residuals versus Fitted Values

A plot very similar to the last has residuals on the vertical axis and group means (also called, in this context, effects or fitted values or predicted values) on the horizontal axis. If the result of an analysis of variance has been saved in a variable out, then residuals(out) gives the residuals and fitted(out) gives the fitted values.

For our example (Example 10.3.1 in Wild and Seber, data file reading.txt), plotting residuals(out) against fitted(out) makes the plot.
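A sketch (again with simulated stand-in data in place of the reading.txt variables):

```r
# stand-in data (simulated); in the real example increase and method
# come from the file reading.txt
set.seed(6)
method <- factor(rep(c("A", "B", "C", "D"), length.out = 50))
increase <- rnorm(50, mean = as.numeric(method), sd = 1.4)

out <- aov(increase ~ method)
# fitted(out) is the group mean for each observation
plot(fitted(out), residuals(out),
     xlab = "fitted values (group means)", ylab = "residuals")
```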

We are looking for exactly the same things here as in the plot of residuals versus category number. The only reason this plot is sometimes slightly better is that sometimes the standard deviation increases proportionally to the mean (fitted value), and this is easier to see when we plot the groups over their means.