Stat 3011 (Geyer) In-Class Examples (Chapter 2)

General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions (that's already done for you). You do not have to select a dataset (that's already done for you).

Dot Plots (Strip Charts)

Datasets from Wild and Seber

Exercises for Section 2.3.1 (p. 49) unemploy.txt

• The square brackets select a subset of the values of the variable `unemploy`, those for which the variable `group` has the value `"eec"`. This is documented on the R on-line help for square brackets and relational operators. To see that these are indeed the data you are to plot, look at the data file unemploy.txt.
• R does not seem to have any way to put the labels on the plot requested by this exercise.
• The on-line help for stripchart gives optional arguments. The optional argument `method="stack"` is necessary if the numbers being plotted are not all different.
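
The subsetting and stacking described above can be sketched as follows. The numbers are made up for illustration (they are not the contents of unemploy.txt), and the group label `"other"` is hypothetical. Only the variable names `unemploy` and `group` and the value `"eec"` come from the example.

```r
# Hypothetical stand-in data; NOT the real contents of unemploy.txt
unemploy <- c(9.1, 7.4, 9.1, 10.2, 4.3, 5.0, 7.4)
group <- c("eec", "eec", "eec", "eec", "other", "other", "eec")

# The relational operator == gives a logical vector, one TRUE/FALSE per value
is.eec <- group == "eec"

# Square brackets keep only the values where the logical vector is TRUE
eec.values <- unemploy[is.eec]
print(eec.values)

# method = "stack" piles repeated values (9.1 and 7.4 here) on top of
# each other instead of drawing them on the same spot
stripchart(eec.values, method = "stack")
```

Writing `unemploy[group == "eec"]` directly inside the call to `stripchart` does the same thing in one step.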

Stem-and-Leaf Plots

Datasets from Wild and Seber

Table 2.3.1 (p. 53) trfdeath.txt

• The additional argument `scale=3` to the `stem` function is necessary so that R will draw the same stem-and-leaf plot as in the textbook. This is documented on the R on-line help for stem.

There is no way to tell exactly what any particular value of the `scale` argument does. You just have to experiment until you get a plot that you like.
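
One way to experiment is to run `stem` several times with different values of `scale` and compare the printouts. The data below are hypothetical, chosen only to have something to plot. The help page says only that `scale` controls the plot length, so the comments are deliberately rough.

```r
# Hypothetical data, used only to show the effect of scale
x <- c(12, 15, 21, 23, 23, 27, 30, 34, 38, 41, 45, 52)

stem(x)               # default, scale = 1
stem(x, scale = 2)    # a longer plot (more stems)
stem(x, scale = 0.5)  # a shorter plot (fewer stems)
```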

Histograms

Datasets from Wild and Seber

Table 2.3.2 (p. 56) coyote.txt

• The square brackets select a subset of the values of the variable `length`, those for which the variable `gender` has the value `"female"`. This is just like the dot plot example. To see that these are indeed the data you are to plot, look at the data file coyote.txt.
• The additional argument `right=FALSE` to the `hist` function is necessary so that R will draw the same histogram as in the textbook. This is documented on the R on-line help for hist.

There is no reason to prefer the histogram you get with `right=FALSE` over the default behavior `right=TRUE`. Thus there is no reason to use this argument if you don't want to. We only used it to match the illustration in the textbook.
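
To see what the `right` argument changes, it helps to look at bin counts for data that have values exactly on the bin boundaries. This is a small sketch with hypothetical numbers and hypothetical breaks (not the coyote data). It uses `plot=FALSE` so that `hist` returns the counts instead of drawing.

```r
# Hypothetical values; several sit exactly on a bin boundary
x <- c(70, 75, 80, 80, 85, 90, 90, 95, 100)
breaks <- c(70, 80, 90, 100)

# right = TRUE (R's default): a boundary value goes in the bin to its left
closed.right <- hist(x, breaks = breaks, right = TRUE, plot = FALSE)$counts

# right = FALSE: a boundary value goes in the bin to its right
closed.left <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts

print(closed.right)  # 4 3 2
print(closed.left)   # 2 3 4
```

The two settings differ only in where boundary values are counted. If no data value falls exactly on a break, both give the same histogram.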

Numerical Summaries

Datasets from Wild and Seber

Table 2.1.1 (p. 39) heart.txt

• `summary` calculates the "five number summary" (actually six numbers, because R also reports the mean), described on pp. 61-69 of Wild and Seber. Some of the individual numbers in the "five number summary" are calculated by the following four lines.
• `mean` calculates the mean.
• `median` calculates the median.
• There is no R function that calculates just quartiles. However, the `quantile` function calculates arbitrary quantiles (p. 244 ff. in Wild and Seber).
• `quantile(x, 0.25)` calculates the lower quartile (also called the 0.25 quantile or the 25th percentile) of the variable `x`.
• `quantile(x, 0.75)` calculates the upper quartile (also called the 0.75 quantile or the 75th percentile) of the variable `x`.
• `sd` calculates the standard deviation.
• `IQR` calculates the interquartile range (IQR).
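
As a quick check of the functions listed above, here they are applied to the small data vector used in the Data Entry example on this page (not to the heart.txt data). The variable name `x` is just a convenient choice.

```r
# Small data vector from the Data Entry example
x <- c(1, 2, 2, 5, 8, 8, 11, 12, 14)

summary(x)         # min, lower quartile, median, mean, upper quartile, max
mean(x)            # 7
median(x)          # 8
quantile(x, 0.25)  # lower quartile: 2
quantile(x, 0.75)  # upper quartile: 11
sd(x)              # standard deviation
IQR(x)             # upper quartile minus lower quartile: 9
```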

Data Entry

Datasets from Wild and Seber

(none)

• Note that R disagrees with the answer in the textbook about the upper quartile (technically, R is right and the textbook is wrong).
• The first command `x <- c(1, 2, 2, 5, 8, 8, 11, 12, 14)` inputs the data for this problem (we just type it in rather than read it from some file on the web). The symbol `<-` is the R assignment operator. The function `c` "collects" a bunch of numbers into one data vector.
• The `summary` command is explained in the numerical summaries example.
• Note that any variable name will do. Call it what you like (any word with any numbers or letters, upper or lower case, starting with a letter), for example `fred`.
```
fred <- c(1, 2, 2, 5, 8, 8, 11, 12, 14)
summary(fred)
```
This does exactly the same thing as the example.

Note that upper and lower case are different, that is, `Fred` with a capital "F" is a different variable.
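
A short sketch of what case sensitivity means in practice. The second variable `Fred` and its values are hypothetical, made up only to show that the two names refer to different objects.

```r
fred <- c(1, 2, 2, 5, 8, 8, 11, 12, 14)
Fred <- c(100, 200)  # hypothetical; a separate variable from fred

length(fred)  # 9
length(Fred)  # 2
```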

Bar Plots (Bar Graphs)

Datasets from Wild and Seber

Table 2.5.2 (p. 78) fishspec.txt

• The first command `names(freq) <- nstrata` associates the names of the strata with the frequencies. The symbol `<-` is the R assignment operator.
• The `barplot` command draws the barplot. This is documented on the R on-line help for barplot.
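
The two commands can be tried on made-up data as follows. The counts and the strata labels below are hypothetical (they are not the contents of fishspec.txt). Only the variable names `freq` and `nstrata` come from the example.

```r
# Hypothetical counts and labels; NOT the real contents of fishspec.txt
freq <- c(12, 30, 7)
nstrata <- c("shallow", "middle", "deep")

# Attach the labels to the counts
names(freq) <- nstrata
print(freq)

# barplot uses the names of the vector to label the bars
barplot(freq)
```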