# Stat 3011 (Geyer) In-Class Examples (Chapter 2)

## General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions (that's already done for you). You do not have to select a dataset (that's already done for you).

## Dot Plots (Strip Charts)

### Datasets from Wild and Seber

Exercises for Section 2.3.1 (p. 49) unemploy.txt

### Comments

• The square brackets select a subset of the values of the variable `unemploy`, those for which the variable `group` has the value `"eec"`. This is documented on the R on-line help for square brackets and relational operators. To see that these are indeed the data you are to plot, look at the data file unemploy.txt.
• R does not seem to have any way to put the labels on the plot requested by this exercise.
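• The on-line help for stripchart lists the optional arguments. The optional argument `method="stack"` is necessary if the numbers being plotted are not all different. With the default, `method="overplot"`, equal values are drawn on top of one another and look like a single point.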
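
The two ideas in these comments, subsetting with square brackets and stacking tied values, can be sketched as follows. This is only an illustration: the data frame `dat` and its values are made up (they are not the contents of unemploy.txt), and the group label `"other"` is hypothetical.

```r
# Hypothetical stand-in for the unemploy.txt data (values are invented).
dat <- data.frame(
    unemploy = c(5, 7, 7, 9, 4, 6),
    group = c("eec", "eec", "eec", "eec", "other", "other")
)

# The relational operator == gives a logical vector, one entry per row.
keep <- dat$group == "eec"

# Square brackets keep only the values for which `keep` is TRUE.
eec <- dat$unemploy[keep]
print(eec)

# method = "stack" piles the two 7s on top of each other
# instead of drawing them as one overplotted point.
stripchart(eec, method = "stack")
```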

## Stem-and-Leaf Plots

### Datasets from Wild and Seber

Table 2.3.1 (p. 53) trfdeath.txt

### Comments

• The additional argument `scale=3` to the `stem` function is necessary so that R will draw the same stem-and-leaf plot as in the textbook. This is documented on the R on-line help for stem.

There is no way to tell exactly what any particular value of the `scale` argument does. You just have to experiment until you get a plot that you like.
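
A quick way to experiment is to draw the plot for a couple of values of `scale` and compare. The sketch below uses invented numbers (not the trfdeath.txt data); the help for `stem` says only that `scale` controls the plot length, so a larger value asks for a longer plot.

```r
# Invented values, for illustration only.
x <- c(12, 15, 21, 24, 24, 30, 37, 41, 45, 58)

# Default scale (scale = 1).
stem(x)

# A larger scale asks R for a longer plot.
stem(x, scale = 2)
```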

## Histograms

### Datasets from Wild and Seber

Table 2.3.2 (p. 56) coyote.txt

### Comments

• The square brackets select a subset of the values of the variable `length`, those for which the variable `gender` has the value `"female"`. This is just like the dot plot example. To see that these are indeed the data you are to plot, look at the data file coyote.txt.
• The additional argument `right=FALSE` to the `hist` function is necessary so that R will draw the same histogram as in the textbook. This is documented on the R on-line help for hist.

There is no reason to prefer the histogram you get with `right=FALSE` over the one you get with the default behavior, `right=TRUE`. Thus there is no reason to use this argument if you don't want to. We only used it to match the illustration in the textbook.
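
The `right` argument only matters for values that fall exactly on a boundary between bins. A minimal sketch, using invented numbers and break points (not the coyote.txt data), shows the counts changing:

```r
# Invented values chosen so that several fall exactly on a break point.
x <- c(1, 2, 2, 3, 4, 5)
breaks <- c(0, 2, 4, 6)

# plot = FALSE returns the bin counts without drawing anything.
# Default (right = TRUE): bins are closed on the right, e.g. (2, 4].
counts_right <- hist(x, breaks = breaks, plot = FALSE)$counts
print(counts_right)   # 3 2 1

# right = FALSE: bins are closed on the left, e.g. [2, 4).
counts_left <- hist(x, breaks = breaks, right = FALSE, plot = FALSE)$counts
print(counts_left)    # 1 3 2
```

The same six numbers give two different histograms, and neither is more correct than the other.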

## Numerical Summaries

### Datasets from Wild and Seber

Table 2.1.1 (p. 39) heart.txt

### Comments

• `summary` calculates the "five number summary" (actually six numbers, because R also reports the mean), described on pp. 61-69 of Wild and Seber. Some of the individual numbers in the "five number summary" are calculated by the following four lines.
• `mean` calculates the mean.
• `median` calculates the median.
• There is no R function that calculates just quartiles. However, the `quantile` function calculates arbitrary quantiles (p. 244 ff. in Wild and Seber).
• `quantile(x, 0.25)` calculates the lower quartile (also called the 0.25 quantile or the 25th percentile) of the variable `x`.
• `quantile(x, 0.75)` calculates the upper quartile (also called the 0.75 quantile or the 75th percentile) of the variable `x`.
• `sd` calculates the standard deviation.
• `IQR` calculates the interquartile range (IQR).
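
The functions above can be tried on a small vector. The sketch below borrows the nine numbers from the data entry example on this page rather than heart.txt, so the results are not the ones for Table 2.1.1.

```r
# The nine values from the data entry example.
x <- c(1, 2, 2, 5, 8, 8, 11, 12, 14)

summary(x)          # min, lower quartile, median, mean, upper quartile, max
mean(x)             # 7
median(x)           # 8
quantile(x, 0.25)   # lower quartile, 2
quantile(x, 0.75)   # upper quartile, 11
sd(x)               # standard deviation
IQR(x)              # upper quartile minus lower quartile, 9
```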

## Data Entry

(none)

### Comments

• Note that R disagrees with the answer in the textbook about the upper quartile (technically, R is right and the textbook is wrong).
• The first command `x <- c(1, 2, 2, 5, 8, 8, 11, 12, 14)` inputs the data for this problem (we just type it in rather than read it from some file on the web). The symbol `<-` is the R assignment operator. The function `c` "collects" a bunch of numbers into one data vector.
• The `summary` command is explained in the numerical summaries example.
• Note that any variable name will do. Call it what you like (any word with any numbers or letters, upper or lower case, starting with a letter), for example `fred`.
```
fred <- c(1, 2, 2, 5, 8, 8, 11, 12, 14)
summary(fred)
```
Does exactly the same thing as the example.

Note that upper and lower case are different, that is, `Fred` with a capital "F" is a different variable.

## Bar Plots (Bar Graphs)

### Datasets from Wild and Seber

Table 2.5.2 (p. 78) fishspec.txt

### Comments

• The first command `names(freq) <- nstrata` associates the names of the strata with the frequencies. The symbol `<-` is the R assignment operator.
• The `barplot` command draws the barplot. This is documented on the R on-line help for barplot.
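
A minimal sketch of these two commands, assuming `nstrata` holds the stratum names and `freq` the matching frequencies. The names and counts below are invented for illustration and are not the fishspec.txt data.

```r
# Invented strata names and counts.
nstrata <- c("shallow", "middle", "deep")
freq <- c(12, 30, 7)

# Attach a name to each frequency; barplot uses the names as bar labels.
names(freq) <- nstrata
print(freq)

# One bar per stratum, labeled with its name.
barplot(freq)
```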