# Stat 5601 (Geyer) Examples (Wilcoxon Rank Sum Test and Related Procedures)

## General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.

### Summary

• Lower-tailed rank sum test
• Test statistic: W = 30 (Wilcoxon form)
• Test statistic: U = 15 (Mann-Whitney form)
• Sample sizes: nx = 10, ny = 5
• P-value: P = 0.1272061

• The first line of the R code in the form assigns the value of the parameter (the population median difference) assumed under the null hypothesis, which is usually zero.
• The reason for the `NA` removal in lines two and three is that Rweb insists on reading variables of the same length, so whichever of `x` or `y` is shorter must be padded with `NA` (not available) values.
• The test statistic `w` is the Wilcoxon form defined in equation (4.3) in Hollander and Wolfe.
• The test statistic `u` is the Mann-Whitney form defined in equation (4.15) in Hollander and Wolfe.
• In the last line we see that the R function giving the probability distribution of the test statistic under the null hypothesis uses the Mann-Whitney form. So we have to use it too, although Hollander and Wolfe use the other for most of their discussion.
• For an upper-tailed test the last line would be replaced by any of the following, which all do the same thing.
```
1 - pwilcox(u - 1, nx, ny)
pwilcox(u - 1, nx, ny, lower.tail=FALSE)
pwilcox(nx * ny - u, nx, ny)
```
• For handling zeros and tied ranks, see Hollander and Wolfe, the class discussion, and below (still to be written).
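
The notes above refer to lines of the form's R code. A minimal sketch of a computation consistent with those notes follows. The data are invented for illustration (they are not the course dataset), and the layout of the lines is an assumption, not a copy of the form. Taking `w` as the sum of the joint ranks of the `y` values agrees with the summary, where W = 30, ny = 5, and U = 30 - 5 * 6 / 2 = 15.

```r
# Toy data, invented for illustration; y is padded with NA to match x in length.
x <- c(1.1, 2.3, 0.7, 3.5, 2.9, 1.8)
y <- c(0.9, 1.5, 0.4, NA, NA, NA)

mu <- 0                        # shift assumed under the null hypothesis
x <- x[!is.na(x)]              # drop the NA padding
y <- y[!is.na(y)]
nx <- length(x)
ny <- length(y)

r <- rank(c(x, y - mu))        # joint ranks of the combined sample
w <- sum(r[nx + seq_len(ny)])  # Wilcoxon form: sum of the ranks of the y's
u <- w - ny * (ny + 1) / 2     # Mann-Whitney form: number of pairs with x < y - mu

p <- pwilcox(u, nx, ny)        # lower-tailed P-value, P(U <= u)
cat("W =", w, " U =", u, " P =", p, "\n")
```

With no ties in the data, `u` can be cross-checked as `sum(outer(y - mu, x, ">"))`, the direct count of pairs in which the `y` value exceeds the `x` value.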

## The Associated Point Estimate (Median of the Pairwise Differences)

The Hodges-Lehmann estimator associated with the rank sum test is the median of the pairwise differences, which are the $n_x n_y$ differences

$$Y_j - X_i, \qquad \text{for all } i \text{ and } j.$$
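
As a rough sketch of this estimator in R, the pairwise differences can be formed with `outer` and passed to `median`. The data here are invented for illustration and the variable names `d` and `est` are assumptions, not taken from the form.

```r
# Toy data, invented for illustration (not the course dataset).
x <- c(1.1, 2.3, 0.7, 3.5, 2.9, 1.8)
y <- c(0.9, 1.5, 0.4)
nx <- length(x)
ny <- length(y)

# outer(y, x, "-") is an ny-by-nx matrix whose [j, i] entry is y[j] - x[i];
# as.vector flattens it into the nx * ny pairwise differences.
d <- as.vector(outer(y, x, "-"))

est <- median(d)               # Hodges-Lehmann point estimate
cat("Point estimate:", est, "\n")
```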

### Summary

• Point Estimate (sample median of pairwise differences): -0.305

## The Associated Confidence Interval

Much like the confidence intervals associated with the sign test and the signed rank test, the confidence interval has the form

$$\left(D_{(k)},\; D_{(m + 1 - k)}\right)$$

where $m = n_x n_y$ is the number of pairwise differences, the $D_i$ are the pairwise differences, and, as always, parentheses on subscripts indicate order statistics. That is, one counts in $k$ from each end of the sorted list of pairwise differences to find the confidence interval.
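
The counting-in rule can be sketched in R as follows, for a fixed `k`. The data and the choice `k <- 3` are invented for illustration. The achieved level uses the same `pwilcox` expression that appears elsewhere on this page.

```r
# Toy data, invented for illustration (not the course dataset).
x <- c(1.1, 2.3, 0.7, 3.5, 2.9, 1.8)
y <- c(0.9, 1.5, 0.4)
nx <- length(x)
ny <- length(y)
m <- nx * ny                            # number of pairwise differences

d <- sort(as.vector(outer(y, x, "-")))  # sorted pairwise differences y[j] - x[i]

k <- 3                                  # hypothetical choice: count in 3 from each end
lower <- d[k]
upper <- d[m + 1 - k]
conf <- 1 - 2 * pwilcox(k - 1, nx, ny)  # achieved confidence level

cat("Achieved confidence level:", conf, "\n")
cat("Confidence interval: (", lower, ",", upper, ")\n")
```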

### Summary

• Achieved confidence level: 96.004%
• Confidence interval for the population median difference: (-0.76, 0.15)

• Some experimentation may be needed to achieve the confidence level you want. The possible confidence levels are shown by
```
1 - 2 * pwilcox(k - 1, nx, ny)
```
for different values of `k`. The vectorwise operation of R functions can give them all at once
```
k <- seq(1, 100)
conf <- 1 - 2 * pwilcox(k - 1, nx, ny)
conf[conf > 1 / 2]
```
If one adds these lines to the form above, one sees that the choice is fairly restricted. The nine possible achieved levels between 0.99 and 0.80 are
0.9873, 0.9807, 0.9720, 0.9600, 0.9447, 0.9247, 0.9008, 0.8708, 0.8355
• Alternatively, you can just assign `k` to be any integer between one and `nx * ny / 2` (half the number of pairwise differences) just before the second to last line in the form (`cat ...`). A confidence interval with some achieved confidence level will be produced.
• For a one-tailed confidence interval (called upper and lower bounds by Hollander and Wolfe) just use `alpha` rather than `alpha / 2` in the fifth line of the form. Then make either the lower limit minus infinity or the upper limit plus infinity, as desired.
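
One plausible way to pick `k` from a nominal level is sketched below. It assumes `qwilcox` is used for this step, which these notes do not confirm. The sample sizes are those in the first summary above (nx = 10, ny = 5), and the names `k1` and `achieved1` are made up for the one-sided case.

```r
nx <- 10                       # sample sizes from the summary above
ny <- 5
conf.level <- 0.95             # nominal level
alpha <- 1 - conf.level

# Two-sided: qwilcox returns the smallest q with P(U <= q) >= alpha / 2,
# so P(U <= k - 1) < alpha / 2 and the achieved level exceeds the nominal one.
k <- qwilcox(alpha / 2, nx, ny)
if (k == 0) k <- 1             # the smallest usable order statistic is the first
achieved <- 1 - 2 * pwilcox(k - 1, nx, ny)
cat("two-sided: k =", k, " achieved level =", achieved, "\n")

# One-sided: use alpha rather than alpha / 2 and count in from one end only;
# the other limit is then minus or plus infinity.
k1 <- qwilcox(alpha, nx, ny)
if (k1 == 0) k1 <- 1
achieved1 <- 1 - pwilcox(k1 - 1, nx, ny)
cat("one-sided: k =", k1, " achieved level =", achieved1, "\n")
```

For these sample sizes the two-sided choice gives the 96.004% level reported in the summary.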

## The R Function `wilcox.test`

All of the above can be done in one shot with the R function `wilcox.test` (on-line help). This function comes with R. It was not written especially for this course.

• Oops! We should have said `wilcox.test` does almost all of the above. It has a bit of programmer brain damage (PBD) in the way it calculates the point estimate.
• As we shall see when we come to ties (next section, not yet written), this is not the only PBD in `wilcox.test`.
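
A minimal sketch of calling `wilcox.test` follows, with data invented for illustration. Passing `y` first is an assumption made here so that the shift is measured as y minus x, matching the pairwise differences used above. The statistic R labels `W` is the Mann-Whitney form for the first argument.

```r
# Toy data, invented for illustration (not the course dataset).
x <- c(1.1, 2.3, 0.7, 3.5, 2.9, 1.8)
y <- c(0.9, 1.5, 0.4)

# Lower-tailed test; y goes first so that the shift is measured as y - x.
out.test <- wilcox.test(y, x, alternative = "less")
print(out.test$statistic)      # labelled W by R, but it is the Mann-Whitney form
print(out.test$p.value)

# Two-sided call that also asks for the point estimate and confidence interval.
out.ci <- wilcox.test(y, x, conf.int = TRUE, conf.level = 0.95)
print(out.ci$estimate)
print(out.ci$conf.int)
```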