University of Minnesota, Twin Cities School of Statistics Stat 5601 Rweb

- General Instructions
- The Wilcoxon Rank Sum Test
- The Associated Point Estimate (Median of the Pairwise Differences)
- The Associated Confidence Interval
- The R Function `wilcox.test`

- Lower-tailed rank sum test
- Test statistic: `W` = 30 (Wilcoxon form)
- Test statistic: `U` = 15 (Mann-Whitney form)
- Sample sizes: $n_x = 10$, $n_y = 5$
- P-value: P = 0.1272061

- Line one assigns the value of the parameter (population median difference) assumed under the null hypothesis. Usually zero.
- The reason for the `NA` removal in lines two and three is that Rweb insists on reading variables of the same length, so whichever of `x` or `y` is shorter must be padded with `NA` (not available) values.
- The test statistic `w` is the Wilcoxon form defined in equation (4.3) in Hollander and Wolfe.
- The test statistic `u` is the Mann-Whitney form defined in equation (4.15) in Hollander and Wolfe.
- In the last line we see that the R function giving the probability distribution of the test statistic under the null hypothesis uses the Mann-Whitney form. So we have to use it too, although Hollander and Wolfe use the other for most of their discussion.
- For an **upper-tailed test** the last line would be replaced by any of the following, which all do the same thing.

      1 - pwilcox(u - 1, nx, ny)
      pwilcox(u - 1, nx, ny, lower.tail=FALSE)
      pwilcox(nx * ny - u, nx, ny)

- For handling zeros and tied ranks, see Hollander and Wolfe, the class discussion, and below (still to be written).
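The form these comments describe is not reproduced here, so the following is only a minimal sketch of the same steps, assuming two numeric vectors `x` and `y` with no ties. The data values are hypothetical and are not the data behind the results quoted above. The names `mu`, `nx`, `ny`, `w`, and `u` follow the comments, and the line layout is an assumption rather than the original form.

```r
# Hypothetical data (NOT the data set behind the results quoted above).
# y is padded with NA to the length of x, as Rweb requires.
x <- c(1.2, 3.4, 2.2, 5.1, 4.0, 2.9, 3.8, 4.6, 1.7, 3.1)
y <- c(2.0, 1.1, 3.0, 0.9, 2.5, NA, NA, NA, NA, NA)

mu <- 0                 # median difference assumed under the null hypothesis
x <- x[!is.na(x)]       # strip the NA padding
y <- y[!is.na(y)]
nx <- length(x)
ny <- length(y)

r <- rank(c(x, y - mu))           # joint ranks of the combined sample
w <- sum(r[nx + seq_len(ny)])     # Wilcoxon form: sum of the ranks of the y's
u <- w - ny * (ny + 1) / 2        # Mann-Whitney form

p.lower <- pwilcox(u, nx, ny)          # lower-tailed P-value, P(U <= u)
p.upper <- 1 - pwilcox(u - 1, nx, ny)  # upper-tailed P-value, P(U >= u)

cat("W =", w, " U =", u, "\n")
cat("lower-tailed P-value:", p.lower, "\n")
cat("upper-tailed P-value:", p.upper, "\n")

# Plugging in the values reported above (U = 15, nx = 10, ny = 5)
# gives a number to compare with the reported P-value 0.1272061.
print(pwilcox(15, 10, 5))
```

The Mann-Whitney form `u` can also be read as the number of pairs with `y[j] > x[i]`, which is a convenient check on the rank arithmetic when there are no ties.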

The Hodges-Lehmann estimator associated with the rank sum test is the median of the pairwise differences, which are the $n_x n_y$ differences $Y_j - X_i$, for all $i$ and $j$.

- Point Estimate (sample median of pairwise differences): -0.305
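As a rough illustration of the estimator, the sketch below forms every pairwise difference with `outer` and takes the ordinary sample median. The data are hypothetical (not the data that produced -0.305), and the variable names `d` and `estimate` are chosen here for illustration only.

```r
# Hypothetical data (NOT the data set behind the estimate quoted above).
x <- c(1.2, 3.4, 2.2, 5.1, 4.0, 2.9, 3.8, 4.6, 1.7, 3.1)
y <- c(2.0, 1.1, 3.0, 0.9, 2.5)

# All nx * ny pairwise differences y[j] - x[i], flattened to a vector.
d <- as.vector(outer(y, x, "-"))

# Hodges-Lehmann estimate: the standard sample median of the differences.
estimate <- median(d)

cat("number of differences:", length(d), "\n")
cat("point estimate:", estimate, "\n")
```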

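Very similar to the confidence intervals associated with the sign test and signed rank test, the confidence interval has the form

$(D_{(k)}, D_{(m + 1 - k)})$

where $D_{(1)} \le \cdots \le D_{(m)}$ are the ordered pairwise differences and $m = n_x n_y$ is their number.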

- Achieved confidence level: 96.004%
- Confidence interval for the population median difference: (-0.76, 0.15)

- Some experimentation may be needed to achieve the confidence level you want. The possible confidence levels are shown by `1 - 2 * pwilcox(k - 1, nx, ny)` for different values of `k`. The vectorwise operation of R functions can give them all at once.

      k <- seq(1, 100)
      conf <- 1 - 2 * pwilcox(k - 1, nx, ny)
      conf[conf > 1 / 2]

  If one adds these lines to the form above, one sees that the choice is fairly restricted. The nine possible achieved levels between 0.99 and 0.80 are 0.9873, 0.9807, 0.9720, 0.9600, 0.9447, 0.9247, 0.9008, 0.8708, 0.8355.

- Alternatively, you can just assign `k` to be any integer between zero and `n / 2` just before the second to last line in the form (`cat ...`). A confidence interval with *some* achieved confidence level will be produced.
- For a one-tailed confidence interval (called upper and lower bounds by Hollander and Wolfe) just use `alpha` rather than `alpha / 2` in the fifth line of the form. Then make either the lower limit minus infinity or the upper limit plus infinity, as desired.
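The form itself is not shown here, so the following is a sketch under stated assumptions rather than the original code. It assumes that `k` is picked with `qwilcox(alpha / 2, nx, ny)` (bumped to 1 if that returns 0) and that the interval endpoints are the `k`-th smallest and `k`-th largest pairwise differences. The data are hypothetical, and the names `conf.level`, `d`, `m`, and `achieved` are introduced only for illustration.

```r
# Hypothetical data (NOT the data set behind the interval quoted above),
# but with the same sample sizes nx = 10 and ny = 5.
x <- c(1.2, 3.4, 2.2, 5.1, 4.0, 2.9, 3.8, 4.6, 1.7, 3.1)
y <- c(2.0, 1.1, 3.0, 0.9, 2.5)
nx <- length(x)
ny <- length(y)

conf.level <- 0.95            # requested (nominal) confidence level
alpha <- 1 - conf.level

# Ordered pairwise differences D_(1) <= ... <= D_(m).
d <- sort(as.vector(outer(y, x, "-")))
m <- length(d)

# Assumed rule for k: smallest value with P(U <= k) >= alpha / 2.
k <- qwilcox(alpha / 2, nx, ny)
if (k == 0) k <- 1            # d[0] does not exist, so k must be at least 1

# The achieved level depends only on k and the sample sizes.
achieved <- 1 - 2 * pwilcox(k - 1, nx, ny)

lower <- d[k]
upper <- d[m + 1 - k]

cat("k =", k, "\n")
cat("achieved confidence level:", 100 * achieved, "%\n")
cat("confidence interval: (", lower, ",", upper, ")\n")
```

Because the achieved level does not depend on the data values, running this with $n_x = 10$ and $n_y = 5$ gives the same 96.004% as reported above, even though the endpoints differ.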

The R Function `wilcox.test`

All of the above can be done in one shot with the R function `wilcox.test` (on-line help).
This function comes with R. It was not written especially for this course.

- Oops! We should have said `wilcox.test` does *almost* all of the above. It has a bit of programmer brain damage (PBD) in the way it calculates the point estimate. For its definition of the median it uses the average of the two middle values when the number of values is even (which is the standard definition) and the average of the two values on either side of the middle value when the number is odd (which I have never seen anywhere else).

Of course, this definition is asymptotically equivalent to the standard definition. Quite as good really. So we could regard it as a harmless eccentricity. It is, however, a pain when trying to get the answer in the back of the book or to communicate with anyone familiar with the standard definition.

- It also reports the confidence level you asked for rather than the confidence level *actually achieved*, which may be rather different.
- As we shall see when we come to ties (next section, not yet written), this is not the only PBD in `wilcox.test`.
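For comparison with the hand calculations, here is a hedged sketch of calling `wilcox.test` directly. The data are hypothetical, and `y` is passed first on the assumption that the differences of interest are $Y_j - X_i$. Whether the reported estimate and confidence level agree with the hand-computed values may depend on the R version, so the sketch prints both side by side rather than asserting that they match.

```r
# Hypothetical data (NOT the data set used elsewhere on this page).
x <- c(1.2, 3.4, 2.2, 5.1, 4.0, 2.9, 3.8, 4.6, 1.7, 3.1)
y <- c(2.0, 1.1, 3.0, 0.9, 2.5)

# Lower-tailed test. With y as the first argument, the reported statistic
# is the Mann-Whitney form for the y sample.
res <- wilcox.test(y, x, alternative = "less", mu = 0)
cat("statistic:", res$statistic, "\n")
cat("P-value:  ", res$p.value, "\n")

# Two-sided run requesting the point estimate and confidence interval.
ci <- wilcox.test(y, x, conf.int = TRUE, conf.level = 0.95)
cat("wilcox.test estimate:", ci$estimate, "\n")
cat("wilcox.test interval:", ci$conf.int[1], ",", ci$conf.int[2], "\n")
cat("reported level:      ", attr(ci$conf.int, "conf.level"), "\n")

# Hand-computed values to compare against.
cat("median of pairwise differences:", median(outer(y, x, "-")), "\n")
```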