General Instructions
To do each example, just click the Submit
button.
You do not have to type in any R instructions or specify a dataset.
That's already done for you.
The Wilcoxon Rank Sum Test
Example 4.1 in Hollander and Wolfe.
Summary
- Lower-tailed rank sum test
- Test statistic: W = 30 (Wilcoxon form)
- Test statistic: U = 15 (Mann-Whitney form)
- Sample sizes: nx = 10, ny = 5
- P-value: P = 0.1272061
Comments
- Line one assigns the value of the parameter (population median difference) assumed under the null hypothesis. Usually this is zero.
- The reason for the NA removal in lines two and three is that Rweb insists on reading variables of the same length, so whichever of x or y is shorter must be padded with NA (not available) values.
- The test statistic w is the Wilcoxon form defined in equation (4.3) in Hollander and Wolfe.
- The test statistic u is the Mann-Whitney form defined in equation (4.15) in Hollander and Wolfe.
- In the last line we see that the R function giving the probability distribution of the test statistic under the null hypothesis uses the Mann-Whitney form. So we have to use it too, although Hollander and Wolfe use the Wilcoxon form for most of their discussion.
- For an upper-tailed test the last line would be replaced by any of the following, which all do the same thing:

  1 - pwilcox(u - 1, nx, ny)
  pwilcox(u - 1, nx, ny, lower.tail = FALSE)
  pwilcox(nx * ny - u, nx, ny)
- For a two-tailed test do both the lower-tailed and the upper-tailed test and double the P-value of the smaller of the two results. (Two tails is twice one tail because of the symmetry of the null distribution of the test statistic.)
- For handling zeros and tied ranks, see Hollander and Wolfe, the class discussion, and the section on the wilcox.exact function below.
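The calculation described in these comments can be sketched in a few lines of R. The data below are made up for illustration (they are not the Example 4.1 measurements); only the shape of the calculation matters.

```r
mu <- 0   # population median difference assumed under the null hypothesis

# made-up data, not the Example 4.1 measurements
x <- c(4.6, 5.1, 3.8, 4.9, 6.0, 5.5, 4.2, 5.8, 4.4, 5.0)
y <- c(3.9, 4.1, 4.8, 3.5, 4.0)
nx <- length(x)
ny <- length(y)

r <- rank(c(x, y - mu))             # joint ranks of the combined sample
w <- sum(r[seq(nx + 1, nx + ny)])   # Wilcoxon form: sum of the y ranks
u <- w - ny * (ny + 1) / 2          # Mann-Whitney form
p <- pwilcox(u, nx, ny)             # lower-tailed P-value

# upper-tailed: pwilcox(u - 1, nx, ny, lower.tail = FALSE)
# two-tailed:   double the smaller of the two one-tailed P-values
```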
The Associated Point Estimate (Median of the Pairwise Differences)
The Hodges-Lehmann estimator associated with the rank sum test is the median of the pairwise differences, which are the nx ny differences

Dij = Yj - Xi,   i = 1, ..., nx,   j = 1, ..., ny
Example 4.3 in Hollander and Wolfe.
Summary
- Point Estimate (sample median of pairwise differences): −0.305
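The point estimate takes only a couple of lines of R. Again the data are made up for illustration, not the Example 4.3 measurements.

```r
# made-up data, not the Example 4.3 measurements
x <- c(4.6, 5.1, 3.8, 4.9, 6.0)
y <- c(3.9, 4.1, 4.8, 3.5, 4.0)

d <- outer(y, x, "-")   # matrix of all nx * ny pairwise differences Yj - Xi
median(d)               # the Hodges-Lehmann estimate
```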
The Associated Confidence Interval
Very similar to the confidence intervals associated with the sign test and signed rank test, the confidence interval has the form

( D(k), D(m + 1 - k) )

where m = nx ny is the number of pairwise differences, the Di are the pairwise differences, and, as always, parentheses on subscripts indicate order statistics. That is, one counts in k from each end in the list of sorted pairwise differences to find the confidence interval.
Example 4.4 in Hollander and Wolfe.
Summary
- Achieved confidence level: 96.004%
- Confidence interval for the population median difference: (−0.76, 0.15)
Comments
- Some experimentation may be needed to achieve the confidence level you want. The possible confidence levels are shown by

  1 - 2 * pwilcox(k - 1, nx, ny)

  for different values of k. The vectorwise operation of R functions can give them all at once

  k <- seq(1, 100)
  conf <- 1 - 2 * pwilcox(k - 1, nx, ny)
  conf[conf > 1 / 2]

  If one adds these lines to the form above, one sees that the choice is fairly restricted. The nine possible achieved levels between 0.99 and 0.80 are

  0.9873, 0.9807, 0.9720, 0.9600, 0.9447, 0.9247, 0.9008, 0.8708, 0.8355

- Alternatively, you can just assign k to be any integer between one and n / 2 just before the second-to-last line in the form (cat ...). A confidence interval with some achieved confidence level will be produced.
- For a one-tailed confidence interval (called upper and lower bounds by Hollander and Wolfe) just use alpha rather than alpha / 2 in the fifth line of the form. Then make either the lower limit minus infinity or the upper limit plus infinity, as desired.
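Putting these comments together, the whole interval can be computed in a few lines of R. The data are made up for illustration; k is chosen as the largest value whose achieved level, 1 - 2 * pwilcox(k - 1, nx, ny), is at least the level asked for.

```r
alpha <- 0.05

# made-up data for illustration
x <- c(4.6, 5.1, 3.8, 4.9, 6.0)
y <- c(3.9, 4.1, 4.8, 3.5, 4.0)
nx <- length(x)
ny <- length(y)
m <- nx * ny

d <- sort(as.vector(outer(y, x, "-")))   # sorted pairwise differences

kk <- seq(1, floor(m / 2))
conf <- 1 - 2 * pwilcox(kk - 1, nx, ny)  # achievable confidence levels
k <- max(kk[conf >= 1 - alpha])          # largest k achieving at least 1 - alpha

c(d[k], d[m + 1 - k])   # the interval ( D(k), D(m + 1 - k) )
conf[k]                 # the achieved confidence level
```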
The R Function wilcox.test
All of the above can be done in one shot with the R function
wilcox.test
(on-line help).
Only one complaint. It does not report the actual achieved confidence level
(here 96.0%) but rather the confidence level asked for (here 95%, the default).
If you want to know the actual achieved confidence level, you'll have to
use the code in the confidence interval section above.
But you can use wilcox.test
as a convenient check
(the intervals should agree).
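A call might look like the following (made-up data again). Note the argument order: giving y first makes the reported location shift an estimate of the median of the Y - X differences, matching the preceding sections.

```r
# made-up data for illustration
x <- c(4.6, 5.1, 3.8, 4.9, 6.0)
y <- c(3.9, 4.1, 4.8, 3.5, 4.0)

out <- wilcox.test(y, x, conf.int = TRUE)
out$statistic   # Mann-Whitney form of the test statistic
out$p.value     # two-tailed P-value
out$estimate    # Hodges-Lehmann estimate
out$conf.int    # interval at the requested (not the achieved) level

# a lower-tailed test instead:
wilcox.test(y, x, alternative = "less")$p.value
```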
Warning About Ties and Zeros
Do not use the wilcox.test
function when there are ties or
zeros in the data. See the following section.
The R Function wilcox.exact
There is an R function wilcox.exact
(on-line help) that does
do hypothesis tests correctly in the presence of ties.
It does not do confidence intervals or point estimates correctly in the presence of ties. Use the code in the confidence interval section or the point estimate section above.
In order to see what's going on, let's copy some of the code from the beginning of the calculation without the function. This shows the ranks, so we see the tied ranks, and shows the calculation of the test statistic u, so we can see that it agrees with the test statistic calculated by wilcox.exact.
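For reference, a call might look like the sketch below. We assume here that wilcox.exact comes from the exactRankTests package (check your installation; that package's author now recommends the coin package instead). The data are made up, with a tie, purely for illustration.

```r
library(exactRankTests)   # assumed home of wilcox.exact

# made-up data containing a tie (the value 2.0 appears in both samples)
x <- c(1.0, 2.0, 2.0, 3.5, 4.1)
y <- c(2.0, 3.0, 4.0)

rank(c(x, y))   # shows the tied (averaged) ranks

out <- wilcox.exact(y, x, alternative = "greater")
out$p.value     # exact P-value in the presence of ties
```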
Fuzzy Procedures
These are analogous to the fuzzy procedures for the sign test explained on the sign test and related procedures page and on the fuzzy confidence intervals and P-values page.
Since they are so similar, we won't belabor the issues and interpretations. The only difference is that for fuzzy confidence intervals the jumps in the plot are at Y − X differences (no surprise) rather than at order statistics and for fuzzy P-values they are at numbers in the CDF table for the null distribution of the test statistic, which is now the Mann-Whitney distribution rather than the symmetric binomial distribution.
In short, the distributions have changed but everything else remains the same.
Fuzzy P-Values
The fuzzy P-value is (not very uniformly) distributed over the interval from 0.0041 to 0.0175. This is fairly strong evidence against the null hypothesis.
The only virtue this procedure has over the wilcox.exact procedure illustrated in the preceding section is that this procedure is exact at all significance levels, whereas wilcox.exact gives an exact procedure only for the significance levels that appear in the CDF table of the null distribution of the test statistic (which is not tabulated anywhere, just calculated by wilcox.exact, since the presence of tied ranks changes the distribution, so it is not the distribution calculated by pwilcox or tabulated in the textbook).
Both procedures are exact
in some sense (not exactly the same
sense). Both say more or less the same thing. Certainly, fairly strong
evidence against the null hypothesis
is what both say.
You can use whichever you like. What you should not do is follow
traditional procedures, described by the textbook and implemented in
wilcox.test
, when there are ties.