wilcox.test
wilcox.exact
To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.
The reason for the NA removal in lines two and three is that Rweb insists on reading variables of the same length, so whichever of x or y is shorter must be padded with NA (not available) values.
w is the Wilcoxon form defined in equation (4.3) in Hollander and Wolfe.
u is the Mann-Whitney form defined in equation (4.15) in Hollander and Wolfe.
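The relation between the two forms can be sketched in R as follows. This is a minimal sketch, not the code from the form: the data are hypothetical, and it assumes w is the sum of the joint ranks of the y sample, so that u = w - ny (ny + 1) / 2.

```r
# Hypothetical data (not from the original page), chosen to have no ties.
x <- c(1.1, 2.3, 3.7, 4.2)
y <- c(2.9, 5.0, 6.1, 7.4, 8.8)
nx <- length(x)
ny <- length(y)

# Joint ranks of the combined sample; the y ranks are the last ny entries.
r <- rank(c(x, y))
w <- sum(r[nx + seq_len(ny)])   # Wilcoxon form: sum of the y ranks
u <- w - ny * (ny + 1) / 2      # Mann-Whitney form

# Cross-check: without ties, u counts the pairs with y[j] > x[i].
u_pairs <- sum(outer(y, x, ">"))
print(c(w = w, u = u, u_pairs = u_pairs))
```

The cross-check by counting pairs is only valid when there are no ties between the two samples.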
The upper-tailed P-value P(U >= u) can be computed in any of three equivalent ways:
1 - pwilcox(u - 1, nx, ny)
pwilcox(u - 1, nx, ny, lower.tail=FALSE)
pwilcox(nx * ny - u, nx, ny)
For data with ties, see the wilcox.exact function below.
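As a quick numerical check, the pwilcox expressions above can be compared directly. The sample sizes and the value of u here are hypothetical, chosen only for illustration; the third expression relies on the symmetry of the null distribution of u about nx ny / 2.

```r
# Hypothetical sample sizes and observed Mann-Whitney statistic.
nx <- 4
ny <- 5
u <- 18

# Three ways to compute the upper-tail probability P(U >= u).
p1 <- 1 - pwilcox(u - 1, nx, ny)
p2 <- pwilcox(u - 1, nx, ny, lower.tail = FALSE)
p3 <- pwilcox(nx * ny - u, nx, ny)   # uses symmetry of the null distribution

print(c(p1, p2, p3))
```

The second and third expressions avoid subtracting from one, which can matter numerically when the P-value is very small.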
The Hodges-Lehmann estimator associated with the rank sum test is the median of the pairwise differences, which are the $n_x n_y$ differences $Y_j - X_i$, for $i = 1, \ldots, n_x$ and $j = 1, \ldots, n_y$.
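A minimal sketch of the Hodges-Lehmann estimator in R, assuming the differences are taken as y minus x and using hypothetical data (not the data from the form):

```r
# Hypothetical data, for illustration only.
x <- c(1.1, 2.3, 3.7, 4.2)
y <- c(2.9, 5.0, 6.1, 7.4, 8.8)

# All nx * ny pairwise differences y[j] - x[i], as a vector.
d <- as.vector(outer(y, x, "-"))

# The Hodges-Lehmann estimate of the shift is their median.
hl <- median(d)
print(hl)
```

The outer function builds the full ny by nx table of differences in one call, so no explicit loop is needed.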
Very similar to the confidence intervals associated with the sign test and signed rank test, the confidence interval has the form
$$\left( D_{(k)},\; D_{(m + 1 - k)} \right)$$
where $m = n_x n_y$ is the number of pairwise differences, the $D_i$ are the pairwise differences, and, as always, parentheses on subscripts indicate order statistics. That is, one counts in k from each end in the list of sorted pairwise differences to find the confidence interval.
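The counting-in rule can be sketched in R as follows. The data and the nominal level alpha are hypothetical, and choosing k with qwilcox is an assumption about how the form picks k, made so that the achieved level is at least 1 - alpha.

```r
# Hypothetical data and nominal error rate, for illustration only.
x <- c(1.1, 2.3, 3.7, 4.2)
y <- c(2.9, 5.0, 6.1, 7.4, 8.8)
nx <- length(x)
ny <- length(y)
alpha <- 0.05

# Sorted pairwise differences y[j] - x[i]; m is how many there are.
d <- sort(as.vector(outer(y, x, "-")))
m <- nx * ny

# Assumed choice of k: the alpha / 2 quantile of the null distribution,
# bumped up to 1 if it is zero so that d[k] exists.
k <- qwilcox(alpha / 2, nx, ny)
if (k == 0) k <- 1

# Count in k from each end of the sorted differences.
ci <- c(d[k], d[m + 1 - k])
achieved <- 1 - 2 * pwilcox(k - 1, nx, ny)
print(ci)
print(achieved)
```

Because the null distribution is discrete, the achieved level generally exceeds the nominal 1 - alpha rather than matching it exactly.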
The achieved confidence levels are given by 1 - 2 * pwilcox(k - 1, nx, ny) for different values of k. The vectorwise operation of R functions can give them all at once:
k <- seq(1, 100)
conf <- 1 - 2 * pwilcox(k - 1, nx, ny)
conf[conf > 1 / 2]
If one adds these lines to the form above, one sees that the choice is fairly restricted. There are nine possible achieved levels between 0.99 and 0.80.
To get a different level, set k to be any integer between one and m / 2 just before the second to last line in the form (cat ...). A confidence interval with some achieved confidence level will be produced.
For a one-sided interval, use alpha rather than alpha / 2 in the fifth line of the form. Then make either the lower limit minus infinity or the upper limit plus infinity, as desired.
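A one-sided interval along these lines might look like the following sketch. The data and alpha are hypothetical, and the use of qwilcox to choose k is an assumption about the form's code rather than a copy of it.

```r
# Hypothetical data and nominal error rate, for illustration only.
x <- c(1.1, 2.3, 3.7, 4.2)
y <- c(2.9, 5.0, 6.1, 7.4, 8.8)
nx <- length(x)
ny <- length(y)
alpha <- 0.05

d <- sort(as.vector(outer(y, x, "-")))
m <- nx * ny

# One-sided: use alpha, not alpha / 2, when choosing k.
k <- qwilcox(alpha, nx, ny)
if (k == 0) k <- 1

lower_bounded <- c(d[k], Inf)            # upper limit plus infinity
upper_bounded <- c(-Inf, d[m + 1 - k])   # lower limit minus infinity

# Only one tail is excluded, so the tail probability is not doubled.
achieved <- 1 - pwilcox(k - 1, nx, ny)
print(lower_bounded)
print(upper_bounded)
print(achieved)
```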
wilcox.test
All of the above can be done in one shot with the R function wilcox.test (on-line help).
One complaint: it does not report the actual achieved confidence level (here 96.0%) but rather the confidence level asked for (here 95%, the default). If you want to know the actual achieved confidence level, you'll have to use the code in the confidence interval section above. But you can use wilcox.test as a convenient check (the intervals should agree).
Do not use the wilcox.test function when there are ties or zeros in the data. See the following section.
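For data without ties, a call might look like the sketch below. The data are hypothetical, and passing y first is an assumption made so that the reported statistic and shift match the y minus x convention used above.

```r
# Hypothetical data without ties, for illustration only.
x <- c(1.1, 2.3, 3.7, 4.2)
y <- c(2.9, 5.0, 6.1, 7.4, 8.8)

# y is given first so the estimated shift refers to y minus x.
res <- wilcox.test(y, x, conf.int = TRUE)

print(res$statistic)   # Mann-Whitney form of the test statistic
print(res$p.value)     # two-sided P-value
print(res$estimate)    # Hodges-Lehmann estimate
print(res$conf.int)    # interval; the attribute shows the requested level
```

The conf.level attribute printed with the interval is the level requested, not the level achieved, which is the complaint made above.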
wilcox.exact
There is an R function wilcox.exact (on-line help) that does do hypothesis tests correctly in the presence of ties.
It does not do confidence intervals or point estimates correctly in the presence of ties. Use the code in the confidence interval section or the point estimate section above.
In order to see what's going on, let's copy some of the code from
the beginning of the calculation without the function.
This shows the ranks so we see the tied ranks and shows the calculation of the test statistic u so we can see that it agrees with the test statistic calculated by wilcox.exact.
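The effect of ties on the ranks and on u can be sketched with base R alone. The tied data here are hypothetical, and the guarded call at the end assumes wilcox.exact comes from the exactRankTests package, which may not be installed.

```r
# Hypothetical data with ties (the value 2 occurs in both samples).
x <- c(1, 2, 2, 4)
y <- c(2, 3, 5)
nx <- length(x)
ny <- length(y)

# rank() assigns tied observations the average of their ranks (midranks).
r <- rank(c(x, y))
print(r)

# Mann-Whitney form from the midranks of the y sample.
u <- sum(r[nx + seq_len(ny)]) - ny * (ny + 1) / 2

# Equivalent count: each tied pair contributes one half.
u_pairs <- sum(outer(y, x, ">")) + 0.5 * sum(outer(y, x, "=="))
print(c(u = u, u_pairs = u_pairs))

# Optional comparison, only if the package is available (assumption).
if (requireNamespace("exactRankTests", quietly = TRUE)) {
  print(exactRankTests::wilcox.exact(y, x))
}
```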