To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.
The second analysis done by the aov
function is the
usual parametric procedure: one-way ANOVA. It produces
P = 0.5866 for comparison with the Kruskal-Wallis P-value.
Unfortunately, R doesn't have this procedure. So we'll have to do it
by hand
(in R).
Rather than use large sample approximation
on what are really
small sample sizes, we do a Monte Carlo
calculation of the P-value
(that is, we compute by simulation of null distribution of the test statistic).
The Monte Carlo calculation is the loop
for (i in 1:nsim) { datsim <- sample(dat, length(dat)) jsim[i] <- jkstat(datsim, grp) }
This does nsim
simulations of the null distribution of the
test statistic. The first line of the body of the loop generates a new
simulated data set datsim
which is a permutation of the
actual data (same numbers, just assigned to different groups). The second
line of the body of the loop calculates the value of the test statistic
for the simulated data and stores it for future use.
After the loop has finished jsim
is a vector of
length nsim
that consists of independent, identically distributed
random variables having the distribution of the test statistic J
under the null hypothesis. And
phat <- mean(jsim >= jstat)
approximates the P-value, which is Pr(J ≥ j).
The slightly more tricky code
(nsim * phat + 1) / (nsim + 1)
includes the observed value in the numerator and denominator.
As explained in class, this assures that if α is a multiple of
1 / nsim
, then Pr(P ≤ α) is indeed α,
despite the Monte Carlo.
Despite having an exact Monte Carlo test (exact
meaning level α
really means level α), there is some interest in the randomness in
the reported P-value. Hence the next to last line calculates its standard
error.
The last line reports the time the calculation takes: the first number is the elapsed time in seconds.
This is the normal-theory competitor to Jonckheere-Terpstra.
Unfortunately, R doesn't have this procedure. So we'll have to do it
by hand
(in R).
The result quoted in the summary uses 10 times the sample size entered in the example form above. It was run off-line (R rather than Rweb) with the following output.
The R function pava
implements the pool adjacent violators
algorithm which does isotonic regression. In the famous words of a UNIX
source code comment you are not expected to understand this.
The main lesson here is that the normal theory test (isotonic regression) does more or less the same as the nonparametric Jonckheere-Terpstra test. This is no surprise since the data are fairly normal looking.