One-Way Contingency Table
The data set simulates 6000 rolls of a fair die (the singular of dice). We test the null hypothesis that all six cells of the contingency table have the same probability against the alternative hypothesis that they differ.
The following R statements do this test two different ways.
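The data file itself is not reproduced here, so what follows is only a sketch. The counts in `y` are hypothetical and will not reproduce the statistics quoted in this section. The sketch also assumes that the two tests are the Pearson chi-square test and the likelihood ratio test, each referred to a chi-square distribution with 5 degrees of freedom.

```r
# Hypothetical counts for 6000 rolls of a die (NOT the data set used in the text)
y <- c(1038, 964, 975, 1005, 1011, 1007)
n <- sum(y)

# Null hypothesis: all six faces have probability 1/6
p0 <- rep(1 / 6, 6)
expected <- n * p0

# Pearson chi-square statistic
pearson <- sum((y - expected)^2 / expected)

# Likelihood ratio statistic (assumes no cell count is zero)
lrt <- 2 * sum(y * log(y / expected))

# Both are compared with chi-square on (number of cells - 1) degrees of freedom
df <- length(y) - 1
p_pearson <- pchisq(pearson, df, lower.tail = FALSE)
p_lrt <- pchisq(lrt, df, lower.tail = FALSE)

# chisq.test() computes the same Pearson statistic
builtin <- chisq.test(y, p = p0)

print(c(pearson = pearson, lrt = lrt))
print(c(p_pearson = p_pearson, p_lrt = p_lrt))
```

Computing both statistics by hand makes it easy to see why they are close: both are sums over cells that measure how far observed counts are from expected counts.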
The tests are asymptotically equivalent, so it is no surprise that the test statistics are very similar (4.904 and 4.895), as are the P-values (0.4277 and 0.4289).
Since the P-values are not small, the null hypothesis is accepted. The die seems fair.
One-Way Contingency Table with Parameters Estimated
The data set simulates 6000 rolls of an unfair die of the type known as six-ace flats. The one and six faces are shaved slightly, so the other faces have smaller area and smaller probability. We start by testing the hypothesis that all six cells of the contingency table have the same probability.
The following R statements do this test two different ways.
Since the P-values are above 0.1, the null hypothesis is accepted. The die seems fair.
However, it is a bad idea to reject a hypothesis we haven't even fit yet. Suppose instead we do a likelihood ratio test of model comparison, comparing the six-ace flats hypothesis (one and six have the same probability; two, three, four, and five have the same probability) to the hypothesis that all six cells have the same probability.
We find that the null hypothesis of equal probabilities is rejected and the six-ace flats hypothesis accepted (P = 0.038).
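A minimal sketch of this model comparison under multinomial sampling follows. The counts are hypothetical (they are not the data set used above, so the P-value will differ from the one quoted), and the variable names are illustrative. Under the six-ace flats hypothesis the maximum likelihood estimates pool the counts within each group of faces, leaving one free parameter, so the test has 1 degree of freedom.

```r
# Hypothetical counts for 6000 rolls (NOT the data set used in the text)
y <- c(1047, 951, 966, 978, 983, 1075)
n <- sum(y)

# Null hypothesis: all six faces equally likely (no free parameters)
p_null <- rep(1 / 6, 6)

# Six-ace flats hypothesis: faces 1 and 6 share one probability,
# faces 2 to 5 share another. The MLE pools counts within each group.
flat <- c(1, 6)
p_alt <- rep(sum(y[-flat]) / (4 * n), 6)
p_alt[flat] <- sum(y[flat]) / (2 * n)

# Multinomial log likelihood, dropping the term that does not depend on p
loglik <- function(p) sum(y * log(p))

# Likelihood ratio statistic; the models differ by one free parameter
lrt <- 2 * (loglik(p_alt) - loglik(p_null))
df <- 1
p_value <- pchisq(lrt, df, lower.tail = FALSE)

print(c(lrt = lrt, df = df, p_value = p_value))
```

With only one degree of freedom, this test concentrates its power on the specific departure from fairness that six-ace flats produce, which is why it can reject when the 5-degree-of-freedom test does not.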
The same test can also be done assuming Poisson sampling rather than multinomial sampling. The likelihood ratio test statistic is the same and the degrees of freedom for the chi-square approximation are the same.
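The equality of the two test statistics can be checked numerically. The sketch below uses hypothetical counts and illustrative variable names (`flats`, `fit_null`, `fit_alt` are not from the original), fits the two hypotheses as Poisson regressions with `glm`, and compares the difference in deviances with the multinomial likelihood ratio statistic computed directly.

```r
# Hypothetical counts for 6000 rolls (NOT the data set used in the text)
y <- c(1047, 951, 966, 978, 983, 1075)
n <- sum(y)

# Indicator for the shaved faces (one and six)
face <- 1:6
flats <- factor(face %in% c(1, 6))

# Poisson sampling: each cell count is Poisson with its own mean
fit_null <- glm(y ~ 1, family = poisson)      # all cell means equal
fit_alt <- glm(y ~ flats, family = poisson)   # one mean per group of faces

# Likelihood ratio statistic is the difference in deviances
lrt_poisson <- deviance(fit_null) - deviance(fit_alt)
df <- df.residual(fit_null) - df.residual(fit_alt)
p_value <- pchisq(lrt_poisson, df, lower.tail = FALSE)

# Multinomial version of the same statistic, computed directly
p_null <- rep(1 / 6, 6)
p_alt <- ifelse(face %in% c(1, 6),
                sum(y[c(1, 6)]) / (2 * n),
                sum(y[2:5]) / (4 * n))
lrt_multinomial <- 2 * sum(y * log(p_alt / p_null))

print(c(poisson = lrt_poisson, multinomial = lrt_multinomial, df = df))
```

The same comparison could also be obtained from `anova(fit_null, fit_alt, test = "Chisq")`.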
Two-Way Contingency Table
When there are two categorical predictors, we can also think of the contingency table as a two-dimensional array, one categorical predictor giving the row labels and the other giving the column labels.
Rweb does not like to read data as contingency tables, so we read it as usual.
The data set has three variables: the response y and two categorical predictors, color and opinion.
The following R code does the likelihood ratio test.
The test rejects the null hypothesis that the two categorical predictors have independent effects (P = 0.0159).
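One way such a likelihood ratio test might be carried out is sketched below. The data frame `dat`, its factor levels, and its counts are all hypothetical, so the P-value will not match the one quoted above. The sketch assumes Poisson regression with `glm`, comparing the model with additive effects (independence) to the model with an interaction (the saturated model).

```r
# Hypothetical data in data frame format (NOT the data set used in the text):
# one row per cell of the table, with the cell count in y
dat <- expand.grid(color = c("blue", "green", "red"),
                   opinion = c("dislike", "neutral", "like"))
dat$y <- c(20, 15, 30, 25, 22, 18, 12, 28, 35)

# Null hypothesis: independence (no color by opinion interaction)
fit_indep <- glm(y ~ color + opinion, family = poisson, data = dat)

# Alternative: saturated model with interaction
fit_sat <- glm(y ~ color * opinion, family = poisson, data = dat)

# Likelihood ratio test by comparing deviances
lrt <- deviance(fit_indep) - deviance(fit_sat)
df <- df.residual(fit_indep) - df.residual(fit_sat)
p_value <- pchisq(lrt, df, lower.tail = FALSE)

# anova() reports the same comparison
print(anova(fit_indep, fit_sat, test = "Chisq"))
```

For an r by c table the degrees of freedom are (r - 1)(c - 1), which is 4 for the hypothetical 3 by 3 table here.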
The analogous chi-square test requires us to put the data in a two-way array. The following R code does this test.
The R function xtabs (see its on-line help) converts data from the data frame format read by Rweb and wanted by the lm and glm functions to the contingency table (matrix) format wanted by the chisq.test function.
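The conversion and the chi-square test can be sketched as follows. As before, the data frame `dat` and its counts are hypothetical and stand in for the data set read by Rweb.

```r
# Hypothetical data in data frame format (NOT the data set used in the text)
dat <- expand.grid(color = c("blue", "green", "red"),
                   opinion = c("dislike", "neutral", "like"))
dat$y <- c(20, 15, 30, 25, 22, 18, 12, 28, 35)

# Convert to a two-way table: the left side of the formula gives the counts,
# the right side gives the row and column classifications
tab <- xtabs(y ~ color + opinion, data = dat)
print(tab)

# Pearson chi-square test of independence
result <- chisq.test(tab)
print(result)

# The same statistic computed from observed and expected counts
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
pearson <- sum((tab - expected)^2 / expected)
```

The formula passed to `xtabs` has the same variables as the `glm` formula, which makes it straightforward to move between the two formats.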