University of Minnesota, Twin Cities     School of Statistics     Stat 5601     Rweb     Computing Examples

Stat 5601 (Geyer) Examples (Survival Analysis)

General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.

Exponential versus IFR or DFR

Hypothesis Tests

Example 11.1 in Hollander and Wolfe.
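
For readers working outside Rweb, here is a minimal sketch of the computation. It is not the code this page submits. It assumes the test in question is the total-time-on-test statistic (usually credited to Epstein), with large values pointing toward IFR and small values toward DFR. The data vector `x` is also an assumption: ten failure times chosen to be consistent with the breakpoints in the IFR summary further down this page. Check them against the book before relying on them.

```r
# Assumed data (not listed on this page); check against Example 11.1.
x <- c(42, 43, 51, 61, 66, 69, 71, 81, 82, 82)

n <- length(x)
xs <- sort(x)
# Normalized spacings: (n - j + 1) * (X_(j) - X_(j-1)), with X_(0) = 0.
d <- (n:1) * diff(c(0, xs))
# Total time on test up to each order statistic.
ttt <- cumsum(d)
# Test statistic: sum of the first n - 1 totals, scaled by the last one.
E <- sum(ttt[-n]) / ttt[n]
# Under exponentiality E behaves like a sum of n - 1 independent
# Uniform(0, 1) variables: mean (n - 1) / 2, variance (n - 1) / 12.
z <- (E - (n - 1) / 2) / sqrt((n - 1) / 12)
p_ifr <- pnorm(z, lower.tail = FALSE)  # one-tailed, IFR alternative
p_dfr <- pnorm(z)                      # one-tailed, DFR alternative
print(c(E = E, z = z, p_ifr = p_ifr, p_dfr = p_dfr))
```

With a sample this small the normal approximation is only a rough guide.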

IFR Point Estimate

Source: Marshall and Proschan (1965), Annals of Mathematical Statistics 36:69-77, pp. 70-71, esp. equations (3.6) and (3.2).

Summary

rate      lower limit   upper limit
0.0000    -∞            42
0.0210    42            61
0.0333    61            66
0.0566    66            81
0.5000    81            82
∞         82            ∞

The IFR point estimate is a function (the failure rate function). As with many nonparametric function estimates, it is a step function (like the empirical c.d.f.). A failure rate of infinity past x = 82 means that all individuals surviving to that time fail immediately. Similarly, a failure rate of zero before x = 42 means that no failures occur before then.

Thus the failure time distribution is concentrated on the observed range of the data, 42 < x < 82. For comparison, the estimator assuming a constant failure rate on (0, ∞), the exponential failure time distribution, has failure rate 0.0154.
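
The step function in the summary can be reproduced with a short pooling computation. The sketch below is one way to do it, not necessarily how the page's own code works. It treats each gap between successive order statistics as one failure over an exposure of (number still at risk) × (gap length), then pools adjacent gaps until the rates are nondecreasing. The function name `ifr_rates` is hypothetical, and the data vector is an assumption (values consistent with the breakpoints in the summary table, not listed on this page).

```r
# Hypothetical helper: IFR step rates, one per gap between order statistics.
ifr_rates <- function(x) {
  x <- sort(x)
  n <- length(x)
  w <- (n - seq_len(n - 1)) * diff(x)   # exposure on [x_(i), x_(i+1))
  cnt <- numeric(0)                     # failures per pooled block
  tot <- numeric(0)                     # exposure per pooled block
  for (wi in w) {
    cnt <- c(cnt, 1)
    tot <- c(tot, wi)
    k <- length(cnt)
    # Pool while the newest block has a lower rate than the one before it.
    while (k > 1 && cnt[k] / tot[k] < cnt[k - 1] / tot[k - 1]) {
      cnt[k - 1] <- cnt[k - 1] + cnt[k]
      tot[k - 1] <- tot[k - 1] + tot[k]
      cnt <- cnt[-k]
      tot <- tot[-k]
      k <- k - 1
    }
  }
  rep(cnt / tot, times = cnt)
}

# Assumed data (not listed on this page); check against Example 11.1.
x <- c(42, 43, 51, 61, 66, 69, 71, 81, 82, 82)
rates <- ifr_rates(x)
print(round(unique(rates), 4))          # rates on successive steps
exp_rate <- length(x) / sum(x)          # constant-rate (exponential) estimate
print(round(exp_rate, 4))
```

The rate is zero before the smallest observation and infinite from the largest one on. The tied pair at 82 gives a zero-length gap, which is why the last computed rate is `Inf`.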

DFR Point Estimate

There is a similar DFR point estimate, also given by Marshall and Proschan (1965) cited above. Since we have decided that this example is IFR rather than DFR, we will skip it.

Kaplan-Meier

Point Estimate (Survival Curve)

The Kaplan-Meier survival curve is estimated using the survfit function in the survival library in R (see the function's on-line help).

Example 11.7 in Hollander and Wolfe.
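
As a rough illustration of the `survfit` call, here is a sketch on a tiny made-up dataset. The times and censoring indicators are hypothetical, not the Example 11.7 data. The second half recomputes the same estimate by hand, so the product-limit formula is visible.

```r
library(survival)

# Hypothetical data, not the Example 11.7 data:
# status 1 = failure observed, 0 = censored.
time   <- c(2, 3, 5, 7, 8)
status <- c(1, 0, 1, 1, 0)

fit <- survfit(Surv(time, status) ~ 1)
print(summary(fit))   # estimate at each observed failure time
plot(fit, conf.int = FALSE, xlab = "time", ylab = "survival probability")

# Same estimate by hand: product over failure times of (1 - d / r),
# where d is the number of failures and r the number still at risk.
ft <- sort(unique(time[status == 1]))
d  <- sapply(ft, function(t) sum(time == t & status == 1))
r  <- sapply(ft, function(t) sum(time >= t))
km <- cumprod(1 - d / r)
print(cbind(time = ft, at.risk = r, failures = d, survival = km))
```

Censored observations never cause a drop in the curve. They only reduce the number at risk at later failure times.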

Confidence Interval

Example 11.7 in Hollander and Wolfe.

Comment

This is a pointwise, not (!) simultaneous, confidence interval for the curve. Hollander and Wolfe describe simultaneous confidence bands for the curve, but apparently the survival package in R does not implement them. (I have no idea why.)
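
To make "pointwise" concrete, the sketch below pulls the interval limits out of a `survfit` object. The data are hypothetical, not the Example 11.7 data, and the choice of `conf.type = "log"` is just the package default written out explicitly.

```r
library(survival)

# Hypothetical data, not the Example 11.7 data.
time   <- c(2, 3, 5, 7, 8, 10, 12, 15)
status <- c(1, 0, 1, 1, 0, 1, 1, 0)

fit <- survfit(Surv(time, status) ~ 1, conf.int = 0.95, conf.type = "log")
s <- summary(fit)
# Each row is a separate 95% interval for S(t) at that one time point.
print(data.frame(time = s$time, survival = s$surv,
                 lower = s$lower, upper = s$upper))
plot(fit, conf.int = TRUE)   # adds the pointwise limits to the curve
```

Each interval has 95% coverage at its own time point only. The chance that the true curve lies inside all of them at once is smaller.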

Hypothesis Test

The log-rank or Mantel-Haenszel test of whether there is a difference between two or more survival curves is performed using the survdiff function in the survival library in R (see the function's on-line help).

Example 11.7 in Hollander and Wolfe.
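
A minimal sketch of the `survdiff` call follows, using made-up two-sample data. The times, censoring indicators, and group labels are hypothetical, not the Example 11.7 data. The P-value is recomputed from the chi-square statistic so that it is clear which tail area is being reported.

```r
library(survival)

# Hypothetical two-sample data, not the Example 11.7 data:
# status 1 = failure observed, 0 = censored.
time   <- c(3, 5, 6, 8, 9, 12, 14, 15, 18, 20, 22, 25)
status <- c(1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1)
group  <- rep(c("A", "B"), each = 6)

out <- survdiff(Surv(time, status) ~ group)
print(out)
# Two-tailed P-value from the chi-square statistic (groups - 1 df).
p_two <- pchisq(out$chisq, df = length(out$n) - 1, lower.tail = FALSE)
print(p_two)
```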

Summary

P = 0.00115 (Mantel-Haenszel test).

Comment

The reason this disagrees with the book (Hollander and Wolfe, Section 11.7, page 553) is that Hollander and Wolfe do a one-tailed test, and the survdiff function only does two-tailed tests.

Of course, one can always convert between the two: the two-tailed P-value is twice the one-tailed P-value. Indeed, Hollander and Wolfe's P-value is half of R's.
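
The conversion can be checked in a few lines of R. This sketch starts from the two-tailed P-value reported in the summary and assumes the test statistic is approximately standard normal (equivalently, that its square is chi-square with 1 degree of freedom).

```r
p_two <- 0.00115                            # two-tailed P-value from survdiff
z <- qnorm(p_two / 2, lower.tail = FALSE)   # |z| implied by that P-value
p_one <- pnorm(z, lower.tail = FALSE)       # one-tailed, observed direction
print(c(z = z, chisq = z^2, p_one = p_one, p_two = 2 * p_one))
```

Halving is only right when the observed difference lies in the direction of the one-sided alternative. If it lies in the other direction, the one-tailed P-value is 1 minus half the two-tailed one.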