Statistics 5601 (Geyer, Spring 2006) Examples: Survival Analysis

General Instructions
Exponential versus IFR or DFR
Kaplan-Meier

General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.

Exponential versus IFR or DFR

Hypothesis Tests

Example 11.1 in Hollander and Wolfe.

IFR Point Estimate

Source: Marshall and Proschan (1965), Annals of Mathematical Statistics 36:69-77, pp. 70-71, esp. equations (3.6) and (3.2).

Summary

rate	lower limit	upper limit
0.0000	-∞	42
0.0210	42	61
0.0333	61	66
0.0566	66	81
0.5000	81	82
∞	82	∞

The IFR point estimate is a function (the failure rate function). As with many nonparametric function estimates, it is a step function (like the empirical c. d. f.). A failure rate of infinity past x = 82 means that all individuals surviving to that time fail immediately. Similarly, a failure rate of zero before x = 42 means that no failures occur before then.
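The step function in the summary can be sketched in R along the following lines. This is only a sketch under stated assumptions, not necessarily the code behind this page: the data vector x is not listed above and is a hypothetical one chosen to be consistent with the summary table, and the helper name ifr_rate is made up for illustration. With no censoring, the idea is to pool adjacent intervals between order statistics until the ratio of failures to total time on test is nondecreasing.

```r
# Hypothetical data, assumed for illustration (not listed on this page).
x <- sort(c(42, 43, 51, 61, 66, 69, 71, 81, 82, 82))
n <- length(x)

# Total time on test between successive order statistics:
# (number still at risk) * (length of the interval).
ttt <- (n - seq_len(n - 1)) * diff(x)

# Hypothetical helper: pool adjacent violators so that
# failures / (total time on test) is nondecreasing from left to right.
ifr_rate <- function(ttt) {
  d <- numeric(0)    # failures in each pooled block
  tt <- numeric(0)   # total time on test in each pooled block
  len <- integer(0)  # number of original intervals in each block
  for (i in seq_along(ttt)) {
    d <- c(d, 1)
    tt <- c(tt, ttt[i])
    len <- c(len, 1L)
    k <- length(d)
    # Merge the last two blocks while the rate would decrease.
    while (k > 1 && d[k - 1] / tt[k - 1] > d[k] / tt[k]) {
      d[k - 1] <- d[k - 1] + d[k]
      tt[k - 1] <- tt[k - 1] + tt[k]
      len[k - 1] <- len[k - 1] + len[k]
      d <- d[-k]
      tt <- tt[-k]
      len <- len[-k]
      k <- k - 1
    }
  }
  rep(d / tt, len)
}

rate <- ifr_rate(ttt)
# One row per interval [x[i], x[i + 1]); a tied pair gives an empty interval.
data.frame(rate = round(rate, 4), lower = x[-n], upper = x[-1])
```

Under these assumptions the distinct finite rates agree with the summary table to four decimal places, and the tied largest observations give the infinite rate past 82.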

Thus the estimated failure time distribution is concentrated on the observed range of the data, 42 < x < 82. For comparison, the estimator assuming a constant failure rate on (0, ∞), that is, an exponential failure time distribution, has failure rate 0.0154.
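For the exponential comparison, the maximum likelihood estimate of a constant failure rate with no censoring is the number of failures divided by the total observed time. A minimal sketch, again assuming a hypothetical data vector that is consistent with the summary table (the data are not listed on this page):

```r
# Hypothetical data, assumed for illustration (not listed on this page).
x <- c(42, 43, 51, 61, 66, 69, 71, 81, 82, 82)

# Exponential MLE of the rate: failures / total observed time = 1 / mean(x).
rate_exp <- length(x) / sum(x)
round(rate_exp, 4)   # 0.0154
```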

DFR Point Estimate

There is a similar DFR point estimate, also given by Marshall and Proschan (1965) cited above. Since we have decided that this example is IFR rather than DFR, we will skip it.

Kaplan-Meier

Point Estimate (Survival Curve)

The Kaplan-Meier survival curve is estimated using the survfit function in the survival library in R (on-line help).

Example 11.7 in Hollander and Wolfe.
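A call to survfit might look like the sketch below. The Example 11.7 data are not listed on this page, so the sketch uses aml, a data set shipped with the survival package, purely as a stand-in. It also recomputes the product-limit estimate by hand to show what survfit is estimating.

```r
library(survival)

# Stand-in data: `aml` ships with the survival package.
# It is NOT the Example 11.7 data, which this page does not list.
fit <- survfit(Surv(time, status) ~ 1, data = aml)

# Product-limit estimate by hand, at the distinct event times.
tj <- sort(unique(aml$time[aml$status == 1]))
dj <- sapply(tj, function(t) sum(aml$time == t & aml$status == 1))  # events at t
nj <- sapply(tj, function(t) sum(aml$time >= t))                    # at risk at t
km <- cumprod(1 - dj / nj)

# summary() reports the fitted curve at the event times.
cbind(time = tj, by_hand = km, survfit = summary(fit)$surv)
```

Calling plot(fit) on the fitted object draws the estimated survival curve.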

Confidence Interval

Example 11.7 in Hollander and Wolfe.

Comment

This is a pointwise, not (!) simultaneous, confidence interval for the curve: the stated coverage holds at each time separately, not for the whole curve at once. Hollander and Wolfe describe simultaneous confidence bands for the curve, but apparently the survival package in R does not implement them. (I have no idea why.)
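The pointwise limits can be read off the fitted object. A minimal sketch, using the aml data set from the survival package as a stand-in (not the Example 11.7 data) and spelling out the conf.int and conf.type arguments of survfit explicitly:

```r
library(survival)

# Stand-in data (`aml`), not the Example 11.7 data.
fit <- survfit(Surv(time, status) ~ 1, data = aml,
               conf.int = 0.95, conf.type = "log")

# Pointwise limits at each event time.
s <- summary(fit)
data.frame(time = s$time, surv = s$surv, lower = s$lower, upper = s$upper)
```

Each row is a separate 95% interval for the survival probability at that one time.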

Single Confidence Interval

Sometimes you just want the interval for one time, say 1000 days. The summary.survfit function (on-line help) does that, as shown in the last line of the example above.
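A sketch of the single-time interval, using the times argument of summary.survfit. The stand-in aml data (not the Example 11.7 data) are on a much shorter time scale than the example in the text, so the time point 30 below is an arbitrary illustrative choice rather than 1000 days.

```r
library(survival)

# Stand-in data (`aml`), not the Example 11.7 data.
fit <- survfit(Surv(time, status) ~ 1, data = aml)

# Estimate and pointwise interval at one chosen time.
s30 <- summary(fit, times = 30)
c(time = s30$time, surv = s30$surv, lower = s30$lower, upper = s30$upper)
```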

Hypothesis Test

The log-rank or Mantel-Haenszel test of whether there is a difference between two or more survival curves is performed using the survdiff function in the survival library in R (on-line help).

Example 11.7 in Hollander and Wolfe.
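A sketch of the test call, again with the aml data set from the survival package standing in for the Example 11.7 data, so the P-value it produces is not the one in the summary that follows. The grouping variable x in aml defines the two curves being compared.

```r
library(survival)

# Stand-in data (`aml`), not the Example 11.7 data.
test <- survdiff(Surv(time, status) ~ x, data = aml)
test

# Two-tailed P-value from the chi-square statistic,
# with (number of groups - 1) degrees of freedom.
df <- length(test$n) - 1
p_two <- pchisq(test$chisq, df = df, lower.tail = FALSE)
p_two
```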

Summary

P = 0.00115 (Mantel-Haenszel test).

Comment

The reason this disagrees with the book (Hollander and Wolfe, Section 11.7, page 553) is that Hollander and Wolfe do a one-tailed test, and the survdiff function only does two-tailed tests.

Of course, one can always convert between the two, since the two-tailed P-value is twice the one-tailed P-value. Indeed, Hollander and Wolfe's P-value is half of R's.
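The conversion can be checked numerically. The sketch below starts from the two-tailed P-value reported in the summary and assumes the observed difference is in the direction of the one-sided alternative (otherwise halving is not appropriate). With one degree of freedom, the chi-square statistic is the square of a standard normal statistic, which is why the two tails add up.

```r
p_two <- 0.00115        # two-tailed P-value reported in the summary
p_one <- p_two / 2      # one-tailed P-value, assuming the hypothesized direction

# Normal statistic with upper-tail area p_one.
z <- qnorm(p_one, lower.tail = FALSE)

# Its square, referred to chi-square with 1 df, recovers the two-tailed value.
pchisq(z^2, df = 1, lower.tail = FALSE)
```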
