Confidence Intervals


In Chapter 7 Wild and Seber call these ``two standard error intervals,'' but in Chapter 8 we find out they are really called confidence intervals.

Interval Estimates and Point Estimates

The thingies discussed in this section are called interval estimates. By contrast, the estimates previously discussed, like $\bar{x}$ and $\hat{p}$, are called point estimates.

Large Sample (Approximate) Intervals

For any point estimate having an approximately normal sampling distribution

$\displaystyle \text{point estimate} \pm 2 \mathop{\rm se}\nolimits (\text{point estimate})$

is an approximate 95% confidence interval for the parameter that the point estimate estimates.

Generically,

$\displaystyle \hat{\theta} \pm 2 \mathop{\rm se}\nolimits (\hat{\theta})$

is an approximate 95% confidence interval for $\theta$ .

And

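$\begin{displaymath} \begin{split} \bar{x} & \pm 2 \mathop{\rm se}\nolimits (\bar{x}) \\ \hat{p} & \pm 2 \mathop{\rm se}\nolimits (\hat{p}) \\ \bar{x}_1 - \bar{x}_2 & \pm 2 \mathop{\rm se}\nolimits (\bar{x}_1 - \bar{x}_2) \\ \hat{p}_1 - \hat{p}_2 & \pm 2 \mathop{\rm se}\nolimits (\hat{p}_1 - \hat{p}_2) \end{split}\end{displaymath}$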

are approximate 95% confidence intervals for the parameters that the point estimates estimate, when the appropriate conditions are satisfied.

The sample size is large (both sample sizes are large in the two-sample cases).
In the two-sample cases, the samples are independent.

Plugging in the formulas for the standard errors,

$\displaystyle \bar{x} \pm 2 \frac{s_X}{\sqrt{n}}$

is an approximate 95% confidence interval for $\mu$ ,

$\displaystyle \hat{p} \pm 2 \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}$

is an approximate 95% confidence interval for $p$,

$\displaystyle \bar{x}_1 - \bar{x}_2 \pm 2 \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

is an approximate 95% confidence interval for $\mu_1 - \mu_2$ , and

$\displaystyle \hat{p}_1 - \hat{p}_2 \pm 2 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$

is an approximate 95% confidence interval for $p_1 - p_2$.
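As a rough illustration of the four formulas above, here is an R sketch that computes each interval. The helper names (`mean_ci`, `prop_ci`, `mean_diff_ci`, `prop_diff_ci`) and all the data are made up for this example; none of them come from Wild and Seber.

```r
# Hypothetical helpers (not from the text): 2 s.e. intervals for large samples.

# One-sample interval for a mean: xbar +/- 2 * s / sqrt(n)
mean_ci <- function(x) {
    se <- sd(x) / sqrt(length(x))
    mean(x) + c(-2, 2) * se
}

# One-sample interval for a proportion: phat +/- 2 * sqrt(phat (1 - phat) / n)
prop_ci <- function(successes, n) {
    phat <- successes / n
    se <- sqrt(phat * (1 - phat) / n)
    phat + c(-2, 2) * se
}

# Two-sample interval for a difference of means (independent samples)
mean_diff_ci <- function(x1, x2) {
    se <- sqrt(var(x1) / length(x1) + var(x2) / length(x2))
    mean(x1) - mean(x2) + c(-2, 2) * se
}

# Two-sample interval for a difference of proportions (independent samples)
prop_diff_ci <- function(s1, n1, s2, n2) {
    p1 <- s1 / n1
    p2 <- s2 / n2
    se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    p1 - p2 + c(-2, 2) * se
}

# Made-up data, for illustration only
set.seed(42)
x1 <- rnorm(200, mean = 10, sd = 3)
x2 <- rnorm(150, mean = 9, sd = 4)
print(mean_ci(x1))
print(mean_diff_ci(x1, x2))
print(prop_ci(120, 400))
print(prop_diff_ci(120, 400, 90, 350))
```

Each function returns the lower and upper endpoints as a vector of length two. The sample sizes here are in the hundreds, so the large-sample conditions are plausibly met.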

Note:

For confidence levels other than 95% see Section 1.4.4 below.

Small Sample (Exact) Intervals

Nothing in the preceding section is useful for small samples. For proportions there is no small sample theory. But for means there is. In Chapter 7 we only do the one-sample case (the two-sample case will come later).

The confidence interval for $\mu$ in the preceding section was derived from the fact that

$\displaystyle T = \frac{\overline{X} - \mu}{S_X / \sqrt{n}} \approx \text{Normal}(0, 1)$

(1)

where the ``double wiggle'' sign means ``approximately distributed as.'' An exact (small sample) confidence interval for $\mu$ can be derived from the analogous fact that

$\displaystyle T = \frac{\overline{X} - \mu}{S_X / \sqrt{n}} \sim \text{Student}(n - 1)$

(2)

where the ``single wiggle'' sign means ``exactly distributed as'' and $\text{Student}(\nu)$ means the Student's $t$-distribution with $\nu$ degrees of freedom.

The difference between the two theories is that

(1) holds (approximately) regardless of the population distribution for sufficiently large sample size $n$.
(2) holds (exactly) for a normal population distribution regardless of the sample size $n$.

The relation between (1) and the confidence interval is that the two equations

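$\begin{displaymath} \begin{split} \mathop{\rm pr}\nolimits \left( -2 < \frac{\overline{X} - \mu}{S_X / \sqrt{n}} < 2 \right) & \approx 0.95 \\ \mathop{\rm pr}\nolimits \left( \overline{X} - 2 \frac{S_X}{\sqrt{n}} < \mu < \overline{X} + 2 \frac{S_X}{\sqrt{n}} \right) & \approx 0.95 \end{split}\end{displaymath}$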

are equivalent, and the latter is the claim made for the confidence interval.
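The equivalence is just algebra on the inequalities inside the probability. A sketch of the steps, using the same symbols as (1) and the fact that $S_X / \sqrt{n}$ is positive:

```latex
\begin{align*}
-2 < \frac{\overline{X} - \mu}{S_X / \sqrt{n}} < 2
& \iff -2 \frac{S_X}{\sqrt{n}} < \overline{X} - \mu < 2 \frac{S_X}{\sqrt{n}}
\\
& \iff -\overline{X} - 2 \frac{S_X}{\sqrt{n}} < -\mu < -\overline{X} + 2 \frac{S_X}{\sqrt{n}}
\\
& \iff \overline{X} - 2 \frac{S_X}{\sqrt{n}} < \mu < \overline{X} + 2 \frac{S_X}{\sqrt{n}}
\end{align*}
```

The last step multiplies through by $-1$, which reverses both inequalities. Since the events are the same, so are their probabilities.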

Hence, in order to get exact (not approximate) confidence intervals assuming a normal population distribution, we only need to substitute for 2 the number $t$ such that

$\displaystyle \mathop{\rm pr}\nolimits ( -t < T < t) = 0.95$

(3)

where $T \sim \text{Student}(n - 1)$. This $t$ is called the $t$ critical value for 95% confidence and is different for each sample size $n$.

The critical values for 95% confidence are given in the column headed 0.025 of Appendix 6 in Wild and Seber or by either of the R commands


qt(0.975, n - 1)
- qt(0.025, n - 1)

where n is the sample size (so n - 1 is the degrees of freedom).

For example, if $n = 10$, then

$\displaystyle \bar{x} \pm 2.262 \frac{s_X}{\sqrt{n}}$

is an exact 95% confidence interval for $\mu$, and if $n = 6$, then

$\displaystyle \bar{x} \pm 2.571 \frac{s_X}{\sqrt{n}}$

is an exact 95% confidence interval for $\mu$ .
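To see the recipe in action, here is a minimal R sketch that computes the exact interval by hand and compares it with R's built-in `t.test`. The helper name `t_ci` and the ten data values are invented for the example.

```r
# Hypothetical helper: exact interval for mu, assuming a normal population.
t_ci <- function(x, level = 0.95) {
    n <- length(x)
    alpha <- 1 - level
    crit <- qt(1 - alpha / 2, df = n - 1)   # t critical value
    mean(x) + c(-1, 1) * crit * sd(x) / sqrt(n)
}

# Made-up sample of size n = 10, for illustration only
x <- c(4.9, 5.6, 5.1, 4.4, 6.0, 5.3, 4.7, 5.8, 5.2, 5.0)

print(qt(0.975, df = length(x) - 1))    # critical value for n = 10
print(t_ci(x))
print(as.numeric(t.test(x)$conf.int))   # should agree with t_ci(x)
```

The two printed intervals should match, since `t.test` uses the same Student $t$ critical value with $n - 1$ degrees of freedom.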

Warning:

These intervals are exact only if the population distribution is exactly normal.
If the population distribution is close to but not exactly normal, then these intervals are approximate (their actual coverage probability is near their nominal 95% coverage probability).
If the population distribution is nowhere near normal, then these intervals are totally bogus.
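The warning can be checked by simulation. The following R sketch estimates the actual coverage of the nominal 95% interval for $n = 10$, once for a normal population and once for a heavily skewed one. The function name `coverage`, the choice of a lognormal population with `sdlog = 2`, and the number of simulations are arbitrary choices for illustration, not anything from the text.

```r
# Estimate the actual coverage of the nominal 95% t interval by simulation.
# rpop draws a sample of size n; mu is the true population mean.
coverage <- function(rpop, mu, n, nsim = 10000) {
    hits <- replicate(nsim, {
        x <- rpop(n)
        half <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
        abs(mean(x) - mu) < half   # TRUE when the interval contains mu
    })
    mean(hits)
}

set.seed(1)
n <- 10

# Normal population: coverage should be close to 0.95
cov_normal <- coverage(function(n) rnorm(n), mu = 0, n = n)
print(cov_normal)

# Heavily skewed population: lognormal with meanlog = 0 and sdlog = 2,
# whose mean is exp(2^2 / 2) = exp(2)
cov_skewed <- coverage(function(n) rlnorm(n, sdlog = 2), mu = exp(2), n = n)
print(cov_skewed)
```

The first number should land near 0.95 and the second well below it, which is what ``totally bogus'' means in practice.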

Note:

For confidence levels other than 95% see Section 1.4.4 below.

Different Confidence Levels

For confidence levels other than 95%, just change the 0.95 in (3) to some other number.

To get $100 (1 - \alpha) \%$ confidence, the $t$ critical value is the $1 - \alpha / 2$ quantile of the $\text{Student}(n - 1)$ distribution or, equivalently, minus the $\alpha / 2$ quantile of the $\text{Student}(n - 1)$ distribution.

Thus

confidence level	column of Appendix 6 headed
90%	0.05
95%	0.025
99%	0.005

and so forth.
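In R the same lookup can be done with `qt`. A small sketch, where the helper name `t_crit` and the sample size $n = 10$ are assumptions made for the example:

```r
# Hypothetical helper: t critical value for a confidence level and sample size
t_crit <- function(level, n) {
    alpha <- 1 - level
    qt(1 - alpha / 2, df = n - 1)
}

n <- 10   # arbitrary sample size, for illustration only
conf_levels <- c(0.90, 0.95, 0.99)
crit <- sapply(conf_levels, t_crit, n = n)

# tail_area is the column heading to look under in a t table
print(data.frame(level = conf_levels,
                 tail_area = (1 - conf_levels) / 2,
                 critical = crit))

# The same values as minus the alpha / 2 quantile
print(-qt((1 - conf_levels) / 2, df = n - 1))
```

The `tail_area` column reproduces the 0.05, 0.025, 0.005 headings in the table above.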

Approximate Large-Sample Intervals

The same trick works for large-sample intervals based on the approximate normality of the sampling distribution of a point estimate. Just use the Normal distribution instead of the Student distribution. This is the Student's $t$-distribution with ``infinity degrees of freedom'' in the bottom row of Appendix 6 in Wild and Seber. Hence

confidence level	$z$ critical value
90%	1.645
95%	1.960
99%	2.576

(We call it a $z$ critical value rather than a $t$ critical value because a standard normal random variable is often denoted $Z$.)

Note also that a finicky person uses 1.96 s. e. intervals rather than 2 s. e. intervals for 95% confidence (not that it really matters, it's only approximate anyway).
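The normal critical values in the table can be reproduced with `qnorm`. This is only a quick sketch; the variable names are made up for the example.

```r
conf_levels <- c(0.90, 0.95, 0.99)

# z critical value: the 1 - alpha / 2 quantile of the standard normal
z_crit <- qnorm(1 - (1 - conf_levels) / 2)
print(round(z_crit, 3))

# Student t with infinite degrees of freedom is the standard normal
t_inf <- qt(1 - (1 - conf_levels) / 2, df = Inf)
print(round(t_inf, 3))
```

Both lines should print the same three numbers, matching the table to three decimal places.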


Charles Geyer
2000-10-30