Next: About this document ... Up: Review of Chapter 7 in Previous: Large Sample Theory

Confidence Intervals

In Chapter 7 Wild and Seber call these ``two standard error intervals,'' but in Chapter 8 we find out they are really called confidence intervals.

Interval Estimates and Point Estimates

The thingies discussed in this section are called interval estimates. For contrast, the estimates previously discussed, like $ \bar{x}$ and $ \hat{p}$, are called point estimates.

Large Sample (Approximate) Intervals

For any point estimate having an approximately normal sampling distribution

$\displaystyle \text{point estimate} \pm 2 \mathop{\rm se}\nolimits (\text{point estimate})
$

is an approximate 95% confidence interval for the parameter that the point estimate estimates.

Generically,

$\displaystyle \hat{\theta} \pm 2 \mathop{\rm se}\nolimits (\hat{\theta})
$

is an approximate 95% confidence interval for $ \theta$.

And

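\begin{displaymath}
\begin{split}
\bar{x} & \pm 2 \mathop{\rm se}\nolimits (\bar{x})
\\
\hat{p} & \pm 2 \mathop{\rm se}\nolimits (\hat{p})
\\
\bar{x}_1 - \bar{x}_2 & \pm 2 \mathop{\rm se}\nolimits (\bar{x}_1 - \bar{x}_2)
\\
\hat{p}_1 - \hat{p}_2 & \pm 2 \mathop{\rm se}\nolimits (\hat{p}_1 - \hat{p}_2)
\end{split}\end{displaymath}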

are approximate 95% confidence intervals for the parameters that the point estimates estimate, when the appropriate conditions are satisfied.

Plugging in the formulas for the standard errors,

$\displaystyle \bar{x} \pm 2 \frac{s_X}{\sqrt{n}}
$

is an approximate 95% confidence interval for $ \mu$,

$\displaystyle \hat{p} \pm 2 \sqrt{\frac{\hat{p} (1 - \hat{p})}{n}}
$

is an approximate 95% confidence interval for $ p$,

$\displaystyle \bar{x}_1 - \bar{x}_2 \pm 2 \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
$

is an approximate 95% confidence interval for $ \mu_1 - \mu_2$, and

$\displaystyle \hat{p}_1 - \hat{p}_2
\pm
2 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1}
+ \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}
$

is an approximate 95% confidence interval for $ p_1 - p_2$.
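The four formulas above all follow the same recipe, which the following R sketch tries to make concrete. The helper name `two_se_interval` and every number in it are hypothetical, chosen only for illustration; they do not come from Wild and Seber. Note that in the two-sample cases each sample's variance term is divided by that sample's own size.

```r
# Sketch: "estimate plus or minus 2 standard errors" in R.
# All numbers are hypothetical summary statistics, not data from the text.

# Generic two-standard-error interval (helper name is illustrative).
two_se_interval <- function(estimate, se) {
  estimate + c(-2, 2) * se
}

# One mean: n = 100, xbar = 5.2, s_X = 1.5
ci_mean <- two_se_interval(5.2, 1.5 / sqrt(100))

# One proportion: phat = 0.325, n = 400
phat <- 0.325
ci_prop <- two_se_interval(phat, sqrt(phat * (1 - phat) / 400))

# Difference of two means: n1 = 120 with s1 = 2.0, n2 = 150 with s2 = 2.5
ci_diff_mean <- two_se_interval(10.4 - 9.1, sqrt(2.0^2 / 120 + 2.5^2 / 150))

# Difference of two proportions: each term uses its own sample size
p1 <- 0.40; n1 <- 200
p2 <- 0.30; n2 <- 300
ci_diff_prop <- two_se_interval(
  p1 - p2,
  sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
)

print(ci_mean)
print(ci_prop)
print(ci_diff_mean)
print(ci_diff_prop)
```

Each result is a vector of length two holding the lower and upper endpoints of the interval.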

Note:

For confidence levels other than 95% see Section 1.4.4 below.

Small Sample (Exact) Intervals

Nothing in the preceding section is useful for small samples. For proportions there is no small sample theory. But for means there is. In Chapter 7 we only do the one-sample case (the two-sample case will come later).

The confidence interval for $ \mu$ in the preceding section was derived from the fact that

$\displaystyle T = \frac{\overline{X} - \mu}{S_X / \sqrt{n}} \approx \text{Normal}(0, 1)$ (1)

where the ``double wiggle'' sign means ``approximately distributed as.'' An exact (small sample) confidence interval for $ \mu$ can be derived from the analogous fact that

$\displaystyle T = \frac{\overline{X} - \mu}{S_X / \sqrt{n}} \sim \text{Student}(n - 1)$ (2)

where the ``single wiggle'' sign means ``exactly distributed as'' and Student$ (d)$ means the Student's $ t$-distribution with $ d$ degrees of freedom.

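The difference between the two theories is that (1) holds only approximately, and only for large $ n$, whereas (2) holds exactly for any $ n \ge 2$ but only when the population distribution is normal.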

The relation between (1) and the confidence interval is that the two equations

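\begin{displaymath}
\begin{split}
\mathop{\rm pr}\nolimits \left( -2 < \frac{\overline{X} - \mu}{S_X / \sqrt{n}} < 2 \right) & \approx 0.95
\\
\mathop{\rm pr}\nolimits \left( \overline{X} - 2 \frac{S_X}{\sqrt{n}} < \mu < \overline{X} + 2 \frac{S_X}{\sqrt{n}} \right) & \approx 0.95
\end{split}\end{displaymath}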

are equivalent, and the latter is the claim made for the confidence interval.
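To see why the two statements are equivalent, it may help to rearrange the inequalities inside the probability one step at a time. Multiply through by $ S_X / \sqrt{n}$ (which is positive), subtract $ \overline{X}$, and then multiply by $ -1$, which reverses the inequalities:

\begin{displaymath}
\begin{split}
-2 < \frac{\overline{X} - \mu}{S_X / \sqrt{n}} < 2
& \iff
-2 \frac{S_X}{\sqrt{n}} < \overline{X} - \mu < 2 \frac{S_X}{\sqrt{n}}
\\
& \iff
- \overline{X} - 2 \frac{S_X}{\sqrt{n}} < - \mu < - \overline{X} + 2 \frac{S_X}{\sqrt{n}}
\\
& \iff
\overline{X} - 2 \frac{S_X}{\sqrt{n}} < \mu < \overline{X} + 2 \frac{S_X}{\sqrt{n}}
\end{split}\end{displaymath}

Since all of these describe the same event, they all have the same probability.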

Hence in order to get exact (not approximate) confidence intervals assuming a normal population distribution we only need to substitute for 2 the $ t$ such that

$\displaystyle \mathop{\rm pr}\nolimits ( -t < T < t) = 0.95$ (3)

where $ T \sim$   Student$ (n - 1)$. This $ t$ is called the $ t$ critical value for 95% confidence and is different for each sample size $ n$.

The $ t$ critical values for 95% confidence are given in the column headed 0.025 of Appendix 6 in Wild and Seber or by either of the R commands


qt(0.975, n - 1)
- qt(0.025, n - 1)
where n is the sample size (so n - 1 is the degrees of freedom).

For example, if $ n = 10$, then

$\displaystyle \bar{x} \pm 2.262 \frac{s_X}{\sqrt{n}}
$

is an exact 95% confidence interval for $ \mu$, and if $ n = 5$, then

$\displaystyle \bar{x} \pm 2.776 \frac{s_X}{\sqrt{n}}
$

is an exact 95% confidence interval for $ \mu$.
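As a rough illustration of the $ n = 10$ case, the R sketch below computes the critical value with `qt` (R's quantile function for Student's $ t$-distribution), builds the interval by hand, and compares it with the interval reported by R's built-in `t.test`. The ten data values are made up for the example, and the interval is exact only under the assumption that they come from a normal population.

```r
# Hypothetical sample of n = 10 measurements (made up for illustration).
x <- c(9.8, 10.4, 10.1, 9.5, 10.9, 10.2, 9.7, 10.6, 10.0, 10.3)
n <- length(x)

# t critical value for 95% confidence with n - 1 = 9 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

# Exact interval: xbar +/- t_crit * s_X / sqrt(n)
ci_exact <- mean(x) + c(-1, 1) * t_crit * sd(x) / sqrt(n)

# R's one-sample t procedure (default conf.level = 0.95) for comparison
ci_builtin <- as.numeric(t.test(x)$conf.int)

print(round(t_crit, 3))
print(ci_exact)
print(ci_builtin)
```

The hand-built and built-in intervals should agree, because `t.test` uses the same formula for its one-sample confidence interval.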

Warning:

Note:

For confidence levels other than 95% see Section 1.4.4 below.


Different Confidence Levels

For confidence levels other than 95%, just change the 0.95 in (3) to some other number.

To get $ 100 (1 - \alpha) \%$ confidence, the $ t$ critical value is the $ 1 - \alpha / 2$ quantile of the Student$ (n - 1)$ distribution or, equivalently, minus the $ \alpha / 2$ quantile of the Student$ (n - 1)$ distribution.
Thus

confidence level    column of Appendix 6 headed
90%                 0.05
95%                 0.025
99%                 0.005

and so forth.
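The same lookup can be sketched in R. The function name `t_critical` is hypothetical, and the sample size $ n = 10$ is chosen only as an example; `qt` is R's quantile function for Student's $ t$-distribution.

```r
# t critical value for a given confidence level and sample size.
# (Function name is illustrative, not from the text.)
t_critical <- function(conf_level, n) {
  alpha <- 1 - conf_level
  qt(1 - alpha / 2, df = n - 1)   # the 1 - alpha/2 quantile of Student(n - 1)
}

conf_levels <- c(0.90, 0.95, 0.99)
crit <- t_critical(conf_levels, n = 10)

# Equivalent form: minus the alpha/2 quantile
crit_alt <- -qt((1 - conf_levels) / 2, df = 10 - 1)

print(data.frame(conf_level = conf_levels, t_critical = round(crit, 3)))
```

Both forms give the same numbers because the $ t$-distribution is symmetric about zero.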

Approximate Large-Sample Intervals

The same trick works for large-sample intervals based on the approximate normality of the sampling distribution of a point estimate. Just use the Normal$ (0, 1)$ distribution instead of the Student$ (n - 1)$ distribution. This is the Student's $ t$-distribution with ``infinity degrees of freedom'' in the bottom row of Appendix 6 in Wild and Seber. Hence

confidence level    $ z$ critical value
90%                 1.645
95%                 1.960
99%                 2.576
(We call it a $ z$ critical value rather than a $ t$ critical value because a standard normal random variable is often denoted $ Z$.)
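These $ z$ critical values can be reproduced in R with `qnorm`, the standard normal quantile function. The short sketch below also checks the ``infinity degrees of freedom'' remark by passing `df = Inf` to `qt`; the variable names are only illustrative.

```r
# z critical values for 90%, 95%, and 99% confidence
conf_levels <- c(0.90, 0.95, 0.99)
alpha <- 1 - conf_levels
z_crit <- qnorm(1 - alpha / 2)

# Student's t with infinite degrees of freedom is the standard normal
t_inf <- qt(1 - alpha / 2, df = Inf)

print(round(z_crit, 3))
print(round(t_inf, 3))
```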

Note also that a finicky person uses 1.96 s.e. intervals rather than 2 s.e. intervals for 95% confidence (not that it really matters; it's only approximate anyway).


Charles Geyer
2000-10-30