
Stat 5101 (Geyer) Midterm 1

Problem 1

The correlation is given by

\begin{displaymath}\mathop{\rm cor}\nolimits(X, Y) = \frac{\mathop{\rm cov}\nolimits(X, Y)}{\sqrt{\mathop{\rm var}\nolimits(X) \mathop{\rm var}\nolimits(Y)}}
\end{displaymath}

So in order to calculate this we need to first do two subproblems. The second is simpler. By the addition rule for variances of uncorrelated random variables (the variance of a sum is the sum of the variances)

\begin{displaymath}\mathop{\rm var}\nolimits(Y) = \mathop{\rm var}\nolimits(X + Z) = \mathop{\rm var}\nolimits(X) + \mathop{\rm var}\nolimits(Z) = 2 \sigma^2
\end{displaymath}

For the first, we need the rule for taking a sum outside a covariance and the rule $\mathop{\rm cov}\nolimits(X, X) = \mathop{\rm var}\nolimits(X)$

\begin{displaymath}\mathop{\rm cov}\nolimits(X, Y) = \mathop{\rm cov}\nolimits(X, X + Z) = \mathop{\rm cov}\nolimits(X, X) + \mathop{\rm cov}\nolimits(X, Z) = \mathop{\rm var}\nolimits(X) = \sigma^2
\end{displaymath}

Thus

\begin{displaymath}\mathop{\rm cor}\nolimits(X, Y) = \frac{\sigma^2}{\sqrt{\sigma^2 \cdot 2 \sigma^2}}
= \frac{1}{\sqrt{2}}
\end{displaymath}
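As a sanity check, the result can be verified exactly on a small example. The sketch below assumes a hypothetical distribution not taken from the problem: X and Z independent, each uniform on {0, 1, 2}. It computes the variances and the covariance by brute-force enumeration with exact fractions. Any other pair of independent variables with a common variance should give the same squared correlation.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example distribution (an assumption, not from the exam):
# X and Z independent, each uniform on {0, 1, 2}.
VALUES = [0, 1, 2]

def expectation(g):
    """E[g(X, Z)], computed exactly by enumerating all equally likely pairs."""
    p = Fraction(1, len(VALUES) ** 2)
    return sum(p * g(x, z) for x, z in product(VALUES, VALUES))

def cov(g, h):
    """cov(g(X, Z), h(X, Z)) = E(GH) - E(G) E(H)."""
    return expectation(lambda x, z: g(x, z) * h(x, z)) - expectation(g) * expectation(h)

X = lambda x, z: x          # the random variable X
Y = lambda x, z: x + z      # the random variable Y = X + Z

var_x = cov(X, X)
var_y = cov(Y, Y)
cov_xy = cov(X, Y)

# Work with the squared correlation so everything stays an exact fraction.
cor_squared = cov_xy ** 2 / (var_x * var_y)
print(var_x, var_y, cov_xy, cor_squared)  # 2/3 4/3 2/3 1/2
```

Here the variance of Y is twice the variance of X, the covariance equals the variance of X, and the squared correlation is 1/2, in agreement with $1 / \sqrt{2}$.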

A Comment

You can if you like always use the formulas
\begin{align*}\mathop{\rm var}\nolimits(X) & = E(X^2) - E(X)^2 \\
\mathop{\rm cov}\nolimits(X, Y) & = E(X Y) - E(X) E(Y)
\end{align*}
to calculate variances and covariances. They are, after all, valid formulas, so, if you don't make any mistakes, you can use them to get correct answers.

But using these formulas is a bad idea in a problem like this. Note that the easy solution we give here does not involve means at all! If you do the problem the easy way, you cannot get an incorrect answer involving $\mu$.

If you do the problem using the formulas involving means and you make a mistake, the means won't cancel out. It's just much simpler if you use the rules for variances and covariances that don't involve means.

\begin{displaymath}\mathop{\rm var}\nolimits\left(\sum_{i = 1}^n X_i\right) = \sum_{i = 1}^n \mathop{\rm var}\nolimits(X_i)
\end{displaymath}

if (as in this problem) the $X_i$ are independent, and

\begin{displaymath}\mathop{\rm cov}\nolimits\left(\sum_{i = 1}^m X_i, \sum_{j = 1}^n Y_j\right) = \sum_{i = 1}^m \sum_{j = 1}^n \mathop{\rm cov}\nolimits(X_i, Y_j)
\end{displaymath}

Problem 2

This is a Bayes rule problem. The conditional probability in question can be calculated from the Bayes rule formula

\begin{displaymath}P(A \vert B) = \frac{P(B \vert A) P(A)}{P(B \vert A) P(A) + P(B \vert A^c) P(A^c)}
\end{displaymath}

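Since the conditional probability wanted is $P(\text{HIV infection} \mid \text{positive test})$, we define
\begin{align*}A & = \text{HIV infection} \\
B & = \text{positive test}
\end{align*}
Note that then we have
\begin{align*}A^c & = \text{no HIV infection} \\
B^c & = \text{negative test}
\end{align*}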

The probabilities given in the problem statement are
\begin{align*}P(A) & = 0.03 \\
P(B^c \vert A) & = 0.02 \\
P(B \vert A^c) & = 0.04
\end{align*}

From this we can calculate the other quantities needed for the formula
\begin{align*}P(A^c) & = 1 - P(A) = 0.97 \\
P(B \vert A) & = 1 - P(B^c \vert A) = 0.98
\end{align*}
The second line here is a bit tricky. To see it, you must remember that a conditional probability is just an ordinary probability considered as a function of the variable in front of the bar with the variable behind the bar fixed. Thus the complement rule involves complementing the event in front of the bar.

Plugging into the formula gives

\begin{displaymath}P(A \vert B)
=
\frac{0.98 \times 0.03}{0.98 \times 0.03 + 0.04 \times 0.97}
=
\frac{294}{682}
=
0.431085
\end{displaymath}
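The arithmetic is easy to check by machine. The following is a minimal sketch using exact fractions; the function name `posterior` and its parameter names are our own choices for illustration.

```python
from fractions import Fraction

def posterior(prior, p_b_given_a, p_b_given_not_a):
    """Bayes rule: P(A | B) from P(A), P(B | A), and P(B | A^c)."""
    numerator = p_b_given_a * prior
    denominator = numerator + p_b_given_not_a * (1 - prior)
    return numerator / denominator

# The three numbers plugged into the formula above.
p_a = Fraction(3, 100)              # P(A)
p_b_given_a = Fraction(98, 100)     # P(B | A)
p_b_given_not_a = Fraction(4, 100)  # P(B | A^c)

result = posterior(p_a, p_b_given_a, p_b_given_not_a)
print(result)                   # 147/341, which is 294/682 in lowest terms
print(round(float(result), 6))  # 0.431085
```

Using `Fraction` rather than floating point avoids any rounding in the intermediate products, so the printed fraction can be compared directly with the hand calculation.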

Problem 3

(a)

This distribution is symmetric about zero. Hence the mean, if it exists, is the center of symmetry, zero. It does exist, since $X$ is bounded ($\vert X\vert \le 1$) and every bounded random variable has an expectation.

(b)

The center of symmetry, zero, is also the median.

(c)

By definition, the c. d. f. is

\begin{displaymath}F(x) = P(X \le x) = \int_{- \infty}^x f(u) \, d u.
\end{displaymath}

In practice, we must take into account the support of X, which is the interval (-1, +1). Clearly X cannot be less than -1 or greater than +1. Hence for $x \le -1$

\begin{displaymath}F(x) = P(X \le x) = 0
\end{displaymath}

and for $x \ge +1$

\begin{displaymath}F(x) = P(X \le x) = 1 - P(X > x) = 1
\end{displaymath}

This is true for any random variable with bounded support: $F(x) = 0$ for all $x$ below the support and $F(x) = 1$ for all $x$ above the support.

The only remaining task is to find the functional form of F on the support. Now for -1 < x < + 1

\begin{displaymath}F(x)
=
\int_{- \infty}^x f(u) \, d u
=
\int_{- \infty}^{-1} f(u) \, d u + \int_{- 1}^x f(u) \, d u.
\end{displaymath}

and the first term on the right is zero, because f(x) = 0 for x not in the support. Thus

\begin{displaymath}F(x)
=
\int_{- 1}^x f(u) \, d u
=
\left. \frac{3 u - u^3}{4} \right\vert _{-1}^x
=
\frac{3 x - x^3 + 2}{4}
\end{displaymath}

Summing up

\begin{displaymath}F(x) = \begin{cases}0, & x \le -1 \\
\tfrac{1}{4} (3 x - x^3 + 2), & -1 < x < +1 \\
1, & x \ge +1
\end{cases}\end{displaymath}
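The piecewise formula can be checked against direct numerical integration. The sketch below assumes the density is $f(x) = \tfrac{3}{4} (1 - x^2)$ on (-1, +1), which is what the antiderivative used above implies. The helper `integrate` is a plain midpoint rule written for this illustration, not a library routine.

```python
def density(x):
    """Assumed p.d.f.: (3/4)(1 - x^2) on (-1, 1), zero off the support."""
    return 0.75 * (1.0 - x * x) if -1.0 < x < 1.0 else 0.0

def cdf(x):
    """The piecewise c.d.f. derived above."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return (3.0 * x - x ** 3 + 2.0) / 4.0

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Compare the formula with the integral of the density from well below
# the support up to x, for points below, inside, and above the support.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(x, cdf(x), round(integrate(density, -3.0, x), 6))
```

Note that `cdf(0.0)` is exactly 0.5, which agrees with part (b): zero is the median.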

Problem 4

This is a job for the ``change of variable'' theorem (Theorem 8 of Section 3.5 in the textbook). The transformation is $Y = g(X)$ with $g(x) = \sqrt{x}$. This transformation is invertible when $x$ is restricted to the positive half line.

To simplify notation, write, as we did in class, $h = g^{-1}$, so that $h(y) = y^2$ and $h'(y) = 2 y$. Then the change of variable theorem says
\begin{align*}f_Y(y)
& =
f_X[h(y)] \cdot \vert h'(y)\vert
\\
& =
f_X[y^2] \cdot 2 y
\\
& =
2 y \cdot \tfrac{1}{2} (y^2)^2 e^{- y^2}
\\
& =
y^5 e^{- y^2}
\end{align*}
To be precise, we should add the domain of definition

\begin{displaymath}f_Y(y) = y^5 e^{- y^2}, \qquad y > 0.
\end{displaymath}
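One way to check a change of variable is to compare probabilities: since $Y = \sqrt{X}$, we must have $P(Y \le a) = P(X \le a^2)$ for every $a > 0$. The sketch below does this numerically, assuming $f_X(x) = \tfrac{1}{2} x^2 e^{- x}$ for $x > 0$, which is the density appearing in the derivation above. The helper `integrate` is a simple midpoint rule written for this illustration.

```python
import math

def f_x(x):
    """Assumed density of X: (1/2) x^2 e^{-x} for x > 0."""
    return 0.5 * x * x * math.exp(-x) if x > 0 else 0.0

def f_y(y):
    """Derived density of Y = sqrt(X): y^5 e^{-y^2} for y > 0."""
    return y ** 5 * math.exp(-y * y) if y > 0 else 0.0

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# P(Y <= a) computed from f_Y should match P(X <= a^2) computed from f_X.
for a in (0.5, 1.0, 2.0):
    p_y = integrate(f_y, 0.0, a)
    p_x = integrate(f_x, 0.0, a * a)
    print(a, round(p_y, 6), round(p_x, 6))

# The derived density should also integrate to one; the tail beyond
# y = 10 is negligible because of the factor e^{-y^2}.
total = integrate(f_y, 0.0, 10.0)
print(round(total, 6))
```

If the Jacobian factor $\vert h'(y)\vert = 2 y$ were forgotten, the two columns would disagree and the total would not be one, so this catches the most common mistake in such problems.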

Problem 5

This is a job for the ``conditional expectation as renormalization'' formula given in the notes

\begin{displaymath}E(Y \vert X = x) = \frac{\int y f(x, y) \, d y}{\int \hphantom{y} f(x, y) \, d y}
\end{displaymath}

Note that the variable of integration (1) is the same in the numerator and denominator and (2) is $y$, leaving the result a function of $x$ and not $y$, in accordance with the rule that a conditional expectation is a function of the variables behind the bar and not a function of those in front.

Note that the normalization constant 1 / 6 is irrelevant, cancelling out of the numerator and denominator. We could just as well use

\begin{displaymath}E(Y \vert X = x) = \frac{\int y h(x, y) \, d y}{\int \hphantom{y} h(x, y) \, d y}
\end{displaymath}

where

\begin{displaymath}h(x, y) = (x + y)^2 e^{- x - y}
\end{displaymath}

is an unnormalized version of the p. d. f.

The numerator and denominator are similar, both of the form

\begin{displaymath}\int_0^\infty y^k (x + y)^2 e^{- x - y} \, d y,
\end{displaymath}

the numerator being the case k = 1 and the denominator the case k = 0. Thus we can do both at once.
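\begin{align*}\int_0^\infty y^k (x + y)^2 e^{- x - y} \, d y
& =
\int_0^\infty y^k (x^2 + 2 x y + y^2) e^{- x - y} \, d y
\\
& =
\int_0^\infty (x^2 y^k + 2 x y^{k + 1} + y^{k + 2}) e^{- x} e^{- y} \, d y
\\
& =
e^{- x} \left[ x^2 \int_0^\infty y^k e^{- y} \, d y + 2 x \int_0^\infty y^{k + 1} e^{- y} \, d y + \int_0^\infty y^{k + 2} e^{- y} \, d y \right]
\\
& =
[k! \cdot x^2 + 2 \cdot (k + 1)! \cdot x + (k + 2)!] \cdot e^{- x}
\end{align*}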
All three integrals in the third line can be evaluated using the hint.

Plugging in k = 0 gives

\begin{displaymath}[x^2 + 2 x + 2] e^{- x}
\end{displaymath}

for the denominator, and plugging in k = 1 gives

\begin{displaymath}[x^2 + 4 x + 6] e^{- x}
\end{displaymath}

for the numerator. The factors $e^{- x}$ cancel, giving

\begin{displaymath}E(Y \vert X = x) = \frac{x^2 + 4 x + 6}{x^2 + 2 x + 2}
\end{displaymath}

or

\begin{displaymath}E(Y \vert X) = \frac{X^2 + 4 X + 6}{X^2 + 2 X + 2}
\end{displaymath}
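The closed form can be compared with the renormalization formula evaluated numerically. The sketch below uses the unnormalized density $(x + y)^2 e^{- x - y}$ on $y > 0$ and a midpoint rule on a truncated range. The truncation point 60, the helper names, and the grid size are assumptions made for this illustration.

```python
import math

def h(x, y):
    """Unnormalized joint density (x + y)^2 e^{-x - y}."""
    return (x + y) ** 2 * math.exp(-x - y)

def integrate(f, a, b, n=40000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

def cond_mean_numeric(x, upper=60.0):
    """E(Y | X = x) as a ratio of integrals over y, truncated at `upper`."""
    numerator = integrate(lambda y: y * h(x, y), 0.0, upper)
    denominator = integrate(lambda y: h(x, y), 0.0, upper)
    return numerator / denominator

def cond_mean_closed(x):
    """The closed form derived above."""
    return (x * x + 4 * x + 6) / (x * x + 2 * x + 2)

for x in (0.0, 1.0, 2.5):
    print(x, round(cond_mean_numeric(x), 6), round(cond_mean_closed(x), 6))
```

Because only the ratio matters, the constant 1 / 6 never appears in the code, which mirrors the remark that the normalization constant is irrelevant.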


Charles Geyer
1999-10-29