No. | Due Date | Sec. | Exercises | Comments |
---|---|---|---|---|
1 | Wed Jan 29 | 6.2 | 6, 7, 10 | |
 | | 6.3 | 5, 7, 12, 14 | |
2 | Wed Feb 05 | 6.4 | 2, 6, 7, 8 | |
 | | 6.5 | 2, 6, 9, 10 | |
3 | Wed Feb 12 | 6.6 | 2, 6 | |
 | | A | 1 | additional problem number one (see below). |
 | | 7.1 | 2, 4, 6, 8 | |
 | | 7.2 | 6, 10, 11 | |
4 | Wed Feb 17 | 7.3 | 4, 6, 8 | |
 | | 7.4 | 1, 2, 6 | Alternatively, use Problem 6 on the 5101 final exam. |
 | | 7.5 | 2, 4, 6, 11 | |
5 | Wed Mar 05 | 7.6 | 4, 8, 10, 11 | |
 | | 7.7 | 6, 11 | |
 | | A | 2, 3, 4 | additional problems (see below). |
6 | Wed Mar 12 | 7.8 | 2, 4, 6, 14 | |
 | | A | 5, 6, 7, 8, 9, 10 | additional problems (see below). |
7 | Wed Mar 26 | 8.1 | 1, 4, 15 | |
 | | 8.5 | 2, 12, 14 | |
 | | 8.6 | 2, 3, 7 | |
 | | A | 11 | additional problems (see below). |
8 | Wed Apr 02 | 8.7 | 2, 4, 7 | For 7 also find the P-value of the test. See the page about F tests. |
 | | 9.1 | 4, 7, 8 | |
9 | Wed Apr 16 | 9.2 | 2, 6 | |
 | | 9.3 | 5 | |
 | | 9.4 | 2 | |
 | | 9.6 | 4, 9 | data are in http://www.stat.umn.edu/geyer/old03/5102/examp/ds9-7.4.txt and http://www.stat.umn.edu/geyer/old03/5102/examp/ds9-7.9.txt. The answer in the back of the book uses the large sample approximation. R doesn't. So R doesn't give the same answer unless you say ks.test(x, y, exact = FALSE) . |
 | | A | 12 | additional problems (see below). |
10 | Wed Apr 23 | A | 13, 14 | additional problems (see below). |
 | | 10.1 | 4, 6, 7 | the data for 7 are in the file http://www.stat.umn.edu/geyer/old03/5102/examp/ds10-1-3.txt |
 | | 10.2 | 12, 16 | for 16 give the 95% prediction interval rather than M.S.E. |
11 | Wed Apr 30 | 10.3 | 10, 11 | the data for 10 and 11 are in the file http://www.stat.umn.edu/geyer/old03/5102/examp/ds10-9.txt |
 | | A | 15, 16, 17, 18, 19 | additional problems (see below). |
12 | Fri May 9 | 10.6 | 10 | the data for 10 are in the file http://www.stat.umn.edu/geyer/old03/5102/examp/ds10-18.txt |
 | | 10.7 | 14, 15 | the data for 14 and 15 are in the file http://www.stat.umn.edu/geyer/old03/5102/examp/ds10-24.txt |
 | | 10.8 | 11, 12, 13 | the data for 11, 12, and 13 are in the file http://www.stat.umn.edu/geyer/old03/5102/examp/ds10-29.txt |
 | | A | 20, 21 | additional problems (see below). |
1. Like the example of maximum likelihood done by computer except instead of the gamma scale model, we will use the Cauchy location model. The likelihood is given by (6.6.7) on p. 366 of DeGroot and Schervish. For data, use the URL
and for a starting point use the sample median rather than the sample mean, that is, median(x) instead of mean(x). The reason for this will become clear later: the sample mean is a very bad estimate of location for the Cauchy distribution.
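A minimal sketch of this kind of fit, not the course's own example. It assumes simulated data in place of the data file (the vector x, the sample size, and the seed are all hypothetical) and uses nlm to minimize minus the log likelihood of the Cauchy location model, starting at the sample median.

```r
# Hypothetical data standing in for the data file (an assumption).
set.seed(42)
x <- rcauchy(50, location = 3)

# Minus the log likelihood for the Cauchy location model (scale fixed at 1).
mlogl <- function(theta) -sum(dcauchy(x, location = theta, log = TRUE))

# Minimize, starting at the sample median rather than the sample mean.
out <- nlm(mlogl, median(x), hessian = TRUE)
theta.hat <- out$estimate

# Asymptotic standard error from the observed Fisher information,
# which is the Hessian of minus the log likelihood at the minimum.
se <- 1 / sqrt(out$hessian[1, 1])
print(c(estimate = theta.hat, std.err = se))
```

Starting at median(x) keeps the optimizer near the main mode of the likelihood, which can have several local maxima for Cauchy data.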
2. Solve the quadratic equation to prove that the interval (2.18) in the handout does indeed have endpoints (2.19) in the handout.
3. Calculate the three kinds of intervals given by equations (2.20), (2.19), and (2.22) in the handout for binomial data with n = 50 and x = 4. Use 95% for the confidence coefficient.
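As a rough illustration only: the handout's equations are not reproduced here, so the sketch below assumes that two of the three intervals are the usual Wald interval and the score (Wilson) interval, the latter being the one whose endpoints come from solving a quadratic equation in p. The variable names are hypothetical.

```r
n <- 50
x <- 4
p.hat <- x / n
z <- qnorm(0.975)

# Wald interval: plug the estimate into the standard error.
wald <- p.hat + c(-1, 1) * z * sqrt(p.hat * (1 - p.hat) / n)

# Score (Wilson) interval: the roots of the quadratic in p obtained from
# |p.hat - p| <= z * sqrt(p * (1 - p) / n).
center <- (p.hat + z^2 / (2 * n)) / (1 + z^2 / n)
halfwidth <- z * sqrt(p.hat * (1 - p.hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
score <- center + c(-1, 1) * halfwidth

print(rbind(wald = wald, score = score))
```

The score interval can be cross-checked against prop.test(x, n, correct = FALSE)$conf.int, which computes the same interval.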
4. Calculate the second and fourth central moments μ₂ and μ₄ in the notation of the handout for the so-called double exponential distribution with density
(note this distribution is symmetric about zero, so the mean is zero and all odd central moments are zero).
Compare the correct asymptotic variance of the sample variance, μ₄ − μ₂², with the incorrect asymptotic variance 2 μ₂² that we would get if we incorrectly assumed the data were normal (see Section 2.10 of the handout).
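A numerical check of the hand calculation might look like the sketch below. The density formula is not reproduced in the text above, so the code assumes the standard form f(x) = exp(−|x|) / 2; if the assignment intends a different scale, the numbers change accordingly. The function names are hypothetical.

```r
# Assumed density (an assumption, not taken from the assignment).
f <- function(x) exp(-abs(x)) / 2

# k-th central moment for even k. The mean is zero and the integrand is
# symmetric, so the moment is twice the integral over (0, Inf).
central.moment <- function(k)
    2 * integrate(function(x) x^k * f(x), 0, Inf)$value

mu2 <- central.moment(2)
mu4 <- central.moment(4)

correct <- mu4 - mu2^2      # asymptotic variance of the sample variance
normal.theory <- 2 * mu2^2  # what assuming normality would give
print(c(mu2 = mu2, mu4 = mu4, correct = correct, normal.theory = normal.theory))
```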
5. Starting with the asymptotic distribution for Sₙ² given on p. 16 of the more on confidence intervals handout, use the delta method to give the asymptotic distribution of Sₙ.
6. Using the method of Section 1.2 of the more on confidence intervals handout, find an exact 95% confidence interval for the mean (not the rate) parameter of an exponential distribution from which it is assumed we have independent and identically distributed data with sample size 15 and sample mean 103.49.
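One way such an interval can be computed is sketched below. It assumes that the method meant is the pivotal-quantity method, which is not confirmed by the text above: for independent and identically distributed exponential data with mean μ, the quantity 2 n x̄ / μ has a chi-square distribution with 2 n degrees of freedom. The variable names are hypothetical.

```r
n <- 15
xbar <- 103.49
alpha <- 1 - 0.95

# Chi-square critical values with 2 * n degrees of freedom.
# The larger critical value gives the lower endpoint for the mean.
crit <- qchisq(c(1 - alpha / 2, alpha / 2), df = 2 * n)

# Invert the pivotal quantity 2 * n * xbar / mu to get an interval for mu.
ci <- 2 * n * xbar / crit
print(ci)
```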
7. Using the method of Section 2.9.2 of the more on confidence intervals handout, find an asymptotic (approximate, large sample) 95% confidence interval for the mean parameter of a Poisson distribution from which it is assumed we have independent and identically distributed data with sample size 50 and sample mean 2.9.
Hint: In order to use plug-in you need a consistent estimator of the standard deviation of the Poisson distribution. What is the standard deviation and what is its relation to the mean? The sample mean consistently estimates the mean parameter. What does that suggest for a consistent estimator of standard deviation?
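The plug-in idea in the hint can be sketched as follows. This is a generic large-sample interval, assumed rather than known to match Section 2.9.2 of the handout exactly; the variable names are hypothetical.

```r
n <- 50
xbar <- 2.9
z <- qnorm(0.975)

# For the Poisson distribution the variance equals the mean, so
# sqrt(xbar) is a plug-in estimate of the standard deviation.
sd.hat <- sqrt(xbar)

# Asymptotic interval: estimate plus or minus z times the standard error.
ci <- xbar + c(-1, 1) * z * sd.hat / sqrt(n)
print(ci)
```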
8. Suppose we have an independent and identically distributed sample from a Geometric(p) distribution with sample size 30 and sample mean 7.8. Find the maximum likelihood estimate of p and a 95% confidence interval for p based on the MLE and either observed or expected Fisher information.
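A sketch of the calculation, under an assumed parameterization: X counts the failures before the first success, so E(X) = (1 − p) / p, which is the convention of R's dgeom. If the geometric distribution is instead taken to count trials, the formulas differ. The variable names are hypothetical.

```r
n <- 30
xbar <- 7.8

# MLE under the assumed parameterization: solve xbar = (1 - p) / p.
p.hat <- 1 / (1 + xbar)

# Expected Fisher information for sample size n, evaluated at the MLE.
expected.info <- n / (p.hat^2 * (1 - p.hat))

# Observed Fisher information: minus the second derivative of the log
# likelihood n * log(p) + n * xbar * log(1 - p), evaluated at the MLE.
observed.info <- n / p.hat^2 + n * xbar / (1 - p.hat)^2

# Asymptotic 95% interval based on the MLE and Fisher information.
ci <- p.hat + c(-1, 1) * qnorm(0.975) / sqrt(expected.info)
print(c(p.hat = p.hat, lower = ci[1], upper = ci[2]))
```

In this model observed and expected Fisher information agree at the MLE, so either choice gives the same interval.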
9. Like the example of multiparameter maximum likelihood done by computer except instead of the gamma scale-rate model, we will use the Cauchy location-scale model. The likelihood is given by
where
The R function dcauchy(x, location = theta, scale = sigma) calculates f(x | θ, σ), returning a vector of values if x is a vector.
For data, use the URL
Method of moments estimators make no sense for the Cauchy distribution because the Cauchy distribution doesn't have any moments. We have to use estimators based on quantiles instead.
For a starting point for theta use the sample median (as we did in additional problem 1). This makes sense because θ is the theoretical median. And for a starting point for the scale parameter sigma use half the sample interquartile range, that is, 0.5 * IQR(x). This makes sense because the theoretical interquartile range is 2 σ.
Report the values you obtain for
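The two-parameter fit described in this problem can be sketched as below. This is not the course's own example: the data are simulated in place of the data file (an assumption), and optim with box constraints is used here to keep the scale parameter positive, which is a choice of this sketch rather than something the assignment specifies.

```r
# Hypothetical data standing in for the data file (an assumption).
set.seed(42)
x <- rcauchy(100, location = 3, scale = 2)

# Minus the log likelihood; par[1] is theta, par[2] is sigma.
mlogl <- function(par)
    -sum(dcauchy(x, location = par[1], scale = par[2], log = TRUE))

# Quantile-based starting values: median and half the interquartile range.
start <- c(median(x), 0.5 * IQR(x))

# Minimize with a lower bound on sigma so it stays positive.
fit <- optim(start, mlogl, method = "L-BFGS-B",
             lower = c(-Inf, 1e-6), hessian = TRUE)

# Asymptotic standard errors from the inverse observed Fisher information.
se <- sqrt(diag(solve(fit$hessian)))
print(rbind(estimate = fit$par, std.err = se))
```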
10. Suppose the variables X₁, X₂, ..., Xₙ, Y₁, Y₂, ..., Yₙ are independent, and suppose the Xᵢ are identically Exponential(θ) distributed and the Yᵢ are identically Exponential(1 / θ) distributed.
mean(x)
and mean(y)
)
and numerically.
mean(x)
and mean(y)
)
and numerically.
11. Basically this is Problem 8.6.10 in DeGroot and Schervish. Use the data in their Table 8.1, which can be read into R with the statements
calcium <- c( 7, -4, 18, 17, -3, -5, 1, 10, 11, -2)
placebo <- c(-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3)
The web page on doing t-tests in R may help.
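For orientation, a two-sample t test on these data can be run as in the sketch below. The equal-variance assumption and the direction of the alternative are assumptions of this sketch; check them against the wording of the problem in the book.

```r
calcium <- c( 7, -4, 18, 17, -3, -5, 1, 10, 11, -2)
placebo <- c(-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3)

# Two-sample t test assuming equal variances (an assumption of this sketch).
# The upper-tailed alternative says the calcium mean exceeds the placebo mean.
out <- t.test(calcium, placebo, var.equal = TRUE, alternative = "greater")

print(out$statistic)  # t statistic
print(out$parameter)  # degrees of freedom
print(out$p.value)    # one-tailed P-value
```

Dropping var.equal = TRUE gives the Welch test, which does not assume equal variances.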
12. For the data in the URL
calculate the following point estimators
13. For the data in the URL
calculate confidence intervals for the center of symmetry (we assume the population distribution is symmetric about some point θ which is the unknown parameter of interest) associated with
having confidence level above 95% and as close to 95% as you can get (this is what the wilcox.test function does by default).
14. For the data in the URL
calculate P-values for an upper tailed test about the center of symmetry (we assume the population distribution is symmetric about some point θ which is the unknown parameter of interest) with null and alternative hypotheses
for each of the following types of test
(note: the t.test and wilcox.test functions do two-tailed tests by default so you must use the optional argument alternative = "greater" to do an upper-tailed test).
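A short sketch of the alternative = "greater" argument in use. The data and the hypothesized value mu0 are hypothetical, since neither the data file nor the hypotheses are reproduced above.

```r
# Hypothetical data and hypothesized center of symmetry (assumptions).
set.seed(42)
x <- rnorm(20, mean = 1)
mu0 <- 0

# Upper-tailed one-sample t test of H0: theta = mu0 against H1: theta > mu0.
t.out <- t.test(x, mu = mu0, alternative = "greater")

# Upper-tailed Wilcoxon signed rank test of the same hypotheses.
w.out <- wilcox.test(x, mu = mu0, alternative = "greater")

print(c(t.test = t.out$p.value, wilcox.test = w.out$p.value))
```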
15. For the data in the URL
which contains two variables x and y, assume the data follow the simple linear regression model
16. For the data in the URL
which contains two variables x and y, assume the pairs (Xᵢ, Yᵢ) are independent and identically bivariate normal distributed with correlation
17. For the data in the URL
which contains two variables x and y, assume the data follow the simple linear regression model
Note: This is exactly the same as Additional Problem 15 (word for word) except that the hypothesized value of the regression coefficient is 0.6 rather than zero.
18. For the data in the URL
which contains two variables x and y, assume the data follow the quadratic regression model
Note: This is exactly the same as Additional Problem 15 except that it is about the quadratic regression model rather than the simple linear model and the test is about β₂ rather than about β₁.
19. For the data in the URL
which contains two variables x and y, it is clear from the scatter plot produced by plot(x, y) that a simple linear regression will not fit the data (no statistics needed, the points are obviously nowhere near a straight line).
Since the scatter plot curves up at both ends, a polynomial of even degree is needed for the regression function (assuming we restrict our consideration to polynomials), because a polynomial of odd degree would go up at one end and down at the other.
Report the regression coefficients for each model.
lty = 2, lty = 3, and so forth to distinguish the lines).
Hand in the plot. Comment on the differences between the curves and the relation to the results of the F tests.
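The general workflow (fit nested even-degree polynomials, compare them with F tests, overlay the fitted curves) might be sketched as follows. The data are simulated and the choice of degrees 2, 4, and 6 is an assumption of this sketch, since the list of models is not reproduced above.

```r
# Hypothetical data that curve up at both ends (an assumption).
set.seed(42)
x <- seq(-2, 2, length.out = 60)
y <- x^4 - x^2 + rnorm(length(x), sd = 0.5)

# Nested polynomial regressions of even degree.
out2 <- lm(y ~ poly(x, 2, raw = TRUE))
out4 <- lm(y ~ poly(x, 4, raw = TRUE))
out6 <- lm(y ~ poly(x, 6, raw = TRUE))

# Regression coefficients for each model.
print(coef(out2))
print(coef(out4))
print(coef(out6))

# F tests comparing each model with the next larger one.
print(anova(out2, out4, out6))

# Scatter plot with the fitted curves, distinguished by line type.
# x is already sorted, so lines() draws each curve left to right.
plot(x, y)
lines(x, fitted(out2), lty = 1)
lines(x, fitted(out4), lty = 2)
lines(x, fitted(out6), lty = 3)
```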
20. Modify the example calculating the MSE of an estimator by simulation making two changes. Use the t distribution with 2.5 degrees of freedom for the distribution of the data (instead of the standard Cauchy distribution in the example) and use the 20% trimmed mean for the point estimator, which is calculated by the mean function in R using the trim optional argument (on-line help).
Provide both a point estimate and a confidence interval for the actual true MSE.
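A sketch of such a simulation, independent of the course example it is meant to modify. The sample size n and the number of simulations nsim are hypothetical; the true parameter is taken to be the center of symmetry of the t distribution, which is zero.

```r
set.seed(42)
n <- 30       # sample size (hypothetical)
nsim <- 1e4   # number of simulated data sets (hypothetical)
theta <- 0    # true center of symmetry of the t distribution

# One 20% trimmed mean per simulated data set.
theta.hat <- replicate(nsim, mean(rt(n, df = 2.5), trim = 0.2))

# Monte Carlo estimate of the MSE and its standard error.
sq.err <- (theta.hat - theta)^2
mse.hat <- mean(sq.err)
se <- sd(sq.err) / sqrt(nsim)

# Approximate 95% confidence interval for the true MSE.
ci <- mse.hat + c(-1, 1) * qnorm(0.975) * se
print(c(mse = mse.hat, lower = ci[1], upper = ci[2]))
```

The confidence interval here reflects Monte Carlo error only, so it shrinks as nsim increases.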
21. Modify the percentile bootstrap confidence interval example making two changes. Make the parameter to be estimated the interquartile range of the population and the point estimator of this parameter the interquartile range of the data, which is calculated by the IQR function in R (on-line help).
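A sketch of a percentile bootstrap interval for the interquartile range, written independently of the course example. The data x and the number of bootstrap samples nboot are hypothetical.

```r
# Hypothetical data (an assumption).
set.seed(42)
x <- rnorm(50)
nboot <- 999

# Point estimate: interquartile range of the data.
theta.hat <- IQR(x)

# Bootstrap replicates: IQR of samples drawn with replacement from the data.
theta.star <- replicate(nboot, IQR(sample(x, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the replicates.
ci <- quantile(theta.star, probs = c(0.025, 0.975))
print(theta.hat)
print(ci)
```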