Rules
See the Section about Rules for Quizzes and Homeworks on the General Info page.
Your work handed in to Canvas should be an R Markdown file with text and code chunks that can be run to produce what you did. We do not take your word for what the output is. We may run it ourselves. But we also want the output.
Homeworks must be uploaded before midnight on the day they are due. Here is the link for uploading this homework: https://canvas.umn.edu/courses/330843/assignments/2864751
Each homework includes the preceding quiz. You may either redo the quiz questions for homework or not redo them if you are satisfied with your quiz answers. In either case the quiz questions also count as homework questions (so quiz questions count twice, once on the quiz and once on the homework, whether redone or not). If you don't submit anything for problems 1–3 (the quiz questions), then we assume you liked the answers you already submitted.
Quiz 5
Problem 1
The following R command
foo <- read.table(url("https://www.stat.umn.edu/geyer/3701/data/2022/q5p1.txt"), header = TRUE)
assigns one R object foo, which is a dataframe containing one variable x.
We assume these data are independent and identically distributed from
the logistic location family. See the help for R function dlogis
for a description of this family. We assume the default value (1) for
the scale parameter.
- Find the MLE for these data. Good starting points (root-n-consistent estimators) for this model include both the sample mean and the sample median (also any trimmed mean).
- Produce a large-sample approximate 95% confidence interval for the true unknown location parameter from these data using the usual theory of asymptotics of maximum likelihood.
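The two steps above can be sketched as follows. This is only an illustration under stated assumptions: the vector x is simulated here as a stand-in for foo$x, and the names mlogl, nout, mu.hat, se, and ci are hypothetical. It uses R function nlm started at the sample median and a Wald interval based on observed Fisher information.

```r
# Simulated stand-in for foo$x (assumption: the real data come from the URL above)
set.seed(42)
x <- rlogis(50, location = 3)

# Minus the log likelihood for the logistic location family with scale 1
mlogl <- function(mu) - sum(dlogis(x, location = mu, log = TRUE))

# Start at the sample median, a root-n-consistent estimator
nout <- nlm(mlogl, median(x), hessian = TRUE)
nout$code                      # 1 or 2 indicates convergence
mu.hat <- nout$estimate

# Observed Fisher information is the Hessian of minus the log likelihood
se <- 1 / sqrt(nout$hessian[1, 1])

# Wald interval: estimate plus or minus critical value times standard error
ci <- mu.hat + c(-1, 1) * qnorm(0.975) * se
ci
```

Starting at the sample mean or a trimmed mean instead of the median should lead to the same estimate, since all are reasonable starting points for this model.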
Problem 2
This problem continues where the preceding problem left off. We use the same data and the same statistical model.
Also produce a large-sample approximate 95% confidence interval for the parameter that is a level set of the log likelihood, following Section 5.4.4 of the course notes on models, part II.
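A minimal sketch of a level-set interval follows, assuming the usual construction: the set of parameter values whose log likelihood is within half the chi-square critical value of the maximum. The data are simulated as a stand-in for foo$x, and the names logl, g, lower, and upper are hypothetical. This is not necessarily the exact code of the course notes.

```r
# Simulated stand-in for foo$x (assumption)
set.seed(42)
x <- rlogis(50, location = 3)

# Log likelihood for the logistic location family with scale 1
logl <- function(mu) sum(dlogis(x, location = mu, log = TRUE))

# Maximize the log likelihood; the MLE lies within the range of the data
mu.hat <- optimize(logl, interval = range(x), maximum = TRUE)$maximum

# The level set is where twice the drop in log likelihood is at most crit
crit <- qchisq(0.95, df = 1)
g <- function(mu) 2 * (logl(mu.hat) - logl(mu)) - crit

# g is negative at the MLE and positive far away, so each side has one root
lower <- uniroot(g, lower = mu.hat - 5, upper = mu.hat, tol = 1e-8)$root
upper <- uniroot(g, lower = mu.hat, upper = mu.hat + 5, tol = 1e-8)$root
c(lower, upper)
```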
Problem 3
The following R command
load(url("https://www.stat.umn.edu/geyer/3701/data/2022/q5p3.rda"))
loads one R object f, which is a function having one argument, which is a four-dimensional numeric vector. The value of this function is a numeric scalar.
The problem is to minimize this function.
This function has been deliberately constructed to have multiple local minima. So use R function optim with method = "SANN" to try to minimize it (Section 7.1.2.3 of the notes on optimization).
Since this method uses random search, use R function set.seed to get repeatability as you work on this problem.
Use R function optim with method = "SANN" ten different times with default values of the control parameters, starting at zero (in four-dimensional space). Do you always get the same answer? What does this tell you about method SANN?
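The mechanics of repeated runs might look like the sketch below. The function f.demo is a hypothetical multimodal stand-in (an assumption, not the f loaded from the URL above), and nrun and vals are illustrative names.

```r
# Hypothetical stand-in for f: many local minima, global minimum at (1, 1, 1, 1)
f.demo <- function(x) sum((x - 1)^2 - 10 * cos(2 * pi * (x - 1)) + 10)

set.seed(42)      # makes the whole sequence of runs repeatable
nrun <- 10
vals <- double(nrun)
for (i in 1:nrun) {
    # default control parameters, starting at zero in four-dimensional space
    sout <- optim(rep(0, 4), f.demo, method = "SANN")
    vals[i] <- sout$value
}
vals
```

Setting the seed once before the loop means the ten runs use different random numbers from each other, but rerunning the whole chunk reproduces the same ten values.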
Homework 5
Problem 4
This problem continues where Problem 3 left off. Minimize the same function f used in Problem 3. Again use R function optim with method = "SANN". But this time read the help for R function optim, especially
- the part of the Details section about method SANN that describes how the temperature changes over time and what that temperature does in the algorithm (controls the step size) and
- the part of the Details section about the control argument that says how the variables in the temperature change function can be specified, the components maxit, temp, and tmax being relevant to method SANN.
Note that the temperature at iteration t is a function of all of these control components together, so adjusting any one without adjusting some of the others makes no sense. The idea is to start at a high (but not too high) temperature and end (when t = maxit) at a low (but not too low) temperature, but how high is too high and how low is too low is problem specific. In any case, you want the final temperature (when t = maxit) to be many times lower than the temperature when t = 1.
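To see how the control components interact, the cooling schedule stated in the help for optim can be evaluated directly. The function name temperature below is hypothetical; the formula and the defaults (temp = 10, tmax = 10, maxit = 10000 for SANN) are taken from that help page.

```r
# Cooling schedule from help(optim), method "SANN":
# temp / log(((t - 1) %/% tmax) * tmax + exp(1))
temperature <- function(t, temp = 10, tmax = 10)
    temp / log(((t - 1) %/% tmax) * tmax + exp(1))

maxit <- 10000          # default maxit for SANN

temperature(1)          # starting temperature, equal to temp
temperature(maxit)      # final temperature with the defaults

# ratio of starting to final temperature
temperature(1) / temperature(maxit)
```

Evaluating the same function with other values of temp, tmax, and maxit shows how the starting and final temperatures move together when the control components are changed.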
Try to get one run that starts at zero (in four-dimensional space) and finds a solution at least as low as any found in Problem 3. Do a run that lasts at least 10 minutes. You can use R function system.time to tell how long it takes. To get full credit, you do not have to actually find the global optimum (we don't even know what that is). The point is to use the control argument in a reasonable way.
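Timing a run with non-default control components might be sketched like this. The function f.demo is a hypothetical stand-in for f, and the control values are arbitrary illustrations (assumptions), neither recommended settings nor large enough to run for 10 minutes.

```r
# Hypothetical stand-in for f (assumption)
f.demo <- function(x) sum((x - 1)^2 - 10 * cos(2 * pi * (x - 1)) + 10)

set.seed(42)
# system.time reports how long the enclosed expression takes to evaluate
time <- system.time(
    sout <- optim(rep(0, 4), f.demo, method = "SANN",
        control = list(maxit = 1e5, temp = 20, tmax = 100))
)
time["elapsed"]    # elapsed time in seconds
sout$value         # lowest function value found
```

Because the assignment inside system.time happens in the calling environment, sout is available afterward for inspection.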
Problem 5
minimize: $x^2 + y^4 + z^6 + \sin(x + y + z)$
subject to: $x^2 + y^2 + z^2 \ge 4$
Problem 6
This is the same problem as problem 5 except that you are required to supply functions that calculate the derivatives of the objective function (the function to minimize) and the constraint function (the function required to be ≥ 0) to the R function doing the minimization.
Also these derivatives must be analytic derivatives (not calculated by the R package numDeriv or any other R code that does derivatives by finite differences). You may use the R functions D or deriv to calculate these derivatives, but I just did them by hand.
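As a sketch of how D and deriv produce analytic derivatives, the code below reads the objective of Problem 5 as x^2 + y^4 + z^6 + sin(x + y + z) and writes the constraint as a function required to be nonnegative. The names obj, con, obj.fun, and grad are hypothetical. How the result is passed to the minimizer depends on which optimization function you use and is not shown.

```r
# Objective and constraint function as unevaluated R expressions
obj <- quote(x^2 + y^4 + z^6 + sin(x + y + z))
con <- quote(x^2 + y^2 + z^2 - 4)      # required to be >= 0

# D returns one symbolic partial derivative as an R expression
D(obj, "x")
D(con, "x")

# deriv with function.arg = TRUE returns a function of x, y, z whose
# value carries the analytic gradient in attribute "gradient"
obj.fun <- deriv(obj, c("x", "y", "z"), function.arg = TRUE)
out <- obj.fun(1, 2, 3)
grad <- drop(attr(out, "gradient"))
grad
```

A quick check against hand calculation: at the point (1, 2, 3) the partial derivatives are 2x + cos(x + y + z), 4y^3 + cos(x + y + z), and 6z^5 + cos(x + y + z).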
Problem 7
This problem continues where Problem 1 left off. We use the same statistical model: logistic location family.
But now we are going to do a simulation study comparing the following four estimators:
- the maximum likelihood estimator, which was used in Problem 1,
- the sample mean,
- the sample median,
- a 10% trimmed mean, which is what R function mean with optional argument trim = 0.1 evaluates.
Do a simulation study comparing these four estimators, like the simulation study in Section 5 of the notes on simulation. You do not have to make any plots, just find
- the mean square error (estimated from the simulations) of each of the four estimators and
- the Monte Carlo standard error of each of the quantities in the previous item.
Use data sample size and simulation sample size
n <- 10
nsim <- 1e5
(the former the same as in the course notes but the latter 10 times larger).
Note: since this is a location family, any location parameter will give the same mean square errors as any other. You can use location parameter equal to zero (the default) for R function rlogis.
Note: use the principle of common random numbers as the example in the course notes does.
Note: you do not have to make a nice R markdown table like the course notes do. Just putting the numbers in a matrix with labels and showing the matrix will do.
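The overall shape of such a simulation might be sketched as follows. This is an illustration under stated assumptions, not the code of the course notes: nsim is deliberately smaller than the 1e5 asked for, so the sketch runs quickly, the MLE is computed with R function optimize rather than nlm, and the names mle, estimators, est, sqerr, mse, and mcse are hypothetical.

```r
set.seed(42)
n <- 10
nsim <- 1e3     # assumption: reduced from 1e5 to keep this sketch fast
theta <- 0      # true location parameter (the default for rlogis)

# MLE of location for logistic data with scale 1
mle <- function(x) {
    mlogl <- function(mu) - sum(dlogis(x, location = mu, log = TRUE))
    optimize(mlogl, interval = range(x))$minimum
}

# All four estimators evaluated on the same data vector
estimators <- function(x) c(mle = mle(x), mean = mean(x),
    median = median(x), trimmed = mean(x, trim = 0.1))

# Common random numbers: each simulated data set is used by every estimator
est <- replicate(nsim, estimators(rlogis(n, location = theta)))

# Squared errors, one row per estimator, one column per simulation
sqerr <- (est - theta)^2
mse <- rowMeans(sqerr)
mcse <- apply(sqerr, 1, sd) / sqrt(nsim)

# Labeled matrix of results
rbind(mse, mcse)
```

The Monte Carlo standard error here is the sample standard deviation of the squared errors divided by the square root of nsim, which is the usual standard error of a sample mean.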