Rules
See the Section about Rules for Quizzes and Homeworks on the General Info page.
Your work handed into Moodle should be a plain text file with R commands and comments that can be run to produce what you did. We do not take your word for what the output is. We run it ourselves.
Note: Plain text specifically excludes Microsoft Word native format (extension .docx). If you have to use Word as your text editor, then use Save As and choose the format to be Text (.txt) or something like that. Then upload the saved plain text file.
Note: Plain text specifically excludes PDF (Adobe Portable Document Format) (extension .pdf). If you use Sweave, knitr, or Rmarkdown, upload the source (extension .Rnw or .Rmd), not PDF or any other kind of output.
If you have questions about the quiz, ask them in the Moodle forum for this quiz. Here is the link for that https://ay16.moodle.umn.edu/mod/forum/view.php?id=1279315.
You must be in the classroom, Armory 202, while taking the quiz.
Quizzes must be uploaded by the end of class (1:10). Moodle actually allows a few minutes after that. Here is the link for uploading the quiz: https://ay16.moodle.umn.edu/mod/assign/view.php?id=1279330.
Homeworks must be uploaded before midnight the day they are due. Here is the link for uploading the homework: https://ay16.moodle.umn.edu/mod/assign/view.php?id=1279355.
Quiz 3
Problem 1
Write an R function like the example in Section 8 of the course notes about computer arithmetic, except that we want it to be for the geometric distribution rather than the binomial distribution.
The geometric distribution has parameter θ and data x and log likelihood
    l(θ) = x θ + log(1 − exp(θ))
where, as always, exp denotes the exponential function and log the natural logarithmic function (as in R). The data x is nonnegative-integer-valued (0, 1, 2, …). The parameter θ is negative-real-valued (− ∞ < θ < 0).
Your function should have signature
    logl(theta, x)
and return a list having components
- value, the value of the function,
- gradient, the first derivative of the function, and
- hessian, the second derivative of the function.
Like the example in the course notes, the point is to be careful about computer arithmetic, avoiding overflow and catastrophic cancellation as much as possible.
For this problem, you do not have to check for invalid argument values.
Show your function working for parameter values
    thetas <- (- 10^seq(3, - 3))
and for data values both
    x <- 0
and
    x <- 5
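One possible shape for such a function is sketched below. It is not the official solution. It assumes the log likelihood is l(θ) = x θ + log(1 − exp(θ)), which is what the comparison function in Problem 2 implies, so that the first derivative is x − exp(θ) / (1 − exp(θ)) and the second derivative is − exp(θ) / (1 − exp(θ))^2. The use of expm1 and log1p, and the cutoff at − log(2) for switching between them, are choices made for this sketch.

```r
logl <- function(theta, x) {
    # expm1(theta) is exp(theta) - 1 computed without cancellation,
    # so - expm1(theta) is an accurate 1 - exp(theta) when theta is near 0
    em1 <- expm1(theta)
    # log(1 - exp(theta)): use log(- expm1(theta)) when theta is near 0
    # and log1p(- exp(theta)) when exp(theta) is tiny (or underflows to 0)
    log1mexp <- ifelse(theta > - log(2), log(- em1), log1p(- exp(theta)))
    list(value = x * theta + log1mexp,
         gradient = x + exp(theta) / em1,
         hessian = - exp(theta) / em1^2)
}

thetas <- (- 10^seq(3, - 3))
# the sketch is vectorized in theta, so one call per data value suffices
logl(thetas, 0)
logl(thetas, 5)
```

At θ = −1000 the term exp(θ) underflows to zero, and every component is still finite. At θ = −0.001 the subtraction 1 − exp(θ) never happens explicitly, which is where the naive formula would lose digits.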
Problem 2
Test your function for the preceding problem like the example in Section 8 of the course notes about computer arithmetic.
To test the function value, compare with the values of the function
    logl.too <- function(theta) dgeom(x, 1 - exp(theta), log = TRUE)
and to test the derivatives use numerical derivatives. (You can also apply the other method, using derivatives calculated by the R function D, if you like, but that does not count; we are only going to grade the comparison to numerical derivatives.)
Note that numerical differentiation does not give perfectly accurate derivatives. So there may be small discrepancies between the two methods of calculation.
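A minimal sketch of such a comparison follows. The function logl here is only a stand-in for a Problem 1 answer, repeated so that the block runs on its own. The helper num.deriv is hypothetical: it is a hand-written central difference, used in place of whatever numerical differentiation tool the course notes use. The test points and the tolerance 1e-6 are likewise arbitrary choices.

```r
# stand-in for a Problem 1 answer (one possible careful version)
logl <- function(theta, x) {
    em1 <- expm1(theta)
    log1mexp <- ifelse(theta > - log(2), log(- em1), log1p(- exp(theta)))
    list(value = x * theta + log1mexp,
         gradient = x + exp(theta) / em1,
         hessian = - exp(theta) / em1^2)
}

x <- 5
logl.too <- function(theta) dgeom(x, 1 - exp(theta), log = TRUE)

# hypothetical helper: central difference approximation to a first derivative
num.deriv <- function(f, theta, h = 1e-5)
    (f(theta + h) - f(theta - h)) / (2 * h)

check <- function(theta) {
    out <- logl(theta, x)
    # the second derivative is checked by differentiating the gradient
    # numerically, which is less noisy than a second difference of the value
    num.grad <- num.deriv(logl.too, theta)
    num.hess <- num.deriv(function(t) logl(t, x)$gradient, theta)
    c(value = isTRUE(all.equal(out$value, logl.too(theta))),
      gradient = isTRUE(all.equal(out$gradient, num.grad, tolerance = 1e-6)),
      hessian = isTRUE(all.equal(out$hessian, num.hess, tolerance = 1e-6)))
}

# moderate parameter values only: the step h must keep theta + h negative
sapply(c(-3, -1, -0.1), check)
```

The loose tolerance on the derivatives reflects the point made above: a finite difference agrees with the exact derivative only to several digits, not to machine precision.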
Problem 3
(This problem is about finding errors in data, but that is too hard for a quiz, so we are just going to test some skills that are useful for that.)
We will use the data
    foo <- read.csv("http://www.stat.umn.edu/geyer/s17/3701/data/q3p3.csv", stringsAsFactors = FALSE)
(This reads in a data frame having variables w, x, y, and z.) The categorical variables are w and z.
In this problem we consider the answer better if it is done without using a loop.
- How many distinct values does w have?
- If you wanted to fit a linear model that regresses y on w, x, and z with z being treated as categorical (like w), what would you have to do to make that work right?
- Find the largest value of y for each value of w.
- Find the second largest value of y for each value of w.
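The loop-free idioms these questions point toward can be sketched on a small made-up data frame. The data frame toy below is hypothetical and is not the quiz data. It assumes, as one plausible reading, that w is stored as character strings and z as numeric codes, and that "second largest" counts repeated values separately.

```r
# hypothetical stand-in for foo, with the same variable names
toy <- data.frame(w = c("a", "a", "a", "b", "b", "c", "c", "c"),
                  x = c(1.0, 2.5, 0.3, 4.1, 2.2, 3.3, 0.9, 1.7),
                  y = c(3, 7, 5, 2, 9, 4, 8, 6),
                  z = c(1, 2, 1, 3, 2, 3, 1, 2),
                  stringsAsFactors = FALSE)

# number of distinct values of w
n.w <- length(unique(toy$w))

# a numeric z would enter the model as a single slope, so convert it
# to a factor before fitting (a character w is converted automatically)
toy <- transform(toy, z = factor(z))
fit <- lm(y ~ w + x + z, data = toy)

# largest y within each value of w, no loop
largest <- tapply(toy$y, toy$w, max)

# second largest y within each value of w; with ties for the maximum
# this returns the maximum again, and it is NA for a group of size one
second <- tapply(toy$y, toy$w, function(y) sort(y, decreasing = TRUE)[2])
```

Here tapply splits y by the values of w and applies the function to each piece, which is what a loop over the groups would do by hand.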