Rules
No rules. This is practice.
Grades
No grades. This is practice.
Disclaimer
These practice problems are supplied without any guarantee that they will help you do the quiz problems. However, they were written after the quiz problems were written and with the intention that they would help.
These practice problems are also supplied without any guarantee that they are exactly or even nearly like the quiz problems. However, they are like at least some quiz problems in at least some respects.
Problem 1
Write an R function that computes a log likelihood and its first two derivatives, like the example in Section 8 of the course notes about computer arithmetic, except that we want it to be for the zero-truncated Poisson distribution rather than the binomial distribution.
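The zero-truncated Poisson distribution has parameter θ and data x, and its log likelihood and derivatives are given by

$$
\begin{aligned}
l(\theta) &= x \theta - \exp(\theta) - \log\bigl(1 - \exp(-\exp(\theta))\bigr) \\
l'(\theta) &= x - \frac{\exp(\theta)}{1 - \exp(-\exp(\theta))} \\
l''(\theta) &= - \frac{\exp(\theta)}{1 - \exp(-\exp(\theta))} \left( 1 - \frac{\exp(\theta) \exp(-\exp(\theta))}{1 - \exp(-\exp(\theta))} \right)
\end{aligned}
$$
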
where, as always, exp denotes the exponential function and log the natural logarithm function (as in R). The data x is positive-integer-valued (1, 2, 3, …). The parameter θ is real-valued (− ∞ < θ < ∞).
Your function should have signature
    logl(theta, x)
and return a list having components
- value, the value of the function,
- gradient, the first derivative of the function, and
- hessian, the second derivative of the function.
Like the example in the course notes, the point is to be careful about computer arithmetic, avoiding overflow and catastrophic cancellation as much as possible.
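As a small illustration of the two hazards named above (not taken from the course notes), the following sketch shows catastrophic cancellation and overflow in expressions of the kind that arise for this distribution. The variable names are made up for the example, and mu stands for exp(theta).

```r
# Catastrophic cancellation: for tiny mu, exp(-mu) rounds to exactly 1,
# so the naive difference loses every significant digit.
mu <- 1e-20
naive <- 1 - exp(-mu)      # evaluates to 0
stable <- -expm1(-mu)      # expm1(u) computes exp(u) - 1 accurately for small u
print(c(naive = naive, stable = stable))

# Overflow: exp(800) exceeds the largest double, so log(exp(mu) - 1)
# is Inf, although the true value is very close to mu itself.
big <- 800
overflowed <- log(expm1(big))          # Inf
rearranged <- big + log1p(-exp(-big))  # same quantity, rewritten to stay finite
print(c(overflowed = overflowed, rearranged = rearranged))
```

The R functions expm1 and log1p exist for exactly this purpose: they evaluate exp(u) - 1 and log(1 + u) without forming the intermediate value near 1.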
For this problem, you do not have to check for invalid argument values.
Show your function working for parameter values
    thetas <- seq(-10, 10)
and for data values both
    x <- 1
and
    x <- 5
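One possible shape for such a function is sketched below. It is not the official solution. It assumes the log likelihood x θ − exp(θ) − log(1 − exp(−exp(θ))) with derivatives obtained by hand, and it assumes exp(theta) neither overflows nor underflows to zero, which holds over the range of thetas requested here. The intermediate names mu, q, logq, and m are made up for the sketch.

```r
# Sketch of logl(theta, x) for the zero-truncated Poisson (assumed formulas).
logl <- function(theta, x) {
    mu <- exp(theta)
    # q = 1 - exp(-mu), computed without cancellation when mu is small
    q <- -expm1(-mu)
    # log(1 - exp(-mu)): log(q) is accurate for small mu,
    # log1p(-exp(-mu)) is accurate for large mu
    logq <- ifelse(mu <= log(2), log(q), log1p(-exp(-mu)))
    # m = mu / (1 - exp(-mu)), the mean of the distribution
    m <- mu / q
    list(value = x * theta - mu - logq,
         gradient = x - m,
         # mu / expm1(mu) equals mu * exp(-mu) / (1 - exp(-mu))
         hessian = -m * (1 - mu / expm1(mu)))
}

thetas <- seq(-10, 10)
out1 <- logl(thetas, 1)
out5 <- logl(thetas, 5)
print(as.data.frame(out1))
print(as.data.frame(out5))
```

This sketch does not remove every source of rounding error. For example, when mu is very small the factor 1 - mu / expm1(mu) in the hessian still subtracts two nearly equal numbers, so a more careful treatment may be wanted there.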
Problem 2
Test your function for the preceding problem in the same way as the example in Section 8 of the course notes about computer arithmetic.
To test the function value, compare with the values of the function
    logl.too <- function(theta) dpois(x, exp(theta), log = TRUE) - ppois(0, exp(theta), lower.tail = FALSE, log = TRUE) + lfactorial(x)
To test the derivatives, use numerical derivatives. (You can also apply the other method, using derivatives calculated by the R function D, if you like, but that does not count.)
Note that numerical differentiation does not give perfectly accurate derivatives. So there may be small discrepancies between the two methods of calculation.
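A minimal sketch of such a derivative check follows. The helper num.deriv is hypothetical (a hand-rolled central difference, not a function from the course notes or from any package), and grad is an assumed closed form for the first derivative obtained by differentiating logl.too by hand. In practice grad would be replaced by the gradient component returned by your own logl.

```r
x <- 5
thetas <- seq(-10, 10)

# Reference log likelihood from the problem statement (uses the global x).
logl.too <- function(theta) dpois(x, exp(theta), log = TRUE) -
    ppois(0, exp(theta), lower.tail = FALSE, log = TRUE) + lfactorial(x)

# Hypothetical helper: central-difference approximation to the derivative of f.
num.deriv <- function(f, theta, h = 1e-4)
    (f(theta + h) - f(theta - h)) / (2 * h)

# Assumed analytic first derivative, x - mu / (1 - exp(-mu)) with mu = exp(theta),
# written with expm1 to avoid cancellation.
grad <- function(theta) {
    mu <- exp(theta)
    x + mu / expm1(-mu)
}

# all.equal compares up to a tolerance, which is what we want here because
# the central difference is only an approximation.
check <- all.equal(grad(thetas), num.deriv(logl.too, thetas), tolerance = 1e-6)
print(check)
```

Using all.equal rather than == allows for the small discrepancies mentioned above. The same pattern applied to the gradient function gives a check of the second derivative.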
Problem 3
(This problem is about finding errors in data, but that is too hard for a quiz, so we are just going to test some skills that are useful for that.)
We will use the data
    foo <- read.csv("http://www.stat.umn.edu/geyer/s17/3701/data/p3p3.csv", stringsAsFactors = FALSE)
This reads in a data frame having variables x, y, and z. The variable y is zero-or-one-valued, and we will call one "success" and zero "failure". The variable z is categorical.
In this problem we consider the answer better if it is done without using a loop.
- What is the success rate when x is less than or equal to 30? Greater than 30 and less than or equal to 50? Greater than 50?
- Which level of z had the highest success rate?
- Get the subset of the data for which x is greater than 40.
- Which level of z had the highest success rate when x was greater than 40?
- Order the levels of z by success rate when x was greater than 40, from highest to lowest.
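The loop-free style asked for above can be sketched on made-up data. The data frame below is a randomly generated stand-in with the same variable names. It is not the contents of p3p3.csv, so its numbers say nothing about the real answers. The names bins, rate, bar, and rate40 are chosen only for this illustration.

```r
# Made-up stand-in for the course data (NOT the real p3p3.csv).
set.seed(42)
foo <- data.frame(x = sample(1:80, 200, replace = TRUE),
                  y = rbinom(200, 1, 0.5),
                  z = sample(c("a", "b", "c", "d"), 200, replace = TRUE),
                  stringsAsFactors = FALSE)

# Success rate within each range of x: cut() bins x into
# (-Inf, 30], (30, 50], (50, Inf) and tapply() averages y within each bin.
bins <- cut(foo$x, breaks = c(-Inf, 30, 50, Inf))
print(tapply(foo$y, bins, mean))

# Success rate for each level of z, and the level(s) with the highest rate.
rate <- tapply(foo$y, foo$z, mean)
print(names(rate)[rate == max(rate)])

# Subset of the data for which x is greater than 40.
bar <- subset(foo, x > 40)

# Same questions within the subset, then levels ordered from highest to lowest.
rate40 <- tapply(bar$y, bar$z, mean)
print(names(rate40)[rate40 == max(rate40)])
print(names(rate40)[order(rate40, decreasing = TRUE)])
```

Because y is coded zero or one, the mean of y within a group is the success rate for that group, which is why tapply with mean does all the work without an explicit loop.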