University of Minnesota, Twin Cities School of Statistics Stat 8701 Rweb
This assignment will be handed in as a regular paper assignment,
although you can hand in code by giving a URL in the assignment
or a location in the filesystem of our Linux boxes (either the department
Linux workstation network, for students in our department, or
grunard.stat.umn.edu, for the three students with accounts on that machine).
For the programming problems, you should hand in enough so that it is clear exactly what you did.
The rule for grading this stuff is that only perfect is passing. If I find anything wrong, you have to debug the problem and hand in a fix (for that problem or part of a problem), and the process will be repeated until everything is correct. Thus everyone will eventually do every assignment perfectly (if they keep working at it).
You are not in competition with each other for grades. There's no reason why everyone shouldn't get an A. If you don't get an A, that's because you either failed to complete some assignments or took some huge number of iterations to complete some. (I've never tried this form of grading before, so I can't be more specific. After a few assignments I can say more.)
This assignment is due (initially) Friday, February 8, 2002.
Problem 1

Problem 2.10 in Robert and Casella, except ignore the part about the
Kiss algorithm. Just implement everything in R, using the
default random number generator and an implementation of the
generalized inverse CDF of the distribution of interest.
For example, if gicdf is an implementation of the
generalized inverse CDF that works properly on vector arguments, then

x <- gicdf(runif(n))

produces a vector of length n of random variates having
the distribution of interest. Try to implement each of the examples
this way.
The extreme value distribution (also called the Fisher-Tippett or Gumbel distribution) has the CDF (in its standard form) F(x) = exp(- exp(- x)).
The Pareto distribution has the CDF (in a standard parametrization, with shape parameter alpha > 0) F(x) = 1 - x^(- alpha) for x >= 1.
The arcsine distribution has the CDF F(x) = (2 / pi) * arcsin(sqrt(x)) for 0 <= x <= 1.
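For instance, a sketch of vectorized implementations (assuming the standard parametrizations stated above; your book's parametrizations may differ):

```r
## Generalized inverse CDFs, vectorized, valid for 0 < u < 1.

## Extreme value (Gumbel): F(x) = exp(-exp(-x)),
## so F^{-1}(u) = -log(-log(u)).
gicdf.extreme <- function(u) -log(-log(u))

## Pareto: F(x) = 1 - x^(-alpha) for x >= 1,
## so F^{-1}(u) = (1 - u)^(-1/alpha).
gicdf.pareto <- function(u, alpha) (1 - u)^(-1 / alpha)

## Arcsine: F(x) = (2/pi) * asin(sqrt(x)) on [0, 1],
## so F^{-1}(u) = sin(pi * u / 2)^2.
gicdf.arcsine <- function(u) sin(pi * u / 2)^2

## n random variates from, e.g., the arcsine distribution:
## x <- gicdf.arcsine(runif(n))
```

Each function works on vector arguments, so gicdf(runif(n)) produces n variates in one call.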
Problem 2

Write a C function myrunif and a corresponding R function that
does the same as the R function runif. That is, your function
should have the prototype

void myrunif(double *x, int *n)

and each call should set x to be a vector of n[0]
uniform random numbers, so that, after you have dynamically loaded
myrunif.so, the R commands
n <- 7
invisible(runif(1))
save.seed <- .Random.seed
runif(n)
print(.Random.seed)
.Random.seed <- save.seed
.C("myrunif", x = double(n), n = as.integer(n))$x
print(.Random.seed)

should produce exactly the same results.
Section 5.3 in the Writing R Extensions book tells how to do this.
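A minimal sketch of such a function, using R's C interface to its own random number generator (the GetRNGstate / unif_rand / PutRNGstate calls described in Writing R Extensions); compile with R CMD SHLIB myrunif.c:

```c
#include <R.h>  /* declares GetRNGstate, PutRNGstate, unif_rand */

/* Fill x with n[0] uniform random numbers drawn from R's own
   generator, so the output matches runif() exactly. */
void myrunif(double *x, int *n)
{
    int i;
    GetRNGstate();              /* read .Random.seed from R */
    for (i = 0; i < n[0]; i++)
        x[i] = unif_rand();     /* same stream as runif() */
    PutRNGstate();              /* write .Random.seed back to R */
}
```

Reading and writing the RNG state this way is what makes the .C call leave .Random.seed in the same state as the equivalent runif call.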
Problem 3

Problems 2.12 and 2.13 in Robert and Casella. For the small experiment
mentioned in 2.13(b), there is really no point unless both algorithms are
coded in C, so do that. Write both algorithms in C, just as you did
for Problem 2.
Use the R function system.time to time your routines.
Use a large enough sample size that the running time is dominated by
the time of the random variate generation (so that the start-up and
clean-up overhead is negligible).
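A sketch of the timing, with runif standing in for your compiled routines (the .C call shown in the comment uses a hypothetical routine name; substitute your own):

```r
## Time one generator at a sample size large enough that
## variate generation dominates the total running time.
n <- 1e6

time.a <- system.time(xa <- runif(n))
## For a compiled routine, time the .C call instead, e.g.
## time.b <- system.time(
##     xb <- .C("myalg", x = double(n), n = as.integer(n))$x)
print(time.a)
```

Comparing the elapsed times for the two routines at the same n is the whole of the small experiment.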
Problem 4

Problem 2.32(b) in Robert and Casella. This one can be done efficiently in R.
The R function chol computes the Choleski factorization of a symmetric,
positive definite, square matrix.
The R operator %*% does matrix multiplication.

In case the problem statement is still not clear, the problem is
how to define a matrix A so that

z <- rnorm(n)
x <- A %*% z

produces a multivariate normal random vector x with a
specified covariance matrix M. That is, how do you produce
A given M?
Problem 5

Problem 2.38 in Robert and Casella. No computing here, just theory.
In case the problem statement is not clear