Due Date

Due Mon Dec 2, 2013.

First Problem

The file

http://www.stat.umn.edu/geyer/5601/mydata/gamma.txt

contains one variable x, which is a random sample from a gamma distribution.

The coefficient of skewness of a distribution is the third central moment divided by the cube of the standard deviation (this gives a dimensionless quantity that is zero for any symmetric distribution, positive for distributions with a long right tail, and negative for distributions with a long left tail).

It can be calculated by the R function defined by

skew <- function(x) {
    xbar <- mean(x)
    mu2.hat <- mean((x - xbar)^2)
    mu3.hat <- mean((x - xbar)^3)
    mu3.hat / sqrt(mu2.hat)^3
}
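As a quick sanity check of the function above, here is a minimal sketch that applies it to simulated data. The simulated sample, the seed, and the shape parameter are assumptions made for illustration only; they are not the course data. For a gamma distribution with shape parameter k the population coefficient of skewness is 2 / sqrt(k), so the sample value should land near that.

```r
# Sample coefficient of skewness, copied from the assignment.
skew <- function(x) {
    xbar <- mean(x)
    mu2.hat <- mean((x - xbar)^2)
    mu3.hat <- mean((x - xbar)^3)
    mu3.hat / sqrt(mu2.hat)^3
}

# Simulated stand-in for the course data (shape is an arbitrary choice).
set.seed(42)
shape <- 4
x <- rgamma(1000, shape = shape)

skew(x)            # sample coefficient of skewness
2 / sqrt(shape)    # population coefficient of skewness for this gamma

# A symmetric sample has skewness zero, and reflecting the data
# about zero flips the sign of the skewness.
skew(c(-1, 0, 1))
skew(-x)
```

The gamma distribution has a long right tail, so the sample skewness here should come out positive.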

  1. Find a 95% confidence interval for the true unknown population coefficient of skewness that is just the sample coefficient of skewness plus or minus 1.96 bootstrap standard errors.

  2. Find a 95% confidence interval for the true unknown population coefficient of skewness having the second order accuracy property using the boott function.

    Note: Since you have no idea about how to write an sdfun that will variance stabilize the coefficient of skewness, you will have to use one of the other two methods described on the bootstrap t page.

  3. Find a 95% confidence interval for the true unknown population coefficient of skewness using the bootstrap percentile method.

  4. Find a 95% confidence interval for the true unknown population coefficient of skewness using the BCa (alphabet soup, type 1) method.

  5. Find a 95% confidence interval for the true unknown population coefficient of skewness using the ABC (alphabet soup, type 2) method.

    Note: this will require that you write a quite different skew function, starting

    skew <- function(p, x) {
    
    (and you have to fill in the rest of the details, which should, I hope, be clear enough from the discussion of our ABC example).
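For orientation, the mechanics behind items 1 and 3 can be sketched in base R as follows. This is only a sketch under stated assumptions: the data are simulated rather than read from the course file, and the seed, sample size, and bootstrap sample size nboot are arbitrary illustrative choices. The intervals in items 2, 4, and 5 rely on functions such as boott discussed on the course pages and are not sketched here.

```r
# Sample coefficient of skewness, copied from the assignment.
skew <- function(x) {
    xbar <- mean(x)
    mu2.hat <- mean((x - xbar)^2)
    mu3.hat <- mean((x - xbar)^3)
    mu3.hat / sqrt(mu2.hat)^3
}

# Simulated stand-in for the course data (an assumption).
set.seed(42)
x <- rgamma(100, shape = 4)

nboot <- 1000            # number of bootstrap samples (arbitrary choice)
theta.hat <- skew(x)

# Nonparametric bootstrap: resample the data with replacement
# and recompute the estimator each time.
theta.star <- replicate(nboot, skew(sample(x, replace = TRUE)))

# Item 1: point estimate plus or minus 1.96 bootstrap standard errors.
se.boot <- sd(theta.star)
ci.se <- theta.hat + c(-1, 1) * 1.96 * se.boot

# Item 3: percentile interval, the 0.025 and 0.975 quantiles
# of the bootstrap distribution of the estimator.
ci.perc <- unname(quantile(theta.star, probs = c(0.025, 0.975)))

ci.se
ci.perc
```

The two intervals use the same bootstrap samples, so any difference between them comes from the method, not from Monte Carlo variation.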

Second Problem

The file

http://www.stat.umn.edu/geyer/5601/mydata/ar1.txt

contains one variable x, which is a random realization of an AR(1) time series.

The sample mean of the time series obeys the square root law, that is,

sqrt(n) * (theta.hat - theta)

is asymptotically normal, where theta.hat is the sample mean for sample size n and theta is the true unknown population mean.

  1. Find a 95% confidence interval for the true unknown population mean that is just the sample mean plus or minus 1.96 bootstrap standard errors.

  2. Find a 95% confidence interval for the true unknown population mean using the method of Politis and Romano described in the handout and on the second web page on subsampling.

Use subsample size 50 for both parts (you can use the same samples).
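A minimal sketch of how the subsampling calculations for a time series might go, under stated assumptions: the series is simulated with arima.sim (AR coefficient 0.5 and length 1000 are arbitrary illustrative choices, not properties of the course data), and the subsamples are taken to be all overlapping blocks of b consecutive observations. Check the details against the handout and the subsampling web pages before relying on it.

```r
# Simulated stand-in for the course data (an assumption): AR(1), true mean 0.
set.seed(42)
n <- 1000
x <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))

b <- 50                  # subsample size given in the assignment
theta.hat <- mean(x)

# For a time series the subsamples are blocks of b consecutive
# observations; there are n - b + 1 of them.
nblock <- n - b + 1
theta.star <- sapply(seq_len(nblock), function(i) mean(x[i:(i + b - 1)]))

# Part 1: sd(theta.star) estimates the standard error at sample size b;
# the square root law rescales it to sample size n.
se.sub <- sd(theta.star) * sqrt(b / n)
ci.se <- theta.hat + c(-1, 1) * 1.96 * se.sub

# Part 2: approximate the distribution of sqrt(n) * (theta.hat - theta)
# by that of sqrt(b) * (theta.star - theta.hat), then invert.
z.star <- sqrt(b) * (theta.star - theta.hat)
crit <- unname(quantile(z.star, probs = c(0.975, 0.025)))
ci.pr <- theta.hat - crit / sqrt(n)

ci.se
ci.pr
```

Note that the upper quantile of z.star gives the lower endpoint of the interval and vice versa, because the interval comes from solving for theta in sqrt(n) * (theta.hat - theta).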

Third Problem

The file

http://www.stat.umn.edu/geyer/5601/mydata/big-one-third.txt

contains a vector x of data that are a random sample from a heavy tailed distribution such that the sample mean has rate of convergence n^(1 / 3), that is,

n^(1 / 3) * (theta.hat - theta)

has nontrivial asymptotics (nontrivial here meaning it doesn't converge to zero in probability and also is bounded in probability, so n^(1 / 3) is the right rate) where theta.hat is the sample mean for sample size n and theta is the true unknown population mean.

This is the same distribution as was used for Homework 5, Problem 3, but a much larger sample.

  1. Find a 95% confidence interval for the true unknown population mean using the sample mean as the point estimator and using the subsampling bootstrap with subsample size b = 100 by the method of Politis and Romano using the known rate n^(1 / 3).

  2. Find a 95% confidence interval for the true unknown population mean using the sample mean as the point estimator and using the subsampling bootstrap with subsample size b = 100 by the method of Politis and Romano estimating the rate (pretending you have been told nothing about the rate). Report both your rate estimate and your confidence interval.
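The known-rate calculation in item 1 might be sketched as follows, again under stated assumptions. The data are simulated from a t distribution with 1.5 degrees of freedom, chosen only because its tails are heavy enough that the sample mean should converge at rate n^(1 / 3); it is not claimed to be the distribution behind the course data. The sample size n and the number of subsamples nboot are arbitrary. For independent and identically distributed data the subsamples are drawn without replacement. The rate estimation needed for item 2 is not sketched here.

```r
# Simulated stand-in for the course data (an assumption): heavy tails, true mean 0.
set.seed(42)
n <- 10000
x <- rt(n, df = 1.5)

b <- 100                 # subsample size given in the assignment
nboot <- 1000            # number of subsamples (arbitrary choice)
beta <- 1 / 3            # known rate exponent: rate is n^beta
theta.hat <- mean(x)

# Subsampling bootstrap for IID data: samples of size b drawn
# WITHOUT replacement from the original data.
theta.star <- replicate(nboot, mean(sample(x, b, replace = FALSE)))

# Approximate the distribution of n^beta * (theta.hat - theta)
# by that of b^beta * (theta.star - theta.hat), then invert.
z.star <- b^beta * (theta.star - theta.hat)
crit <- unname(quantile(z.star, probs = c(0.975, 0.025)))
ci <- theta.hat - crit / n^beta

ci
```

The only change from the square root law case is that the exponent 1 / 2 is replaced by beta in both the scaling of the subsample estimates and the final division.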

Answers

Answers in the back of the book are here.