Statistics 5601 (Geyer, Spring 2006) Examples: Smoothers

General Instructions

To do each example, just click the "Submit" button. You do not have to type in any R instructions or specify a dataset. That's already done for you.

Notes

The handout for smoothing is available in Adobe PDF format. Paper copies were handed out in class. No need to print out another if you got one in class.

Running Mean Smoother

As example data we use the cholostyramine data from Section 7.3 of Efron and Tibshirani.

Comments

The R function ksmooth (on-line help) does kernel regression smoothing. With kernel = "box", which is the default, it computes a running mean.

Note that this smooth is not actually very smooth. That is because the box kernel itself is not smooth: it is discontinuous, so the fitted curve jumps whenever a data point enters or leaves the averaging window. In the next section we do better.

Try with different bandwidth values.
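
For readers working outside this page, here is a minimal sketch of a running mean fit. It uses simulated stand-in data, not the cholostyramine data, and the names x, y, fit, and the bandwidth value 10 are all assumptions chosen only for illustration. The hand calculation at the end reflects how ksmooth scales the box kernel: the window extends half the bandwidth on each side of the target point.

```r
# Simulated stand-in data (NOT the cholostyramine data): a noisy curve.
set.seed(42)
x <- sort(runif(100, min = 0, max = 100))
y <- 0.01 * x^2 + rnorm(100, sd = 10)

# Running mean smoother: box kernel, fitted values at the observed x.
# bandwidth = 10 is an arbitrary illustrative choice.
fit <- ksmooth(x, y, kernel = "box", bandwidth = 10, x.points = x)

# Hand check at one point: average the y values whose x lies within
# bandwidth / 2 = 5 of the target point.
k <- 50
by_hand <- mean(y[abs(x - x[k]) <= 5])
all.equal(fit$y[k], by_hand)

plot(x, y)
lines(fit$x, fit$y)
```

Changing bandwidth in the call above and replotting shows the trade-off: a small bandwidth follows the noise, a large one flattens the curve.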

General Kernel Smoother

As example data we use the cholostyramine data from Section 7.3 of Efron and Tibshirani.

Comments

Again we use the R function ksmooth (on-line help), this time with kernel = "normal".

Try with different bandwidth values.
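
As a rough sketch of the difference the kernel makes, the following fits both kernels to the same simulated stand-in data (again not the cholostyramine data). The variable names and the bandwidth value are assumptions for illustration only.

```r
# Simulated stand-in data (NOT the cholostyramine data): a noisy curve.
set.seed(42)
x <- sort(runif(100, min = 0, max = 100))
y <- 0.01 * x^2 + rnorm(100, sd = 10)

# Same bandwidth, two kernels. Without x.points, ksmooth evaluates
# the fit on an equally spaced grid over the range of x.
box_fit    <- ksmooth(x, y, kernel = "box",    bandwidth = 10)
normal_fit <- ksmooth(x, y, kernel = "normal", bandwidth = 10)

plot(x, y)
lines(box_fit, lty = 2)   # dashed: running mean
lines(normal_fit)         # solid: normal (Gaussian) kernel
```

Both fits are weighted averages of the observed y values, so neither can leave the range of y. The normal kernel weights change gradually with distance, which is what removes the jumps.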

Local Polynomial Smoother

As example data we use the cholostyramine data from Section 7.3 of Efron and Tibshirani.

Comments

The R function locpoly (on-line help) does local polynomial smoothing.

Try with different bandwidth values.

The function locpoly is in the R package KernSmooth so you must do library(KernSmooth) before using it.
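
A minimal sketch of a local linear fit follows, again on simulated stand-in data rather than the cholostyramine data. The names x, y, fit and the values degree = 1 and bandwidth = 5 are assumptions for illustration. Note that locpoly has its own bandwidth scale, so a value that works well in ksmooth need not carry over unchanged.

```r
library(KernSmooth)

# Simulated stand-in data (NOT the cholostyramine data): a noisy curve.
set.seed(42)
x <- sort(runif(100, min = 0, max = 100))
y <- 0.01 * x^2 + rnorm(100, sd = 10)

# Local linear smoother: fit a weighted straight line near each grid
# point. bandwidth = 5 is an arbitrary illustrative choice.
fit <- locpoly(x, y, degree = 1, bandwidth = 5)

# The result is a list with components x (grid) and y (fitted values).
plot(x, y)
lines(fit$x, fit$y)
```

Setting degree = 2 in the same call gives a local quadratic fit instead.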

Smoothing Spline

As example data we use the cholostyramine data from Section 7.3 of Efron and Tibshirani.

Comments

The R function smooth.spline (on-line help) does spline smoothing.

Try with different df values. (A smoothing spline has no bandwidth; the df argument plays the corresponding role.)

One can specify the smoothing parameter by using the spar argument instead of the df argument. The former controls the amount of roughness penalty (larger values give a smoother fit); the latter is analogous to degrees of freedom (the effective number of parameters) of the regression function.
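
The two ways of specifying the smoothing parameter can be sketched as follows, on simulated stand-in data (not the cholostyramine data). The variable names and the values df = 5 and spar = 0.8 are assumptions chosen only for illustration.

```r
# Simulated stand-in data (NOT the cholostyramine data): a noisy curve.
set.seed(42)
x <- sort(runif(100, min = 0, max = 100))
y <- 0.01 * x^2 + rnorm(100, sd = 10)

# Specify the effective degrees of freedom ...
fit_df <- smooth.spline(x, y, df = 5)

# ... or specify the penalty-related parameter spar directly.
fit_spar <- smooth.spline(x, y, spar = 0.8)

# Each fitted object reports both quantities, whichever was supplied.
c(df = fit_df$df,   spar = fit_df$spar)
c(df = fit_spar$df, spar = fit_spar$spar)

plot(x, y)
lines(fit_df)
lines(fit_spar, lty = 2)
```

Printing both quantities for each fit shows how a given df corresponds to a spar value on this particular data set.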

One can omit any specification of the smoothing parameter. Then a supposedly optimal choice is made (by default, by generalized cross-validation). See the web page about bandwidth selection for more on this.
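
A short sketch of the automatic choice, again on simulated stand-in data rather than the cholostyramine data. The names fit_auto and pred and the prediction points are hypothetical.

```r
# Simulated stand-in data (NOT the cholostyramine data): a noisy curve.
set.seed(42)
x <- sort(runif(100, min = 0, max = 100))
y <- 0.01 * x^2 + rnorm(100, sd = 10)

# No df and no spar: smooth.spline picks the smoothing parameter itself.
fit_auto <- smooth.spline(x, y)

# Inspect what was chosen.
c(df = fit_auto$df, spar = fit_auto$spar)

# Evaluate the fitted curve at new x values (arbitrary points here).
pred <- predict(fit_auto, x = c(25, 50, 75))

plot(x, y)
lines(fit_auto)
points(pred$x, pred$y, pch = 19)
```

Comparing the reported df with a hand-picked value is a quick way to judge whether the automatic choice looks reasonable for a given data set.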