LOESS (also called LOWESS) is a very complicated nonparametric
regression procedure done by the R function
loess
(on-line help)
or by an older function lowess
(on-line help) that does not work like most other R regression
functions.
We would really need a whole semester to understand how nonparametric
regression works. Efron and Tibshirani (Section 7.3) make an attempt
to explain this, but it is very sketchy. We'll just treat
the loess function as a black box that does regression prediction.
When the fit was made using surface = "interpolate" (the
default), predict.loess will not extrapolate,
so points outside
an axis-aligned hypercube enclosing the original data will have
missing (NA) predictions and standard errors.
So that means we have to change the setting of surface,
wherever that is specified. Looking back in the
on-line help for the function loess
we don't see anything
about surface, but there is a link to the
on-line help for the function loess.control,
from which we see that setting the control
argument of loess to loess.control(surface = "direct")
is what's needed.
If that seems clear as mud, it is. If you want to become an R expert,
this kind of grovelling in the documentation is part of the process.
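Here is a minimal sketch of what that looks like (the data vectors x
and y and the prediction points x.new are hypothetical, not from these
notes):

    out <- loess(y ~ x, control = loess.control(surface = "direct"))
    pred <- predict(out, newdata = data.frame(x = x.new), se = TRUE)
    pred$fit     # predictions, now defined even outside the range of the data
    pred$se.fit  # standard errors

With surface = "direct" the fit is computed exactly at each prediction
point rather than interpolated from a grid, which is slower but allows
extrapolation.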
Cubic smoothing splines with smoothing parameter chosen by
generalized cross-validation are another very complicated nonparametric
regression procedure, this one done by the R function
gam
(on-line help).
This procedure is perhaps even more complicated than LOESS.
At least its theory is more complicated.
So, as with LOESS, we'll just treat it as a black box that does
regression prediction.
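A minimal sketch of the corresponding fit, assuming the gam function is
the one from the mgcv package and using the same hypothetical x, y, and
x.new as above (bs = "cr" asks for a cubic regression spline basis;
mgcv chooses the smoothing parameter by GCV by default):

    library(mgcv)
    out <- gam(y ~ s(x, bs = "cr"))
    pred <- predict(out, newdata = data.frame(x = x.new), se.fit = TRUE)
    pred$fit     # predictions
    pred$se.fit  # standard errors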
Note that the curves gam draws are generally more wiggly
than the curves loess draws.
Note that pred.star is the predicted y values
corresponding to the original x values, even though
the fit out.star corresponds to the bootstrap data
(x.star, y.star). Then y - pred.star
gives the predicted residuals at the original x values.
The reason for this is explained in Section 17.6 in Efron and Tibshirani.
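A minimal sketch of how that calculation might go (x, y, and the number
of bootstrap iterations nboot are hypothetical; the variable names
follow the text):

    n <- length(x)
    mspe.loess <- double(nboot)
    mspe.gam <- double(nboot)
    for (i in 1:nboot) {
        k.star <- sample(n, replace = TRUE)   # bootstrap the cases
        x.star <- x[k.star]
        y.star <- y[k.star]
        # fit to the bootstrap data, but predict at the original x values
        out.star <- loess(y.star ~ x.star,
            control = loess.control(surface = "direct"))
        pred.star <- predict(out.star, newdata = data.frame(x.star = x))
        mspe.loess[i] <- mean((y - pred.star)^2)
        out.star <- gam(y.star ~ s(x.star, bs = "cr"))
        pred.star <- predict(out.star, newdata = data.frame(x.star = x))
        mspe.gam[i] <- mean((y - pred.star)^2)
    }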
Having calculated MSPE for both procedures, we plot it two ways.
First, we look at histograms, one on top of the other, with scales
aligned so they are comparable.
Second, we look at a scatter plot. The line is where the two
MSPE's are equal. Note that there are only a few points above the
line. Those correspond to the bootstrap samples where LOESS does
better than the smoothing spline (as measured by MSPE). So LOESS
is usually worse.
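A minimal sketch of the two plots (assuming the mspe.loess and mspe.gam
vectors from the loop above):

    par(mfrow = c(2, 1))   # histograms one above the other
    breaks <- hist(c(mspe.loess, mspe.gam), plot = FALSE)$breaks
    hist(mspe.loess, breaks = breaks, main = "LOESS")
    hist(mspe.gam, breaks = breaks, main = "smoothing spline")
    par(mfrow = c(1, 1))
    plot(mspe.loess, mspe.gam)   # scatter plot
    abline(0, 1)                 # line where the two MSPE's are equal

Points above the line have mspe.gam greater than mspe.loess, that is,
LOESS doing better.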
Bootstrap color smears aren't very much like confidence intervals, although
many people treat them as confidence intervals.
MSPE is the sum of three terms:
   1. error variance,
   2. prediction variance,
   3. prediction bias squared.
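In symbols, this is the usual decomposition (not spelled out in these
notes), writing g for the true regression function and g-hat for the
estimator:

    \mathrm{MSPE}(x) = \underbrace{\sigma^2}_{\text{error variance}}
        + \underbrace{\operatorname{var}\{\hat{g}(x)\}}_{\text{prediction variance}}
        + \underbrace{\bigl[E\{\hat{g}(x)\} - g(x)\bigr]^2}_{\text{prediction bias squared}}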
The color smears only reflect the second term. They completely ignore the
bias term. (The first term is inherent in the problem and cannot be reduced
by better estimation.) In fact, the more biased a procedure is,
the better it looks on the color smears, because there is usually
a bias-variance trade-off. Procedure parameters can be adjusted to lower
term 2 or to lower term 3, but not both at the same time. Term 2 can be
made nearly zero if one doesn't mind term 3 skyrocketing. Color smears,
reflecting only term 2, are a good way to make this mistake.
As far as I know, constructing good bootstrap confidence bands for
nonparametric regression is an open research problem (meaning nobody
really knows how to do it).
Bias in regression is an issue about conditional expectation.
Is the conditional expectation of the response variable given
the predictor variable equal to the expectation of the prediction
the nonparametric regression routine gives at that value of the
predictor?
Since it is a question about the conditional model, we should bootstrap
residuals, not cases.
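A minimal sketch of the residual bootstrap for this (assuming the gam
fit and the hypothetical x, y, and nboot from above):

    out <- gam(y ~ s(x, bs = "cr"))
    pred <- predict(out)
    resid <- y - pred
    pred.star <- matrix(NA, nboot, length(x))
    for (i in 1:nboot) {
        y.star <- pred + sample(resid, replace = TRUE)   # bootstrap residuals
        out.star <- gam(y.star ~ s(x, bs = "cr"))
        pred.star[i, ] <- predict(out.star)
    }
    bias.hat <- colMeans(pred.star) - pred   # bootstrap estimate of the bias
    pred.corrected <- pred - bias.hat        # the bias-corrected estimate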
Looking at the plot we see the black line (the regression prediction
from the nonparametric routine) is biased. The blue line (the
bootstrap bias-corrected estimate) is more wiggly. This is typical
of nonparametric regression. The estimates are always biased
and always less wiggly than the truth. "They erode the peaks and
fill in the valleys," as the catchphrase about this goes.
So why don't we use this bias-corrected estimate? Isn't bias bad
and bias correction good? No. If we compared mean square prediction
error for the procedure implemented by the routine and our supposedly
better bias-corrected predictions, we would find that decreasing the bias
increases mean square prediction error! There is a bias-variance
trade-off, and the routine R provides chooses the right spot on the
bias-variance trade-off curve (this is shown by one of the plots made
by the gam.check function).
So some bias is good! Attempting to be unbiased is the stupidest thing
you can do in nonparametric regression.
Because of the way nonparametric regression works, the more wiggly
the true regression curve, the more the bias. So once we decide the
true regression curve is more wiggly than we previously thought,
we have to conclude that the bias is also bigger than we previously
estimated. Thus we should iterate the bias estimation.
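A minimal sketch of the iteration (boot.bias here is a hypothetical
helper, not an R function: it reruns the residual bootstrap above with
its argument playing the role of the true curve and returns the
estimated bias):

    bias.hat <- boot.bias(pred)
    for (iter in 1:5) {   # a few rounds of re-estimation
        pred.corrected <- pred - bias.hat
        bias.hat <- boot.bias(pred.corrected)
    }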