Next: burg() Up: Time Series Macros Help Previous: autocov()   Contents


Keywords: spectrum analysis, frequency domain
Suppose a spectrum estimate is computed by circular convolution (see
MacAnova topic convolve()) of weights w[1], w[2], ..., w[nfreq] with a
periodogram computed from an untapered time series at nfreq equally
spaced frequences 0,1/nfreq, 2/nfreq, ..., (nfreq-1)/nfreq cycles per
delta_t where delta_t is the time interval between observations.

An important characteristic of the estimate is its bandwidth, the
effective frequency range over which appreciable smoothing occurs.  The
greater the bandwidth, the more stable (smaller variance) is the
resulting estimate, but the greater the potential for bias because of
the smoothing.  Conversely, the smaller the bandwidth, the smaller is
the bias, at the cost of increased variance of the estimate.

A common definition of the bandwidth associated with a periodogram
computed at nfreq frequencies and smoothed by w[i] is
          B = {sum(w[i])^2/sum(w[i]^2)}/(nfreq*delta_t).

This is in units of cycles per unit time.

When {w[i]} is a "box car" of length m, that is w[i] = 0 except for m
consecutive values of i for which w[i] is constant, then B =
m/(nfreq*delta_t).  In the common situation when nfreq ~= 2*N, where N
= nrows(y), B = .5*m/(N*delta_t).

A common type of smoothing weights are obtained by convolving
(non-circularly) a box car with itself P times.  This is sometimes
called the P-th convolution power of the box car.

For P = 2, the a graph of the weights is a triangle; for higher P the
graph becomes bell shaped.  For large P the graph approximates a normal
curve.  MacAnova macros spectrum(), crsspectrum() and compfa() all use
the 4-th convolution power of a box car as the smoothing weights.

When the weights are computed as the P-th convolution power of a box
car of length m, sum(w^2)/sum(w)^2 and sum(w)^2/sum(w^2) are as follows.
    P             sum(w^2)/sum(w)^2                  sum(w)^2/sum(w^2)
    1     1/m                                               m
    2     (2*m^3+m)/(3*m^4)                       1.5*m-.75/m+O(m^-3)
    3     (11*m^5+5*m^3+4*m)/(20*m^6)            1.818*m-.826/m+O(m^-3)
    4     (151*m^7+70*m^5+49*m^3+45*m)/(315*m^8) 2.086*m-.967/m+O(m^-3)

Thus, when nfreq ~= 2*N, the default smoothing using the 4th convolution
power of a boxcar smoother of length m, has bandwidth approximately
     B = (2.09*m - .97/m)/(nfreq*delta_t) ~= 1.043*m/(N*delta_t)

                  Equivalent degrees of freedom (EDF)
One common way to summarize the stability of a spectrum estimator is by
its equivalent degrees of freedom or EDF.  Large EDF means smaller
relative standard deviation and thus greater stability.

For a positive random variable W, EDF[W] = 2*E[W]^2/Var[W] = 2/CV[W]^2,
where CV[W] = SD[W]/E[W] is the coefficient of variation.

When W = k*X^2, where X^2 is distributed as chisq(f) (chi-squared on f
degrees of freedom), EDF[W] = f.

Even when W is not a multiple of chi-squared, its distribution may
often be well approximated by that of (E[W]/EDF[W])*chisq(EDF).

When the spectrum is sufficiently smooth, the approximate EDF of a
smoothed periodogram with bandwidth B at a frequency f between B/2 and
.5/deltat_t - B/2 is approximately 2*B*N*delta_t.  At 0 and .5/deltat_t
the EDF is B*N*deltat_t.  For the default smoother, EDF ~=
(4.17*m-1.93/m)*N.  Expressing m in terms of EDF you have the
approximate relationship m ~= EDF/4.17 + 1.93/EDF

                      Using a Taper or Data Window
In most situations it is desirable to "taper" the time series before
computing the periodogram.  If yr[i] = y[i] - muhat[i], where muhat[i]
is an estimate of E[y[i]], the tapered time series is h[i]*yr[i],
where (usually) h[i] tapers smoothly to 0 near i = 1 and i = N.  The
sequence h[1],...,h[N] is called a taper or a data window.

Tapering reduces "leakage" -- bias at a given frequency arising from
variation a distant frequencies.  However, it also reduces the EDF of a
smoothed periodogram by a factor of approximately
             R = sum(h^2)^2/{N*sum(h^4)} <= 1
thus increasing the variance.  The greater the amount of tapering, the
smaller is R.

Macro compfa(), which computes estimated spectra and, optionally, cross
spectra, uses a "cosine" taper which tapers approximately A*N
observations on each end, where 0 <= A <= .5.  For this taper, the
factor by which the EDF is reduced is

 R = (1 - 5*A/8)^2/(1 - 93*A/128) = 1 - 0.5234*A + 0.0103*A^2 + O(A^3).

Therefore, for the default smoothing, with 100*A percent tapering on
each end and nfreq ~= 2*N, EDF ~= (1 - 0.5234*A)*1.043*m*N.

See costaper() for details on the form of a cosine taper.

Gary Oehlert 2003-01-15