## Breakdown Point

The breakdown point of an estimator (or a related procedure like a test or a confidence interval) is the fraction of the data that can be complete junk without destroying the estimator. More precisely, it is the fraction of data that can be dragged to infinity with the estimator remaining bounded.

Breakdown point is one measure of robustness.

High breakdown point means highly robust (which is good).

For the one-sample location estimators we have the following

| Estimator | Breakdown point |
|---|---|
| sample mean | 0% |
| sample median of Walsh averages | 29.3% |
| sample median | 50% |
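
These thresholds can be checked empirically. The following is a minimal sketch, not part of the original notes: it replaces k of 100 simulated observations with a huge value and finds how many such junk points each estimator tolerates before its value blows up. The helper names (`hodges_lehmann`, `contaminate`, `is_bounded`), the junk value 1e12, and the bound 1e6 are all illustrative assumptions.

```python
import random
import statistics


def hodges_lehmann(xs):
    """Median of the Walsh averages (x_i + x_j) / 2 over all pairs i <= j."""
    n = len(xs)
    walsh = [(xs[i] + xs[j]) / 2 for i in range(n) for j in range(i, n)]
    return statistics.median(walsh)


def contaminate(xs, k, junk=1e12):
    """Return a copy of xs with the first k observations replaced by junk."""
    return [junk] * k + list(xs[k:])


def is_bounded(estimator, xs, k, limit=1e6):
    """True if the estimate stays below `limit` with k junk observations."""
    return abs(estimator(contaminate(xs, k))) < limit


rng = random.Random(42)  # fixed seed so the run is reproducible
data = [rng.gauss(0, 1) for _ in range(100)]

estimators = {
    "sample mean": statistics.mean,
    "median of Walsh averages": hodges_lehmann,
    "sample median": statistics.median,
}

for name, estimator in estimators.items():
    # Largest number of junk points (out of 100) the estimator tolerates.
    tolerated = max(k for k in range(100) if is_bounded(estimator, data, k))
    print(f"{name}: survives {tolerated} junk observations out of 100")
```

With n = 100 this should report 0, 29, and 49 tolerated junk observations, in line with the asymptotic breakdown points of 0%, 29.3%, and 50%.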

## Efficiency

The asymptotic relative efficiency (ARE) of two estimators is the ratio of sample sizes needed to get equal accuracy. It equals the inverse of the ratio of their asymptotic variances, so the estimator with the smaller asymptotic variance needs proportionally fewer observations. Associated tests and confidence intervals have the same ARE as the estimators.
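
As a rough illustration of reading ARE as a variance ratio, the sketch below estimates the efficiency of the sample median relative to the sample mean on simulated standard normal data. The sample size, replication count, seed, and the function name `simulated_relative_efficiency` are arbitrary choices made for illustration, and the finite-sample ratio only approximates the asymptotic value 2/π ≈ 0.637.

```python
import random
import statistics


def simulated_relative_efficiency(n=101, reps=2000, seed=1):
    """Estimate Var(sample mean) / Var(sample median) for N(0, 1) samples."""
    rng = random.Random(seed)
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(0, 1) for _ in range(n)]
        means.append(statistics.fmean(sample))
        medians.append(statistics.median(sample))
    # Efficiency of the median relative to the mean: inverse variance ratio.
    return statistics.variance(means) / statistics.variance(medians)


ratio = simulated_relative_efficiency()
print(f"efficiency of median relative to mean: {ratio:.3f}")
```

A ratio near 0.64 says the median needs roughly 1 / 0.64 ≈ 1.6 times as many normal observations as the mean to reach the same accuracy.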

Efficiency depends on the true unknown distribution of the data. Thus we never know in practice what the efficiency is.

One interesting family of distributions to consider consists of the Student t distributions and their limit as the degrees of freedom go to infinity, which is the normal distribution.

The table below gives the ARE for various estimators and various true population distributions against the maximum likelihood estimator, which is the most efficient asymptotically.

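| Estimator | Normal | t(30) | t(20) | t(10) | t(5) | t(4) | t(3) | t(2) | t(1) |
|---|---|---|---|---|---|---|---|---|---|
| sample mean | 1.000 | 0.993 | 0.986 | 0.945 | 0.800 | 0.700 | 0.500 | 0.000 | 0.000 |
| sample median of Walsh averages | 0.955 | 0.975 | 0.983 | 0.996 | 0.993 | 0.981 | 0.950 | 0.867 | 0.608 |
| sample median | 0.637 | 0.666 | 0.680 | 0.716 | 0.769 | 0.788 | 0.811 | 0.833 | 0.811 |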
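
Values like those above for the t distributions can be reproduced from standard closed-form asymptotic variances. This is a sketch, not taken from the original notes. It assumes the usual formulas: asymptotic variance σ² for the sample mean, 1 / (4 f(0)²) for the sample median, 1 / (12 (∫ f²)²) for the median of Walsh averages, and 1 / I for the MLE, where f is the density and I is the Fisher information for location. The function names are made up for illustration.

```python
import math


def t_density_constant(nu):
    """Normalizing constant of the t(nu) density, which is also f(0)."""
    return math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))


def fisher_information(nu):
    """Fisher information for the location parameter of t(nu)."""
    return (nu + 1) / (nu + 3)


def are_mean(nu):
    """ARE of the sample mean; zero when the variance is infinite (nu <= 2)."""
    if nu <= 2:
        return 0.0
    variance = nu / (nu - 2)
    return 1 / (variance * fisher_information(nu))


def are_median(nu):
    """ARE of the sample median: 4 f(0)^2 / I."""
    return 4 * t_density_constant(nu) ** 2 / fisher_information(nu)


def are_walsh_median(nu):
    """ARE of the median of Walsh averages: 12 (integral of f^2)^2 / I."""
    c = t_density_constant(nu)
    # Integral of (1 + x^2/nu)^-(nu+1) over the real line, via the beta function.
    integral = math.sqrt(nu) * math.gamma(0.5) * math.gamma(nu + 0.5) / math.gamma(nu + 1)
    return 12 * (c * c * integral) ** 2 / fisher_information(nu)


for nu in (30, 20, 10, 5, 4, 3, 2, 1):
    print(f"t({nu}): mean {are_mean(nu):.3f}, "
          f"Walsh median {are_walsh_median(nu):.3f}, median {are_median(nu):.3f}")
```

The zeros for t(2) and t(1) reflect the infinite variance of those distributions: the sample mean has no finite asymptotic variance there, so its efficiency against the MLE is zero.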

Another interesting comparison is to choose population distributions for which the various estimators are fully efficient.

• The sample mean is fully efficient (it is the MLE) for the normal distribution.
• The sample median is fully efficient (it is the MLE) for the Laplace (also called double exponential) distribution.
• The Hodges-Lehmann estimator (sample median of Walsh averages) is fully efficient for the logistic distribution. Interestingly, it is not the MLE, although asymptotically equivalent to the MLE (almost the same for large sample sizes).
• The hyperbolic secant distribution is thrown in as an interesting example where the sample mean and median are equally efficient (neither fully efficient).

We won't give the exact formulas (the curious can follow the links above); rather, we just note that the normal distribution has very light tails [proportional to exp(−x² / 2)] and the other three have the same moderately light tails [proportional to exp(−|x|)]. So the differences among the other three come not from tail behavior but from the precise details of their densities.

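| Estimator | Normal | Laplace | Logistic | Hyperbolic Secant |
|---|---|---|---|---|
| sample mean | 1.000 | 0.500 | 0.912 | 0.811 |
| sample median of Walsh averages | 0.955 | 0.750 | 1.000 | 0.986 |
| sample median | 0.637 | 1.000 | 0.750 | 0.811 |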
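
These efficiencies follow from four constants per distribution. The sketch below is an illustration under stated assumptions, not the notes' own derivation. Each distribution is taken in a standard unit-scale parametrization, and the constants (variance, Fisher information for location, density at the center, and integral of the squared density) are standard closed-form values entered by hand. The efficiencies against the MLE are then 1 / (σ² I) for the mean, 12 (∫ f²)² / I for the median of Walsh averages, and 4 f(0)² / I for the median. ARE does not depend on scale, so the choice of parametrization is immaterial.

```python
import math

# (variance, Fisher information, f(0), integral of f^2) for unit-scale densities:
#   normal:            exp(-x^2 / 2) / sqrt(2 pi)
#   Laplace:           exp(-|x|) / 2
#   logistic:          exp(-x) / (1 + exp(-x))^2
#   hyperbolic secant: sech(pi x / 2) / 2
CONSTANTS = {
    "Normal": (1.0, 1.0, 1 / math.sqrt(2 * math.pi), 1 / (2 * math.sqrt(math.pi))),
    "Laplace": (2.0, 1.0, 0.5, 0.25),
    "Logistic": (math.pi ** 2 / 3, 1 / 3, 0.25, 1 / 6),
    "Hyperbolic Secant": (1.0, math.pi ** 2 / 8, 0.5, 1 / math.pi),
}


def efficiencies(variance, info, f0, int_f2):
    """Return AREs of (mean, median of Walsh averages, median) against the MLE."""
    mean = 1 / (variance * info)
    walsh_median = 12 * int_f2 ** 2 / info
    median = 4 * f0 ** 2 / info
    return mean, walsh_median, median


table = {name: efficiencies(*consts) for name, consts in CONSTANTS.items()}
for name, (mean, walsh_median, median) in table.items():
    print(f"{name}: mean {mean:.3f}, Walsh median {walsh_median:.3f}, median {median:.3f}")
```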

The Rweb below graphs the densities of the distributions in the table above.