Recent advances in statistics have produced extremely general, yet
simple methods for estimating confidence intervals and other
information about the distributions of sample statistics, using a
resampling technique known as the bootstrap.^{5.7} The bootstrap method may be used in an
attempt to estimate any aspect of the distribution of any statistic
(Efron 1982:31). The technique generates a Monte Carlo approximation
to the non-parametric Maximum Likelihood Estimate of the statistic of
interest (p. 33). I will first discuss how the bootstrap method works,
and then show how I use it in the context of displaying sets of F1-F2
measurements.

The bootstrap resampling technique works as follows: We are given a
sample S1 of N observations. Create a new sample of the same size by
randomly choosing N data points from S1, with replacement, where the
probability of choosing any data point at any time is constant, at
1/N. Then calculate the desired statistic using this new sample; this
is a single re-estimate of that statistic. Repeat this procedure some
large number of times (e.g., 200, 1000), and consider the distribution
of the resampled statistics.^{5.8}
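The resampling loop just described can be sketched in a few lines of Python. This is a minimal illustration, not the software used in this study; the function and parameter names are my own:

```python
import random

def bootstrap(sample, statistic, n_resamples=1000):
    """Return re-estimates of `statistic` from resamples of `sample`.

    Each resample is built by drawing N points from the original
    sample with replacement, each point chosen with constant
    probability 1/N.
    """
    n = len(sample)
    re_estimates = []
    for _ in range(n_resamples):
        resample = [random.choice(sample) for _ in range(n)]
        re_estimates.append(statistic(resample))
    return re_estimates
```

For example, `bootstrap(data, lambda s: sum(s) / len(s), 200)` yields 200 re-estimated means, whose spread can then be examined.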

The re-estimated statistics fall into a certain distribution, which may be viewed as a histogram. This histogram is an approximate estimate of the sampling distribution of the original statistic, and any function of that sampling distribution may be estimated by computing the function from the histogram. For example, the standard deviation of the re-estimated means is a reliable estimate of the standard deviation of the mean (that is, the amount of scatter inherent in the statistic: the average distance from the mean at which other similarly constructed means would fall). Another application is to estimate confidence intervals for a given statistic (Efron 1982:78,80), though this is less reliable, since the problem is more difficult, requiring more precise estimates of every detail of the distribution.
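Both uses of the histogram just mentioned can be computed directly from the list of re-estimates. The sketch below is illustrative only; the percentile interval shown is just one of the interval constructions Efron discusses, and the function names are my own:

```python
import math

def bootstrap_sd(re_estimates):
    """Sample standard deviation of the bootstrap re-estimates."""
    n = len(re_estimates)
    mean = sum(re_estimates) / n
    return math.sqrt(sum((e - mean) ** 2 for e in re_estimates) / (n - 1))

def percentile_interval(re_estimates, alpha=0.05):
    """Central (1 - alpha) percentile interval of the re-estimates."""
    s = sorted(re_estimates)
    n = len(s)
    lo = s[int(math.floor((alpha / 2) * n))]
    hi = s[min(n - 1, int(math.ceil((1 - alpha / 2) * n)) - 1)]
    return lo, hi
```

The standard deviation of the re-estimated means estimates the scatter of the statistic; the percentile interval brackets the central mass of the histogram.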

``Bootstrap methods apply just as well to many-sample situations and to a variety of more complicated data structures'' (Efron 1982:35). In the present situation, we are interested in estimating the sampling distribution of the mean of the set of F1,F2 measurements taken for a given phonological class. This is a bivariate problem (i.e., each measured data point is two-dimensional), a data structure to which the bootstrap applies. The sampling distribution of the mean is one aspect of the distribution of a statistic, so the bootstrap technique applies to it as well. The result of applying the bootstrap technique is, again, a Monte Carlo approximation to the non-parametric Maximum Likelihood Estimate of the statistic of interest. Here, this means that the distribution of the re-estimated means generated by the bootstrap technique is an approximation of the sampling distribution of the mean.
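In the bivariate case, each resample draws whole (F1, F2) pairs, so the joint structure of the two formants is preserved. A sketch, assuming the hypothetical function name below and made-up formant values in the usage note:

```python
import random

def bootstrap_means_2d(points, n_resamples=200):
    """Bootstrap re-estimates of the mean of paired (F1, F2) measurements.

    Drawing whole pairs with replacement keeps F1 and F2 together,
    so the returned cloud of means approximates the joint sampling
    distribution of the mean in F1-F2 space.
    """
    n = len(points)
    means = []
    for _ in range(n_resamples):
        resample = [random.choice(points) for _ in range(n)]
        f1_mean = sum(f1 for f1, _ in resample) / n
        f2_mean = sum(f2 for _, f2 in resample) / n
        means.append((f1_mean, f2_mean))
    return means
```

Plotting the returned list of (F1, F2) means produces exactly the small cloud of re-estimated means described below.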

One way to look at a distribution of bootstrapped re-estimates is to consider that the ``true'' value of the statistic could with equal probability be any one of the re-estimated values. Thus the distribution of re-estimated statistics shows the range within which the true statistic (given infinite similar data) might fall, and the density of re-estimates at any one point shows how likely the true statistic is to fall at that location. The true value of the statistic (given an infinite sample) is most likely to lie where the re-estimated statistics are most tightly clustered.

The bootstrap method can be used in many ways, but I will use it rather cautiously, simply as a method for displaying estimated means in F1-F2 space. A number of plots in later chapters display the distribution of 200 or more bootstrapped re-estimates of the mean of a cloud of F1-F2 measurements. These bootstrapped mean distributions are to be interpreted as estimating the range within which the true mean (given an infinite sample of similar data) is located. These distributions are small clouds on the page; the true mean is no more precisely located, according to the given sample of data, than the area covered by that cloud. If two clouds do not overlap, or overlap by less than, say, 5%, then the difference between the two categories is statistically significant.

This enables a visual evaluation of the significance of differences among measurements for various phonological categories. If all the clouds on the page are non-overlapping, then they are pairwise statistically significantly different, and the relationships among the clouds provide a rough estimate of the relationships among the true means of the distributions from which the measured data were sampled (more precise estimates are given by the point-to-point differences among the original means, which lie at the centers of the clouds).