Bootstrap
Description and a simple example: You give the bootstrap algorithm a set of values from a single sample. If you knew the data were normally distributed, you could calculate the mean and a confidence interval around the mean directly from that sample. But if you know that the underlying distribution of the variable of interest is not normal, bootstrap resampling may give you a more accurate confidence interval for the mean.
The bootstrap randomly resamples your original sample with replacement: it draws one value at a time and returns that value to the pool before the next draw, so some values may be drawn more than once and some not at all. Drawing continues until the bootstrap sample reaches the chosen size, which is often the size of the original sample. For each bootstrap sample, the mean (or some other statistic) is calculated. The whole process is repeated hundreds of times, and the resulting estimates of the mean are used to define the confidence interval around the mean.
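The resampling steps described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `bootstrap_mean_ci`, the sample values, and the default of 1,000 repetitions are all made up for the example, and the interval is taken with the simple percentile method, which is only one of several ways to turn bootstrap estimates into a confidence interval.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (illustrative)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n values with replacement from the original sample.
        resample = rng.choices(values, k=n)
        means.append(statistics.fmean(resample))
    means.sort()
    # Percentile method (an assumption of this sketch): cut off the
    # lowest and highest alpha/2 share of the bootstrap means.
    lower = means[round((alpha / 2) * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical, visibly skewed sample; the values are invented.
sample = [2.1, 3.4, 1.9, 8.7, 2.5, 3.0, 12.4, 2.2, 4.1, 2.8]
lower, upper = bootstrap_mean_ci(sample)
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"95% bootstrap CI = ({lower:.2f}, {upper:.2f})")
```

Passing a different statistic in place of the mean (for example the median) would follow the same pattern; only the line that summarizes each resample changes.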
MAIA example: Klemm et al. 2002 used a bootstrap to resample invertebrate stream samples to assess the variability of multimetric index values. Resampling was done at the level of the individual invertebrates, and the index was calculated for each bootstrap sample. After 500 repetitions they had a bootstrap distribution for the index and could define the confidence interval associated with an index value. They found that the length of the confidence interval, that is, the variability of the index, was greater for invertebrate samples with fewer than 200 individuals.
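To make the idea of resampling at the level of individuals concrete, here is a rough Python sketch. It does not reproduce the index or the data of Klemm et al.: the metric `proportion_in_group` (the share of individuals in one taxon) is a hypothetical stand-in for a multimetric index, and the two samples, the taxon names, and the percentile interval are invented for illustration. Only the 500 repetitions come from the description above.

```python
import random


def proportion_in_group(individuals, group="mayfly"):
    """Hypothetical stand-in metric: share of individuals in one taxon."""
    return sum(1 for taxon in individuals if taxon == group) / len(individuals)


def bootstrap_ci_length(individuals, metric, n_resamples=500, alpha=0.05, seed=0):
    """Length of a percentile bootstrap interval for `metric` (illustrative)."""
    rng = random.Random(seed)
    n = len(individuals)
    # Resample individual organisms with replacement and recompute the
    # metric for each bootstrap sample.
    estimates = sorted(
        metric(rng.choices(individuals, k=n)) for _ in range(n_resamples)
    )
    lower = estimates[round((alpha / 2) * n_resamples)]
    upper = estimates[round((1 - alpha / 2) * n_resamples) - 1]
    return upper - lower


# Invented samples with the same taxon mix but different sizes.
small_sample = ["mayfly"] * 15 + ["midge"] * 35     # 50 individuals
large_sample = ["mayfly"] * 120 + ["midge"] * 280   # 400 individuals

small_len = bootstrap_ci_length(small_sample, proportion_in_group)
large_len = bootstrap_ci_length(large_sample, proportion_in_group)
print(f"CI length, 50 individuals:  {small_len:.3f}")
print(f"CI length, 400 individuals: {large_len:.3f}")
```

With this toy metric the smaller sample should generally give the longer interval, which is the same qualitative pattern the study reports for its index.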
Figure: Confidence intervals for multimetric index values were longer when the number of invertebrates collected was small.
How the method works: If the original sample was random and independent, the values in that sample provide the best available estimate of the distribution of values in the larger population. The bootstrap therefore treats the original sample as a stand-in for the population and draws new random samples from it.
Assumptions/limitations: The bootstrap assumes that the original sample was truly random and independent.