Uncertainties and Statistics

Reference: For a more extensive treatment see the online manual at < Uncertainties.html >

Variations in measurements

Suppose that you make several measurements of a physical quantity such as the period of a pendulum. You are not likely to get the same answer each time. Two main reasons for this are

1. The quantity being measured is somewhat variable. (E.g. your height varies depending on whether you stand straight, and it is different when you get out of bed than when you have been standing all day; your mass varies as you breathe in and out.)
2. The methods used for the measurement introduce some variation. (E.g. in measuring the pendulum period you rely on your visual observation to decide when to start and when to stop the watch.)

The measurement of the variations leads to the field of statistics. We will try to take many measurements of a quantity and use the average as a best estimate, but we also want an idea of how far the measurements were from the average. The terms error, standard deviation, and uncertainty are all used to describe the variation.
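As a minimal sketch of this idea, the Python snippet below computes the average and the standard deviation of a set of repeated measurements. The period values are invented for illustration and are not data from this manual.

```python
import statistics

# Hypothetical pendulum periods in seconds (invented for illustration).
periods = [2.01, 1.98, 2.03, 1.99, 2.00, 2.02, 1.97]

# The average is our best estimate of the period.
mean_period = statistics.mean(periods)

# The sample standard deviation describes how far individual
# measurements typically fall from the average.
spread = statistics.stdev(periods)

print(f"best estimate: {mean_period:.3f} s")
print(f"standard deviation: {spread:.3f} s")
```

Here statistics.stdev divides by the number of measurements minus one, which is the usual choice when the average is itself estimated from the same data.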

Random versus Systematic error

Suppose I measure the length of a wood block using a metal ruler in the following locations: Bogota, Mexico City, Nashville, Rochester, and Nome. I might get results that suggest that the wood block is longer when the country speaks English than when it speaks Spanish. In fact there is a systematic error relating to the temperature and the change in the length of the ruler in the different locations. Systematic errors occur when an uncontrolled (unmeasured) variable affects the data so that the values are always too large or always too small. A famous example is the newspaper that polled voters and declared that Dewey had beaten Truman in the 1948 presidential election. What was the systematic error? We will not discuss systematic errors any further.

Random errors cause the measurements to scatter around the average, with roughly equal numbers above and below. If we plot a histogram of many such measurements we usually get a shape close to the well-known Gaussian curve, also called a normal curve or a bell-shaped curve.
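The behavior of random errors can be illustrated with a small simulation. The sketch below assumes the errors are exactly Gaussian, and the "real" value and spread are hypothetical numbers chosen only for the demonstration.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

true_value = 2.00   # hypothetical "real" period in seconds
spread = 0.02       # hypothetical standard deviation of the random error

# Simulate many measurements with purely random (Gaussian) error.
measurements = [random.gauss(true_value, spread) for _ in range(10_000)]

average = sum(measurements) / len(measurements)
above = sum(1 for m in measurements if m > average)
below = sum(1 for m in measurements if m < average)

print(f"average: {average:.4f} s")
print(f"above the average: {above}, below the average: {below}")

# Crude text histogram: count measurements in bins half a standard
# deviation wide, covering -3 to +3 standard deviations.
bin_width = spread / 2
counts = [0] * 12
for m in measurements:
    index = int((m - (true_value - 3 * spread)) // bin_width)
    if 0 <= index < len(counts):
        counts[index] += 1

for i, count in enumerate(counts):
    left_edge = true_value - 3 * spread + i * bin_width
    print(f"{left_edge:.3f} {'#' * (count // 50)}")
```

The printed rows of # marks should be longest near the middle and shrink toward both ends, which is the bell shape turned on its side.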

Precision and Accuracy

Accuracy deals with how well the center of the curve matches the "real value" of what we are measuring. Accurate measurements have no systematic error. Suppose that the thickness of a piece of foam is supposed to be 4.48 cm. The diagram below shows two measurements of the foam, and in both cases the Gaussian curves are centered on 4.45 cm, meaning that the measurements are quite accurate.

The top curve represents measurements with a wider spread of values than the bottom curve. We say that the bottom curve is a more precise measurement of the thickness.
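The distinction can also be put in numbers. In the sketch below, the two data sets are invented to mimic the two curves (same center, different spread), and the helper name summarize is hypothetical. The offset of the center from the expected value describes accuracy, and the standard deviation describes precision.

```python
import statistics

expected = 4.48  # cm, the thickness the foam is supposed to have

# Two hypothetical sets of measurements in cm (invented for illustration).
wide = [4.35, 4.55, 4.45, 4.40, 4.50, 4.45]
narrow = [4.44, 4.46, 4.45, 4.45, 4.44, 4.46]

def summarize(data, expected_value):
    """Return (center, offset from expected value, spread) for a data set."""
    center = statistics.mean(data)
    spread = statistics.stdev(data)
    return center, center - expected_value, spread

for name, data in (("wide", wide), ("narrow", narrow)):
    center, offset, spread = summarize(data, expected)
    # offset measures accuracy; spread measures precision (smaller is better)
    print(f"{name}: center {center:.3f} cm, offset {offset:+.3f} cm, "
          f"spread {spread:.3f} cm")
```

Both sets have the same center and therefore the same accuracy, but the narrow set has the smaller spread and is the more precise one.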

Standard Deviation

The Gaussian is asymptotic to the axis (infinitely wide), so we need a method to specify the relative widths of the curves. The standard deviation, written s_x or simply s, is a measure of the width of a curve. The horizontal line shows the standard deviation (from the center to where the line crosses the curve).

We can represent the curve by a shaded box. It is darkest in the middle where most measurements occur, and fades out to zero as we go away from the center.

Approximately 2/3 of the measurements lie within 1 standard deviation of the center. If we go two standard deviations out, about 95% of the measurements are accounted for.
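These fractions can be checked for an ideal Gaussian curve using the error function from Python's math module. This is only a sketch: real data sets follow the Gaussian approximately, and the function name fraction_within is made up for this example.

```python
import math

def fraction_within(k):
    """Fraction of an ideal Gaussian within k standard deviations of its center."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
```

The results are about 68% for one standard deviation and about 95% for two, in line with the rule of thumb above.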

Comparing numbers

Suppose we have four measurements:

A (3.8 ± 0.5) cm, B (5.1 ± 0.5) cm, C (3.8 ± 0.2) cm, D (5.1 ± 0.3) cm. These are shown in the diagram to the right.

If we don't have the standard deviations we can only say that the measurements are close.

With standard deviations we can say that A and B are equal within uncertainties. That is, within 2s, A ranges from 2.8 to 4.8 cm, overlapping B, which ranges from 4.1 to 6.1 cm.

However, within 2s, C and D are not equal: C, from 3.4 to 4.2 cm, does not overlap D, from 4.5 to 5.7 cm.
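The comparison can be written as a short check. The sketch below uses the four measurements given above; the function names two_s_range and agree are hypothetical, and treating "equal within uncertainties" as overlap of the 2s ranges is the rule used in this section, not the only possible criterion.

```python
def two_s_range(value, s):
    """Return the (low, high) range covering value plus or minus 2 standard deviations."""
    return value - 2 * s, value + 2 * s

def agree(m1, m2):
    """True if the 2s ranges of two (value, s) measurements overlap."""
    low1, high1 = two_s_range(*m1)
    low2, high2 = two_s_range(*m2)
    return low1 <= high2 and low2 <= high1

# (value, standard deviation) in cm, taken from the text
A = (3.8, 0.5)
B = (5.1, 0.5)
C = (3.8, 0.2)
D = (5.1, 0.3)

print("A and B agree:", agree(A, B))  # True: 2.8 to 4.8 overlaps 4.1 to 6.1
print("C and D agree:", agree(C, D))  # False: 3.4 to 4.2 misses 4.5 to 5.7
```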