New Uncertainty Method Is Taking Hold

Refinements to traditional approach

by Philip Stein

If all measurements were without error, there would be very little for metrologists to do. The study of metrology (measurement science) is almost completely the study of uncertainty: how to avoid it, how to minimize it and how to quantify it.

In order to be complete, the expression of a measured quantity must include three items: a number representing the quantity, a unit of measure and a statement of the uncertainty associated with the number. That is a good beginning, but the nature of the statement of uncertainty--what's in it, how to express it and how to calculate the uncertainty in the first place--is of intense interest to metrologists.

Even more important, it must now be of equal interest to everyone in the quality field and most other engineering and scientific professionals as well. One requirement of the ISO 9000 series of quality system standards is to "identify all inspection, measuring and test equipment that can affect product quality and calibrate and adjust them at prescribed intervals, or prior to use, against certified equipment having a known valid relationship to internationally or nationally recognized standards."

That "known, valid relationship" is called traceability and requires all three of the following conditions to be met:

* There must be a connection to national or international standards.

* The connection must consist of an unbroken chain of comparisons.

* Each comparison must include stated uncertainties.

So calibrations of measuring equipment that affect quality must be traceable, and therefore most quality practitioners need at least a basic understanding of how uncertainty is stated. Even more important, the international metrology community has recently changed the accepted method of stating uncertainty, adding to the confusion.

In 1993, a group of international organizations, including the International Organization for Standardization (ISO), published the "Guide to the Expression of Uncertainty in Measurement" (known here as the GUM and restated as American National Standard ANSI/NCSL Z540-2; see note 1). The guide breaks no fundamental new ground, although it does redefine some of our more traditional thinking in new ways.

How it was and still is for some

In the past, we described measurement errors as being of two types, based on a view that the errors originated from two different mechanisms. The two types of errors were described as systematic errors, or bias, and random errors, or noise.

Indeed, this still is a useful description because many measurement systems behave in this way. The results obtained have some variability from point to point (as do all measurements). The short-term variation of these results is seen as noise. If you take a long-term average of the results, you get a stable value, or centerline, that may be offset, or biased, from the "correct" value. In most cases we don't really know the correct value, but through calibration and statistical studies we can estimate a true value and compare it to the average of the measurement values.

The real usefulness of this distinction is that often these errors arise from different mechanisms within the measurement system, and by separating them in the data, we give ourselves strong clues as to how to find the errors and improve the system.

This particular view, though, can also be misleading. That explains, at least in part, why the international metrology community and the new international standards are giving up this distinction.

I first truly understood why bias and noise can be inappropriate terms when studying some work by Raghu Kacker of the National Institute of Standards and Technology and Jack Wood of Ford. They were calculating the economic trade-offs of frequent calibration of an air gage used to measure a wrist pin bore in a piston. They observed what appeared to them to be a clear drift, a slow change in bias, superimposed on some noiselike variation. They had to stop their experiment after a few hours, but when I looked at their graphs, my intuition told me that the drift was unlikely to go on in the same direction for very long but would be more likely to turn around and drift the other way for a while. In other words, I was postulating that this drift was in fact very, very slow random noise.

In fact, the difference among bias, drift and noise is in our heads, where we make little models to try to explain what's going on. Nature, of course, doesn't care about our models. We determine by experience and training what it means for variation to be fast, called noise, or slow, called drift, or very slow, called bias. If the noise is slow enough, we can recalibrate the instrument and eliminate most of it.

What's different now?

In a word, everything. The GUM separates measurement uncertainty into two classes:

* Type A uncertainties, defined as "those which are evaluated by statistical methods."

* Type B uncertainties, defined as "those which are evaluated by other means."

These are two large categories, so let me give you some simple examples. First, imagine weighing a one-kilogram box of steel nails on a two-pan balance, using a set of brass weights. You repeat the measurement several times in order to determine statistically the part of the error that arises from the inability to repeat a measurement exactly.

The Type A uncertainty may be calculated (as the standard deviation) from the variation among the many "replicates" (repeats) of the same measurement.
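As a minimal sketch of that calculation, the Python below takes a handful of replicate weighings and returns their mean and sample standard deviation. The readings are invented for illustration, and the function name type_a_uncertainty is hypothetical rather than anything defined by the GUM.

```python
import statistics

def type_a_uncertainty(replicates):
    """Return (mean, sample standard deviation) of repeated readings."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicates")
    mean = statistics.mean(replicates)
    s = statistics.stdev(replicates)  # sample std dev (n - 1 in the denominator)
    return mean, s

# Hypothetical replicate weighings of the box of nails, in grams
readings_g = [1000.2, 999.8, 1000.1, 999.9, 1000.0]
mean_g, u_a_g = type_a_uncertainty(readings_g)
print(f"mean = {mean_g:.2f} g, Type A standard uncertainty = {u_a_g:.3f} g")
```

The sample (n - 1) form is used because a few replicates are only an estimate of the underlying variation.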

The Type B uncertainty comes from the difference between the weight set and national standard weights, from the difference in density (and therefore buoyancy) between brass and steel, from any consistent asymmetry in the balance and from other sources the metrologist in charge may choose to include. Type B uncertainties are also expressed as standard deviations. Note that inconsistent problems with the balance, changes in the buoyancy correction due to changes in weather and so on may show up as Type A uncertainty because they may change during the experiment.

The real reason we're interested in separating and allocating uncertainty to Types A and B is that the way to reduce or eliminate these uncertainties is different depending on which type they are. Many Type B uncertainties, for example, can be eliminated by calibration (just correct for the known, calibrated difference from the national standard, but be sure to retain the uncertainty of that calibration), or by correction (calculate and apply the buoyancy correction). Type A uncertainties are not stable from measurement to measurement, although their distribution may remain constant. They can be reduced only by averaging repeated measurements or by redesigning the measuring system.
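To make the buoyancy correction concrete, here is a rough sketch for the nails-and-brass-weights example. It assumes an ideal equal-arm balance, weights whose true mass is already known from calibration, and round-number densities for brass, steel and air. The function name and all the numbers are assumptions for illustration.

```python
def buoyancy_corrected_mass(weights_mass, rho_weights, rho_object, rho_air=1.2):
    """Mass of an object balanced against reference weights in air.

    Equal-arm balance equilibrium:
        m_obj * (1 - rho_air / rho_object) = m_w * (1 - rho_air / rho_weights)
    Densities are in kg/m^3; the mass can be in any unit.
    """
    return weights_mass * (1 - rho_air / rho_weights) / (1 - rho_air / rho_object)

# Assumed round-number densities, kg/m^3 (not taken from any certificate)
RHO_BRASS = 8400.0
RHO_STEEL = 7850.0

m_nails_g = buoyancy_corrected_mass(1000.0, RHO_BRASS, RHO_STEEL)
correction_mg = (m_nails_g - 1000.0) * 1000.0
print(f"corrected mass = {m_nails_g:.4f} g (correction {correction_mg:+.1f} mg)")
```

With these assumed densities the correction is on the order of 10 milligrams in a kilogram. Applying it removes most of that error, but the uncertainty in the densities themselves remains as a Type B contribution.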

Working the system

In order to conform to the requirements of the GUM, a measurement must include an uncertainty statement based on these two types, A and B. A full uncertainty analysis states the origin of each source of uncertainty (on a separate line, or even a separate page) and tells how it was measured or calculated. These explanations can range from brief to very wordy, but they should clearly communicate to the reader the details of what was done in order to arrive at the stated value.

At this point we must emphasize a statistical detail that can in some cases be very important but that is usually neglected by us and most other practitioners and that we will neglect here.

The individual sources of uncertainty must be uncorrelated; that is, the variation in one of them must not be related to variation in another. If, say, atmospheric pressure affects the buoyancy correction in the above example, it can't also affect another stated source of error such as scale bearing friction (in this example it probably won't). The GUM handles the case of correlated uncertainties with full statistical rigor, but most of the time we just think hard about this issue and decide it won't be a problem. If it looks like it will be a problem, it's time to call in some senior statisticians and metrologists.
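To see why correlation matters, consider a simplified sketch with two sources that add directly into the result. That is an assumption: the GUM's general treatment also includes sensitivity coefficients, which are left out here. The combined standard uncertainty picks up a cross term proportional to the correlation coefficient r. The function name and the numbers are hypothetical.

```python
import math

def combine_two(u1, u2, r=0.0):
    """Combined standard uncertainty of two additive sources.

    r is the correlation coefficient between them (-1 <= r <= 1).
    With r = 0 the cross term vanishes, leaving the square root of
    the sum of the squares.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    variance = u1**2 + u2**2 + 2.0 * r * u1 * u2
    return math.sqrt(max(0.0, variance))  # guard against tiny negative round-off

for r in (0.0, 0.5, 1.0):
    print(f"r = {r:.1f}: combined = {combine_two(3.0, 4.0, r):.3f}")
```

Ignoring a positive correlation understates the combined uncertainty, which is why the question deserves hard thought even when the answer turns out to be that it doesn't matter.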

Next, the Type A and Type B uncertainties are separately combined (since they are standard deviations, they should be added by root-sum-square, or RSS, if they are uncorrelated; see note 2). This yields two numbers called ua and ub. Then RSS these two numbers to yield a final result, uc, the combined uncertainty.
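The arithmetic can be sketched in a few lines of Python. The variable names u_a, u_b and u_c stand for ua, ub and uc, and the individual uncertainty values are hypothetical, chosen only to show the steps. All sources are assumed to be uncorrelated.

```python
import math

def rss(values):
    """Root-sum-square: square each value, add the squares, take the square root."""
    return math.sqrt(sum(v * v for v in values))

# Hypothetical standard uncertainties, all in milligrams
type_a = [3.0]              # repeatability of the weighings
type_b = [2.0, 1.0, 2.0]    # weight calibration, buoyancy, balance asymmetry

u_a = rss(type_a)
u_b = rss(type_b)
u_c = rss([u_a, u_b])       # combined standard uncertainty
print(f"u_a = {u_a:.2f} mg, u_b = {u_b:.2f} mg, u_c = {u_c:.2f} mg")
```

Because RSS is just a sum of squares under one square root, combining the two subtotals gives the same result as combining every individual source at once.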

The combined uncertainty is also one standard deviation of the total variation. Often, though, measurement error is stated in terms of an error band or tolerance. Taking a leaf from the concept of the confidence interval, we often state a measurement value as something like

Y = 3.6 volts ± 0.03 volts.

This is an alternative form for stating a value with its uncertainty. It is called U, the expanded uncertainty. U is calculated by multiplying uc by a coverage factor k, and then quoting it as a plus or minus number. The most usual value for k is 2, and this would correspond to a confidence interval of roughly 95%. The GUM is careful to point out that the exact value of confidence for k = 2 should not be stated because many of the statistical assumptions underlying a highly precise confidence level, such as normality, are unlikely to be exactly correct. Just say 95% for k = 2, 99% for k = 3.

When the expanded uncertainty is stated, the coverage factor k must be included. This acts as a little flag telling us that the number was calculated according to the GUM, then expanded. If k is not stated, and there are no words telling us that the value is a correctly calculated uncertainty, we should assume that it is a tolerance calculated some other unknown way.
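Here is a small sketch of that reporting step, assuming a combined standard uncertainty of 0.015 volts (a made-up figure chosen so that k = 2 reproduces the ±0.03 volts quoted earlier). The function names are hypothetical.

```python
def expanded_uncertainty(u_c, k=2):
    """Expanded uncertainty U = k * u_c."""
    if k <= 0:
        raise ValueError("coverage factor k must be positive")
    return k * u_c

def format_result(value, u_c, unit, k=2):
    """Report a value with its expanded uncertainty and the coverage factor."""
    U = expanded_uncertainty(u_c, k)
    return f"{value:g} {unit} ± {U:g} {unit} (k = {k})"

# Hypothetical combined standard uncertainty of 0.015 volts
print(format_result(3.6, 0.015, "volts"))
```

Always printing k alongside the plus or minus number is the point of the sketch: the reader can divide by k to recover uc.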

Well, that's the new uncertainty method in broad brushstrokes. It's a bit awkward because it's so different from the way we have done business in the past, but we'd better get used to it because pretty soon it will be everywhere.


1. Available through www.ncsl-hq.org. You can also download a shortened version from the NIST uncertainty home page, which is found at physics.nist.gov/cuu/Uncertainty/index.html.

2. That is, you square the numerical value of each of the uncertainties, add those squares, then take the square root of the sum as your final answer.

PHILIP STEIN is a metrology and quality consultant in private practice in Pennington, NJ. He holds a master's degree in measurement science from George Washington University, in Washington, and is a Fellow of ASQ. For more information, go to www.measurement.com.
