What’s Meant by ‘Capability’?
Careful sampling, data analyses required to estimate process capability
by Lynne B. Hare
Time and again, you hear the question: "Is this process capable?" "Of what?" I want to ask. It hasn’t learned any new moves since the Macarena, so I’m not sure.
OK, very funny. But we really need to know whether it can produce product within specification. Cpk is 1.27. Is that good enough?
Well, that might be the last question. The first might be: "What do you mean by capability?"
In business parlance, process capability is often referred to as entitlement. That is, if an owner or manager purchases a piece of equipment, he or she is "entitled" to receive its most consistent results. That terminology may suffice in the board room, but it doesn’t really capture the essence of the concept or aid practical application.
Admittedly, the concept of process capability is a bit nebulous, subject to various interpretations. I think of it as the inherent, intrinsic variation of the process. It is the variation that the process exhibits when it runs at its absolute best. There are no exterior disturbances, no haphazard shocks to the process, no tweaks and no adjustments, illicit or otherwise; only pure random variation.
Assessing process capability
Assessing process capability is no easy chore. Some textbooks teach users to wait until the process reaches equilibrium, take roughly 30 samples and calculate their standard deviation. I have some problems with that advice. How might you know when the process reaches a state of equilibrium? How do you know that the recommended samples are representative of the process, much less truly representative of process capability? I believe the measurement of process capability is more complicated than that.
For example, suppose you have a rotary tablet press that produces 30 tablets, one from each of 30 pockets per rotation. Let’s say you’re interested in tablet thickness. You might want to base your estimate of process capability on the standard deviation calculated from 30 consecutive tablets. Better yet, you might assure representation by taking those 30 consecutive tablets repeatedly over, say, eight time periods spaced evenly throughout a production run. See Table 1. You would pool the eight within-group variances and take the square root, yielding a thickness capability estimate based on 8 × (30 − 1) = 232 degrees of freedom.
For greater assurance yet, you might want to include several production runs with perhaps fewer sampling times per production run. The point is that estimates of the process capability made this way would be representative and independent of process mean changes that might take place from one sampling time to the next.
Because the pooled, within-group standard deviation is calculated on observations taken close together in time, there is no opportunity for it to be contaminated by assignable sources of variation. It is as close to pure capability as you’re likely to get.
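The pooling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pooled_std` and the simulated thickness values (a target near 5.00 mm, a within-group sigma of 0.02 mm and a small drift in the mean) are invented for the example and are not data from the column.

```python
import math
import random
import statistics

def pooled_std(groups):
    """Return the pooled within-group standard deviation and its degrees of freedom."""
    ss_within = 0.0
    df = 0
    for g in groups:
        m = statistics.fmean(g)
        ss_within += sum((x - m) ** 2 for x in g)  # squared deviations from the group's own mean
        df += len(g) - 1
    return math.sqrt(ss_within / df), df

# Simulated data (assumption): 8 sampling times x 30 consecutive tablets,
# with the process mean drifting upward by 0.01 mm per sampling time.
rng = random.Random(1)
groups = [[rng.gauss(5.00 + 0.01 * t, 0.02) for _ in range(30)] for t in range(8)]

s_c, df = pooled_std(groups)
s_overall = statistics.stdev([x for g in groups for x in g])
print(f"pooled s = {s_c:.4f} on {df} df; overall s = {s_overall:.4f}")
```

Because each group is centered on its own mean, the drift between sampling times inflates the overall standard deviation but leaves the pooled estimate essentially untouched.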
A purist may want to drill down even further. What is the variation experienced among repeated tablet volumes from each of the 30 pockets? You could measure that by sampling 60 consecutive tablets and pairing tablet one with 31, two with 32 and so on to measure the within-pocket variation. Isn’t the resulting variation also a component of process capability? Of course it is.
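One way to turn the 60-tablet pairing into a number is sketched below. It assumes the tablets are listed in production order and that the process mean does not shift between the two rotations; the function name `within_pocket_std` and the sample values are hypothetical.

```python
import math

def within_pocket_std(tablets, pockets=30):
    """Estimate the within-pocket standard deviation from two consecutive rotations."""
    if len(tablets) != 2 * pockets:
        raise ValueError("expected exactly two full rotations of measurements")
    # Tablet i and tablet i + pockets come from the same pocket.
    diffs = [tablets[i + pockets] - tablets[i] for i in range(pockets)]
    # Each pair contributes a one-degree-of-freedom variance estimate d^2 / 2;
    # averaging over pockets pools them.
    return math.sqrt(sum(d * d for d in diffs) / (2 * pockets))

# Tiny hypothetical example with 3 pockets instead of 30.
print(within_pocket_std([5.01, 4.98, 5.03, 5.02, 4.99, 5.01], pockets=3))
```

If a shift between rotations is suspected, the standard deviation of the differences divided by the square root of two is an alternative, at the cost of one degree of freedom.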
For initial studies, you should probably follow the purist’s advice until you are convinced that the variation among repeated observations within each pocket is very small. After this is established, practicality steps in to recommend that you follow the technique described earlier and let within-pocket variation be assigned to the capability estimate as a random component. A check to guard against blunders caused by avoiding the within-pocket detail can be made through careful examination of control charts.
After you have the right amount of the right kind of data, you can use analysis of variance (ANOVA) to obtain the residual variation. If you have chosen the right ANOVA model, this is the pooled, within-group variance. Its square root is the process capability standard deviation. And it is remarkably useful and powerful, just as it stands, because it is unencumbered by specifications, the process mean and reporting necessities.
You can use the capability standard deviation to help calculate the percentage out of specification, assuming a process on target or centered at some other strategic location. You also can use it as a baseline to learn of opportunities for improvement. More about that later.
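As a rough illustration of the percentage calculation, the sketch below assumes a normally distributed response and uses Python's standard library. The function name `percent_out_of_spec` and the specification limits in the example are made up for the purpose.

```python
from statistics import NormalDist

def percent_out_of_spec(lower, upper, sigma, mean=None):
    """Percentage of a normal process falling outside [lower, upper].

    If no mean is given, the process is assumed to be centered between the limits.
    """
    if mean is None:
        mean = (lower + upper) / 2
    dist = NormalDist(mean, sigma)
    below = dist.cdf(lower)
    above = 1 - dist.cdf(upper)
    return 100 * (below + above)

# Hypothetical limits 4.90 to 5.10 mm with a capability sigma of 0.02 mm.
print(percent_out_of_spec(4.90, 5.10, 0.02))             # process on target
print(percent_out_of_spec(4.90, 5.10, 0.02, mean=5.04))  # process off center
```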
At some point, you might want to calculate a quality index, such as Cpk, for reporting purposes and for report simplification. But some words of caution are in order:
- Be sure your specifications are based on product functionality and are not arbitrarily chosen. If they are arbitrary, some will want to expand them to make the capability index look better.
- Capability indexes only make sense if the process is stable; that is, if the mean does not wander and if the data set does not contain outliers. Use Shewhart charts for the mean and the standard deviation to verify stability.
- The distribution of the underlying data should be normal or nearly so. Use a formal test of normality and a normal probability plot.
- Always, always, always—without exception—plot the data and look at the plot.
- Be sure to have at least 100 observations and preferably 200 or more before you quote quality indexes. Bear in mind that quality indexes are statistics; they are estimates of the true quality index, and they are highly variable. Therefore, they are not reliable for small sample sizes. Some software will automatically generate a confidence interval about Cpk. If yours does not, you can calculate approximate confidence limits using

$$\hat{C}_{pk} \pm z_{\alpha/2}\sqrt{\frac{1}{9n} + \frac{\hat{C}_{pk}^{2}}{2(n-1)}}$$

in which $z_{\alpha/2}$ is the standard normal deviate that leaves $100(\alpha/2)\%$ of the area in each tail of the normal distribution (that is, the $100(1-\alpha/2)$th percentile), $n$ is the total number of observations, and

$$\hat{C}_{pk} = \min\left\{\frac{U - \bar{x}}{3 s_c},\ \frac{\bar{x} - L}{3 s_c}\right\}$$

in which $U$ and $L$ are upper and lower specification limits, respectively, $\bar{x}$ is the grand mean and $s_c$ is the capability standard deviation estimate.1
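To see how strongly the sample size affects the interval, here is a minimal sketch. It assumes the widely quoted normal-approximation limits attributed to Bissell, Cpk ± z·sqrt(1/(9n) + Cpk²/(2(n − 1))). The function names are hypothetical, the Cpk of 1.27 is the value from the opening question, and the sample sizes of 30 and 200 are chosen only for illustration.

```python
import math
from statistics import NormalDist

def cpk(mean, s_c, lower, upper):
    """Cpk from the grand mean, capability standard deviation and specification limits."""
    return min(upper - mean, mean - lower) / (3 * s_c)

def cpk_confidence_limits(cpk_hat, n, alpha=0.05):
    """Approximate two-sided 100(1 - alpha)% confidence limits for Cpk (assumed Bissell form)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(1 / (9 * n) + cpk_hat ** 2 / (2 * (n - 1)))
    return cpk_hat - half_width, cpk_hat + half_width

for n in (30, 200):
    lo, hi = cpk_confidence_limits(1.27, n)
    print(f"n = {n:3d}: Cpk = 1.27, 95% limits ({lo:.2f}, {hi:.2f})")
```

With the smaller sample, the lower limit falls well below the point estimate, which is the reason for the advice not to quote an index from few observations.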
The notion of reporting a confidence interval about a quality index has come to the forefront recently because of the dubious practice of quoting these indexes even when they are based on small sample sizes. Some government officials are recommending that the lower limit of the confidence interval be reported in place of the estimate itself.2 This is a safeguard against skimping on sampling and against errors of both kinds: accepting processes actually in need of attention and condemning those that are capable of performing well. Let those in highly regulated industries be warned.
The measurement of process performance is a bit more complicated. Suppose you have the same tablet press as described earlier. To measure the thickness variation experienced by the consumer, you would want to ensure representation of usual production. Sampling within one batch alone is not adequate. Instead, you should have upwards of 20 or 30 batches in the sample. In that case, you would carry out a variance components analysis, combining the within-batch and between-batch variance components to form the estimate of the performance variance.
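A variance components calculation of this kind can be sketched as follows. The sketch assumes the simplest case, a balanced one-way random-effects layout with the same number of tablets sampled from every batch, and estimates the components by the method of moments. A real study may need a richer model, and the function name and toy data are hypothetical.

```python
import statistics

def variance_components(batches):
    """Return (within-batch, between-batch, performance) variance estimates.

    Assumes a balanced one-way random-effects layout.
    """
    k = len(batches)
    n = len(batches[0])
    if any(len(b) != n for b in batches):
        raise ValueError("this sketch requires the same sample size in every batch")
    means = [statistics.fmean(b) for b in batches]
    grand = statistics.fmean(means)
    ms_within = sum(statistics.variance(b) for b in batches) / k
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    # Method-of-moments estimate, truncated at zero if it comes out negative.
    var_between = max(0.0, (ms_between - ms_within) / n)
    return ms_within, var_between, ms_within + var_between

# Toy data: two batches of two tablets (far fewer than the 20 to 30 batches recommended).
within, between, performance = variance_components([[1, 3], [5, 7]])
print(within, between, performance)
```

The square root of the last value is the performance standard deviation, which can be set beside the capability standard deviation to see how far apart the two are.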
Naturally, the variation due to process capability alone should be much smaller than the specification range. For example, if the response of interest is normally distributed, its process capability standard deviation should be smaller than one-sixth of the specification range. Otherwise, the process may not be capable of producing the specified product.
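The one-sixth rule follows from the familiar three-sigma reasoning. For a normally distributed response centered between the limits, with $\sigma_c$ denoting the capability standard deviation (notation assumed here for illustration), the borderline case is

$$\sigma_c = \frac{U - L}{6} \quad\Longrightarrow\quad P(\text{out of specification}) = 2\,[1 - \Phi(3)] \approx 0.0027,$$

or about 0.27% of production outside the limits even when the process runs at its best and exactly on target. Any larger $\sigma_c$, or any shift of the mean, makes that fraction grow.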
Performance variation, on the other hand, is a measure of the variation experienced by the consumer or end user. It includes capability variation along with all the other sources of variation such as environmental changes, variation induced by various raw material vendors and batches, and unauthorized process adjustments. Ideally, you want performance variation to be as close to capability variation as possible.
But the difference between performance and capability should be a major subject of attention: If the difference is large, it is likely that the process is hemorrhaging money. This may be true even when all production is within specification because departure from process capability may result in production line inefficiencies. And, of course, any efforts made toward bringing performance closer to capability should be undertaken with financial justification in mind.
So is this process capable? It takes some careful, unbiased sampling and data analysis to find out.
1. A.F. Bissell, "How Reliable is Your Capability Index?" Applied Statistics, Vol. 39, 1990, pp. 331-340.
2. Sidney S. Lewis, "Process Capability Estimates From Small Samples," Quality Engineering, Vol. 3, No. 3, 1991.
Hare, Lynne B., "Chicken Soup for Processes," Quality Progress, August 2001, pp. 76-79.
Hare, Lynne B., "The Ubiquitous Cpk," Quality Progress, January 2007, pp. 72-73.
The author wishes to thank Mark Vandeven for his review of this column.
Lynne B. Hare is a statistical consultant. He holds a doctorate in statistics from Rutgers University in New Brunswick, NJ. He is past chair of the ASQ Statistics Division and a fellow of both ASQ and the American Statistical Association.