Improving an Unstable Process

by J. Stuart Hunter

Take a look at Figure 1 and the data in Table 1. Each shows 50 successive batch deviations from a target where the target equals 100%, a theoretical maximum. The deviations clearly trend and suggest an unstable batch process.

The process is expensive to run, and the product is in great demand with every percentage yield worth a small fortune. Management insists there is neither time nor money for new equipment or for any action that risks losing product. Management has also declared the current low quality performance will not be tolerated and something must be done.

What would you do if you were the leader of the team charged with this problem? You'd likely begin by determining the usual statistics listed in Table 1. But for what purpose?

These statistics are appropriate for situations in which the process mean is fixed and the errors (noise) are characterized by random events from a normal distribution with a zero mean and a constant standard deviation. You might try a Shewhart run chart on the individual observations but change your mind when you realize the two available estimates of the standard deviation differ by a factor of two.

Frankly, those initial statistical efforts miss the point. Figure 1 suggests the mean is moving and the observations are autocorrelated. To check this assumption, you find the estimated first lag autocorrelation coefficient is rho(1) = 0.74, and the second, third and fourth lagged coefficients are 0.60, 0.53 and 0.45, respectively. These data are seriously autocorrelated and confirm your suspicions of a moving mean.
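The lagged coefficients quoted above can be estimated with a few lines of code. The following is a minimal sketch that assumes the usual sample autocorrelation estimator (lagged cross-products of deviations from the overall mean, divided by the total sum of squares); the function name `autocorr` and the short series are illustrative assumptions and are not the Table 1 data.

```python
from statistics import fmean


def autocorr(x, lag):
    """Sample autocorrelation of the sequence x at the given lag (lag >= 1)."""
    n = len(x)
    if not 1 <= lag < n:
        raise ValueError("lag must be between 1 and len(x) - 1")
    mean = fmean(x)
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Sum of products of deviations that are `lag` batches apart
    num = sum(dev[t] * dev[t + lag] for t in range(n - lag))
    return num / denom


# Hypothetical batch deviations from target, for illustration only
series = [-3.0, -2.5, -2.8, -1.9, -1.2, -1.5, -0.8, -1.1, -2.0, -2.6]
for k in (1, 2, 3, 4):
    print(f"r({k}) = {autocorr(series, k):+.2f}")
```

Coefficients that stay well above zero over several lags, as they do for the batch data, are the numerical signature of a wandering mean.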

How should you proceed? Although you are not allowed to make large process changes, you might be able to make small ones. You call the standard way method A and the slightly different way method B, where B is a tweak--a small change in some process operation. You believe the consequence of the tweak will be small but beneficial. The difficulty is that any signal of a small change, good or bad, is likely to be lost in the motion and noise associated with the process.

One strategy would be to tweak the process to method B and never return to method A. After all, the tweak seems to be a good idea. Unfortunately, given the present process variability, after several such tweaks, your confusion will be as great as ever. You have merely added to the noise.

Another strategy would be to change to method B for several batches, return to method A for a similar number of batches and then compare the two averages using a statistical test of significance. But now sample size is an issue.

To detect with any confidence a shift in mean of one standard deviation requires at least 16 observations from B and 16 from A. This assumes the process will be stable over the testing period and the attendant errors will be independent. Given the process in Figure 1, any program requiring 16 runs of each method will span unsettled periods that clearly violate these assumptions.
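The article does not state the error rates behind the figure of 16. One common set of assumptions that reproduces it is a two-sided 5% test with 80% power, using the normal-approximation formula n = 2(z_alpha/2 + z_beta)^2 / delta^2 for each group, where delta is the shift in standard deviation units. The sketch below rests on those assumptions; the function name `runs_per_method` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def runs_per_method(shift_in_sd, alpha=0.05, power=0.80):
    """Approximate runs needed per method to detect a mean shift.

    Assumes independent, normally distributed errors with a known, common
    standard deviation and a two-sided test at level alpha.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / shift_in_sd ** 2)


print(runs_per_method(1.0))  # shift of one standard deviation
print(runs_per_method(0.5))  # a smaller tweak needs far more runs
```

Because the required number of runs grows with the inverse square of the shift, halving the size of the tweak roughly quadruples the runs needed, which makes the stability assumption even harder to meet.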

Evolutionary Operation

The strategy most likely to succeed is called evolutionary operation (EVOP), a proactive procedure proposed by George Box in 1957 that forces a process to produce useful information. The idea is to employ small and repeated changes until the best alternative is detected. This is then followed by another new sequence of small repeated changes. The process is thus evolved to better operating conditions.

The unusual thing about using the EVOP program here is that the experimental design is nothing more than a repeated square wave, the simple alternation of the two methods: method A on one batch and method B on the next. (Note: EVOP typically employs two or three factors simultaneously using simple factorial designs. Here EVOP is discussed in its simplest form.)

A one-variable-at-a-time program is easy to run and not a bad idea given the conditions of the problem. Each comparison of A vs. B uses a block of two sequential runs. Ideally, for each pair of batches, a coin would be flipped to determine which method to use first.
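The coin flip for each block is easy to prepare in advance. This is a minimal sketch of such a run plan; the function name `evop_plan` and the `seed` argument (used only to make a plan reproducible) are illustrative assumptions.

```python
import random


def evop_plan(n_blocks, seed=None):
    """Return a list of (block, first method, second method) tuples.

    Each block holds two sequential batches, one per method, with the
    order inside the block decided at random (the coin flip).
    """
    rng = random.Random(seed)
    plan = []
    for block in range(1, n_blocks + 1):
        order = ["A", "B"]
        rng.shuffle(order)
        plan.append((block, order[0], order[1]))
    return plan


for block, first, second in evop_plan(6, seed=1):
    print(f"block {block}: run method {first}, then method {second}")
```

Randomizing the order within each block guards against a steady drift being mistaken for a difference between the two methods.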

The difference, d, between the observations in A and B carries the information of interest. These separated differences will be much less influenced by a moving process and less autocorrelated than the original observations. Student's t test is used to test the hypothesis that the true difference, d, between the means of method A and method B equals zero. The t test is easily computed after each recorded difference as:

$$ t = \frac{\bar{d}}{s_d / \sqrt{n}} $$

where n is the number of differences recorded so far, \bar{d} is their average and s_d is their sample standard deviation. The statistic has n - 1 degrees of freedom.

Once a computed value of t exceeds that of the associated critical t*, it is reasonable to assume process B is distinguishable from process A. In practice, it is a good idea to wait until four or five differences have been recorded before making any decisions. The t test is commonly one-sided, the only situation of interest being when B > A and t is positive. Table 2 shows the successive critical values, t*, for which Prob(t > t*) = 0.10 and 0.05. Additional values are available in any standard t table.
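The running test can be sketched as follows, taking t as the mean of the differences recorded so far divided by its standard error (standard deviation over the square root of n), with n - 1 degrees of freedom. The critical values are one-sided entries from a standard t table, not a copy of Table 2; the function name `sequential_t`, the `min_n` waiting rule and the example differences are illustrative assumptions.

```python
from math import sqrt
from statistics import fmean, stdev

# One-sided critical values from a standard t table, for 1 to 12 degrees of freedom
T_CRIT = {
    0.10: [3.078, 1.886, 1.638, 1.533, 1.476, 1.440,
           1.415, 1.397, 1.383, 1.372, 1.363, 1.356],
    0.05: [6.314, 2.920, 2.353, 2.132, 2.015, 1.943,
           1.895, 1.860, 1.833, 1.812, 1.796, 1.782],
}


def sequential_t(differences, alpha=0.10, min_n=4):
    """Return (n, t, critical t, signal) after each difference from the second on."""
    table = T_CRIT[alpha]
    results = []
    for n in range(2, len(differences) + 1):
        df = n - 1
        if df > len(table):
            break  # beyond the small table above
        seen = differences[:n]
        s_d = stdev(seen)
        if s_d == 0:
            continue  # t is undefined when all differences are identical
        t = fmean(seen) / (s_d / sqrt(n))
        crit = table[df - 1]
        # Signal only once enough differences exist and t exceeds t*
        results.append((n, t, crit, n >= min_n and t > crit))
    return results


# Hypothetical differences d = B - A, for illustration only
diffs = [0.4, 0.9, -0.2, 0.7, 0.5, 0.3, 0.8]
for n, t, crit, signal in sequential_t(diffs):
    print(f"n={n:2d}  t={t:5.2f}  t*={crit:5.3f}  signal={signal}")
```

The 10% entry for six degrees of freedom, 1.440, corresponds to the critical value that applies at the seventh difference.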

Attaining Statistical Control

Consider these two examples:

  1. Looking at the data in Table 1, you assign method A to the odd-numbered runs and method B to the even-numbered runs. The first several differences, d = B - A, are computed from the separate blocks of two runs. Their associated t values are listed in Table 3. No computed t exceeds its corresponding critical value and thus no signal of improvement occurs. You are hardly surprised because the assignment of A to odd-numbered runs and B to even-numbered runs was arbitrary and the process was not changed.
  2. You change the data so method B always gives an increase in yield of d = 0.5, an increase of less than a standard deviation. The first several differences computed from these new data are given in Table 4 along with their computed t values.

In Table 4, evidence emerges that the hypothesis, d = 0, can be rejected and method B is significantly better than method A. At the seventh difference, the computed t = 1.53 exceeds the 10% critical t0.10 = 1.44. This signal is reinforced later by the eleventh difference where the computed t = 1.97 exceeds the 5% critical t0.05 = 1.82. The twelfth difference again reinforces the signal that B > A.

It is now apparent method B should be made the new standard of operation. The next step in EVOP is either to move on and investigate the consequence of a new one-factor small change or to introduce a more classical EVOP program involving a 2^2 factorial design in which two factors are simultaneously tweaked.

When an EVOP program is proposed, you may hear the argument that the process is too unstable or hectic and should be brought under statistical control before any experimental procedure is tried. "Statistical control" implies the observations have, reasonably, a constant mean and standard deviation and associated independent errors.

It will be a long time before the batch observations illustrated in Figure 1 acquire these conditions. The problem you and your team face is to improve the process now. Fortunately, the differences obtained from successive blocks of two batches are much closer to the ideals of statistical control than the original observations. The EVOP program described here employs these easily obtained differences.
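A rough calculation shows why the differences behave better. If the series is treated as stationary (an assumption, since the article argues the mean is moving) and the order of A and B is the same in every block, the lag-1 autocorrelation of successive block differences is (2*rho2 - rho1 - rho3) / (2*(1 - rho1)), and the variance of a difference is 2*(1 - rho1) times the variance of a single observation. The sketch below plugs in the estimated coefficients 0.74, 0.60 and 0.53 reported earlier; the function name is hypothetical.

```python
def paired_difference_stats(rho1, rho2, rho3):
    """Implied properties of differences from non-overlapping blocks of two runs.

    Assumes a stationary series with lag-1, lag-2 and lag-3 autocorrelations
    rho1, rho2, rho3 and no real difference between the two methods.
    Returns (variance of a difference relative to one observation,
             lag-1 autocorrelation between successive differences).
    """
    var_ratio = 2 * (1 - rho1)
    lag1 = (2 * rho2 - rho1 - rho3) / var_ratio
    return var_ratio, lag1


var_ratio, lag1 = paired_difference_stats(0.74, 0.60, 0.53)
print(f"variance ratio: {var_ratio:.2f}")
print(f"lag-1 autocorrelation of differences: {lag1:+.2f}")
```

Under these assumptions the successive differences carry a lag-1 autocorrelation of about -0.13, against 0.74 for the raw observations, and their variance is about half that of a single batch result.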

Box's EVOP works on the principle that a simple repeated change will divulge its consequence to an alert observer. By employing repeated small changes, EVOP compels the process to produce information about itself, not just the product and monitoring data. It is an essential statistical accompaniment to W. Edwards Deming's philosophy of never-ending improvement.


  1. Box, George E.P., "Evolutionary Operation: A Method for Increasing Industrial Productivity," Applied Statistics, 1957, pp. 81-101.
  2. Box, George E.P., and Norman Draper, Evolutionary Operation: A Statistical Method for Process Improvement, John Wiley, 1969.
  3. Juran, J.M., and A. Blanton Godfrey, Juran's Quality Handbook, fifth edition, McGraw-Hill, 1999.

J. STUART HUNTER is a professor emeritus in the School of Engineering and Applied Science at Princeton University. He earned a doctorate in experimental statistics from North Carolina State University and is an Honorary Member of ASQ.
