3.4 PER MILLION
It’s a Marathon, Not a Sprint
Building a successful statistical model takes place in stages
by Joseph D. Conklin
Six Sigma practitioners like their successes swift, large and final. Nature and circumstance, however, are rarely that kind. Normally, success is secured one step at a time. Given enough persistence and flexibility, putting these steps together can create a sizable and significant victory in the war against defects.
This point applies to the use of statistical models, popular items in the Six Sigma toolkit. Statistical models are equations derived through statistical techniques that relate the outputs of a process to its important inputs. Constructed properly, they are frequently useful for diagnosis and improvement. In any given application, however, the first statistical model is rarely the last or best one.
Consider the case of a metal deposition process in which an important alloy is layered on a metal substrate in an acid bath. For simplicity, assume a single layer (note 1). The key output is the thickness of the deposit. A minimum thickness is required to ensure the proper performance of the component the substrate is eventually inserted into.
A specially chartered team of engineers and technicians is studying the process, with the goals of reducing variation and fine-tuning the key input levels for best effect. The initial strategy includes using linear regression (note 2) to build a statistical model from historical process data (note 3). In the first attempt, the team relates the thickness of the deposit to three inputs: pH, catalyst concentration and tank pressure. The data for the first-stage model are the 50 observations in Table 1 (note 4).
Running the data through the regression program of the MS Excel Analysis ToolPak yields the results in Table 2 (note 5). The values in the coefficient column (except for the intercept) are multiplied by the inputs to predict the output. In equation form, our first-stage statistical model is thickness = 70.9653 + (0.2350*pH) + (0.2959*catalyst) – (0.0875*pressure). We are looking for coefficients that are statistically significant.
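As a minimal sketch, the first-stage equation can be evaluated directly. The coefficients are copied from the equation above; the function name and the input levels in the usage lines are hypothetical, chosen only to show the arithmetic, because the data in Table 1 are coded.

```python
# First-stage coefficients, copied from the equation in the text.
STAGE1 = {"intercept": 70.9653, "pH": 0.2350, "catalyst": 0.2959, "pressure": -0.0875}


def predict_thickness(coefs, inputs):
    """Return the intercept plus the sum of coefficient * input value."""
    total = coefs["intercept"]
    for name, value in inputs.items():
        total += coefs[name] * value
    return total


# Hypothetical coded input levels (NOT taken from Table 1).
example = {"pH": 10.0, "catalyst": 20.0, "pressure": 40.0}
print(round(predict_thickness(STAGE1, example), 4))
```

Keeping the coefficients in a dictionary means the same function could evaluate a later model that has more inputs.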
A coefficient is statistically significant if it’s reasonable to conclude its true value is something other than zero. A coefficient with value zero renders the associated input variable meaningless for affecting the output.
The column in Table 2 labeled standard error measures the uncertainty in the estimate of the true value of the corresponding coefficient. We prefer low values for the standard error: the lower the value, the more certain we are (note 6).
The column labeled t statistic is the result of dividing each coefficient by its standard error. This puts the measurement of the uncertainty of all the coefficients on the same scale, so comparisons between coefficients become easy to make. Under the standard assumptions of the linear regression technique, the t statistic results follow a well-known statistical distribution of the same name (note 7).
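The division described here can be sketched in a few lines. The numbers below are generic placeholders, not values from Table 2; they only illustrate how the ratio puts a large coefficient and a small one on a common scale.

```python
def t_statistic(coefficient, standard_error):
    """Ratio of a coefficient estimate to its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return coefficient / standard_error


# Placeholder values (NOT from Table 2): a large coefficient estimated
# loosely and a small coefficient estimated tightly.
t_large_coef = t_statistic(50.0, 25.0)
t_small_coef = t_statistic(0.02, 0.005)
print(t_large_coef, t_small_coef)
```

In this made-up case the smaller coefficient has the larger t statistic, which is the kind of comparison the raw coefficient values alone would hide.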
The column labeled p value helps us conclude whether a coefficient is statistically significant. The possible results for p value range from zero to one, and a statistically significant coefficient has a p value close to zero. There is no universal rule to say how close to zero the p value has to be before concluding a coefficient is significant.
The greater the consequences of a wrong decision, in general, the closer to zero the p value must be. Many possible cut-off values exist. In practice, the cut-off value is almost always less than or equal to 0.10. We will use 0.10 as the cut-off value for this example. If the p value is less than 0.10 for a given row, we will conclude the coefficient is statistically significant (note 8). By this standard, the statistically significant inputs from stage one are catalyst and pressure.
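A rough sketch of the path from a t statistic to the 0.10 decision follows. It assumes an ordinary least squares fit with an intercept, so that 50 observations and three inputs leave 46 residual degrees of freedom, and it obtains the two-sided p value by numerically integrating the t density rather than by whatever routine the spreadsheet uses. The t statistics in the loop are hypothetical, not the Table 2 values.

```python
import math


def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    # Log-gamma keeps the normalizing constant stable for large df.
    log_norm = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """Probability of |T| >= |t_stat| under the t distribution.

    The area between 0 and |t_stat| is found with Simpson's rule
    (steps must be even), then doubled and subtracted from one.
    """
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    half_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * half_area)


def is_significant(p_value, cutoff=0.10):
    """Decision rule from the text: significant if the p value is below the cutoff."""
    return p_value < cutoff


# 50 observations minus 4 estimated terms (intercept + 3 inputs).
RESIDUAL_DF = 50 - 4

# Hypothetical t statistics, NOT taken from Table 2.
for t_stat in (0.5, 1.7, 3.0):
    p = two_sided_p_value(t_stat, RESIDUAL_DF)
    print(t_stat, round(p, 4), is_significant(p))
```

The cutoff is a parameter so that a stricter value, chosen in advance, can be passed in without changing the rest of the calculation.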
Watch for signs
The signs of the significant inputs are informative. They tell the team, other things being equal, that an increase in catalyst is associated with an increase in thickness. A decrease in pressure is also associated with an increase in thickness (note 9).
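The reading of the signs can be checked with a short calculation. This sketch uses the two significant first-stage coefficients from the equation given earlier and assumes every other input is held fixed; the one-unit changes are in coded units and are purely illustrative.

```python
# First-stage coefficients for the two significant inputs, from the text.
SLOPES = {"catalyst": 0.2959, "pressure": -0.0875}


def predicted_change(input_name, delta):
    """Predicted change in thickness when one input moves by delta
    and every other input is held fixed."""
    return SLOPES[input_name] * delta


print(predicted_change("catalyst", +1.0))  # raise catalyst by one coded unit
print(predicted_change("pressure", -1.0))  # lower pressure by one coded unit
```

Both results are positive: a positive coefficient times an increase and a negative coefficient times a decrease each raise the predicted thickness.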
The positive coefficient for catalyst makes sense to the team. Up to the limit that the bath can absorb, more catalyst speeds up the reaction in the tank, and that leads to a faster deposition rate. The team members are puzzled to see no significant coefficient for pH. The theory of the process says pH should have a positive effect on thickness. The significant coefficient on pressure was expected, but the sign is in the wrong direction. The team expects higher pressure to be associated with higher thickness, not lower.
Because the first-round model leaves some questions unanswered, the team proceeds to a second round. The team decides to expand the number of variables by two: tank temperature and tank voltage. Theory and experience suggest these ought to be important. A new data set of 50 observations is collected. The values appear in Table 3. A second run through the MS Excel regression program produces the coefficients shown in Table 4.
In equation form, the second-stage statistical model is thickness = 4.1963 + (0.0814*pH) + (0.1636*catalyst) – (0.0438*pressure) – (0.4042*temperature) + (0.4245*voltage). All the input variables except pH have significant coefficients.
The lack of statistical significance for pH bothers the team. Voltage is statistically significant and positive—the results conform with theory and practice. It is no surprise to see pressure still significant and temperature enter the model as significant, but the individual signs seem wrong. Increased pressure or temperature should be associated with increased thickness.
The model confirms some expectations but raises new questions or leaves some old questions unanswered. This is not unusual when building statistical models.
The team debates what to do in the next stage. Some members propose that pressure and temperature do not act as independent variables.
They tend to move in the same direction: Higher pressure is associated with higher temperature, and lower pressure is associated with lower temperature. In other words, temperature and pressure are correlated.
Including correlated inputs in a statistical model can lead to confusing coefficients. The team wonders whether the joint effect of temperature and pressure explains changes in thickness.
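One way to check whether two inputs move together is to compute their sample correlation coefficient. The sketch below is written with the standard library only; the readings are made up for illustration and are not the coded values from Table 3.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up readings (NOT from Table 3) in which the two inputs rise together.
pressure = [90.0, 95.0, 100.0, 105.0, 110.0, 115.0]
temperature = [80.0, 84.0, 83.0, 88.0, 91.0, 90.0]
print(round(pearson_correlation(pressure, temperature), 3))
```

A value near +1 or -1 suggests the two inputs carry overlapping information, which is the situation the team suspects here.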
An easy way to check this is to expand the model with a new term that is the product of pressure and temperature. The stage-two data are augmented with this cross product. The values for the cross product are shown in Table 5. Fitting a third-stage model with the augmented data set produces the coefficients shown in Table 6.
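The mechanics of adding a cross product and refitting can be sketched as follows. To stay short, this illustration keeps only two inputs plus their product and fits by ordinary least squares through the normal equations. The data are made up and noise free (not the Table 5 values), so the fit should simply recover the coefficients used to generate them; all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or nearly so")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(pressure, temperature):
    """Intercept, the two inputs, and their cross product."""
    return [1.0, pressure, temperature, pressure * temperature]


# Made-up, noise-free data built from known coefficients (NOT Table 5).
true_coefs = [5.0, -0.8, -1.5, 0.25]
settings = [(p, t) for p in (1.0, 2.0, 3.0, 4.0) for t in (1.0, 2.0, 3.0)]
rows = [design_row(p, t) for p, t in settings]
y = [sum(c * v for c, v in zip(true_coefs, row)) for row in rows]

estimates = fit_least_squares(rows, y)
print([round(c, 6) for c in estimates])
```

The only change from a model without interaction is the extra column built in `design_row`; the fitting step itself is unchanged.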
In equation form, the third-stage statistical model is thickness = 107.3334 + (0.0854*pH) + (0.1538*catalyst) – (0.8878*pressure) – (1.6257*temperature) + (0.4275*voltage) + (0.0100*pressure*temperature). In the third stage, all the inputs that were significant in the second stage are still significant. The coefficient for the product of pressure and temperature is positive and significant. When the joint effect of pressure and temperature is accounted for in the model, the impact on thickness is what theory and experience predict.
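With a cross product in the model, the effect of pressure is no longer a single number: it depends on the temperature level, and vice versa. The sketch below works this out from the published third-stage coefficients. Because the cross-product coefficient is rounded to four decimals, the crossover levels it prints are approximate, and the text does not say where the coded operating range sits relative to them.

```python
# Third-stage coefficients, copied from the equation in the text.
B_PRESSURE = -0.8878
B_TEMPERATURE = -1.6257
B_CROSS = 0.0100


def pressure_effect(temperature):
    """Predicted change in thickness per unit of pressure at a given temperature."""
    return B_PRESSURE + B_CROSS * temperature


def temperature_effect(pressure):
    """Predicted change in thickness per unit of temperature at a given pressure."""
    return B_TEMPERATURE + B_CROSS * pressure


# Levels at which each effect changes sign, from the rounded coefficients.
print(round(-B_PRESSURE / B_CROSS, 2))     # temperature above which more pressure adds thickness
print(round(-B_TEMPERATURE / B_CROSS, 2))  # pressure above which more temperature adds thickness
```

Above those crossover levels, the positive cross-product term outweighs the negative individual coefficients, which is one way the third-stage model can agree with the team's expectation of higher thickness at higher pressure and temperature.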
The lack of significance for pH remains a mystery. Perhaps this variable is not significant because of some special but as yet unobserved feature of the production environment. Maybe some variable missing from the model overrides or counteracts the effect of pH. Perhaps the effect of pH is through interaction with another variable. Maybe pH has a real effect on thickness, but one too small relative to the measurement error for thickness to register as statistically significant in the third-stage model.
At the end of the third stage, the team has some unanswered questions but also some clues about how to proceed. A designed experiment involving a more complex model capable of examining a wider array of possible interactions between inputs looks appealing (note 10).
Depending on the importance attached to clearing up the mystery of the nonsignificance of pH, a more precise system for measuring thickness may be needed.
The team should put forth its most reasoned proposals for stage four, pursue the needed resources and, as with most other teams in the model building business, expect a little more of the truth to reveal itself on the gradual path to ultimate success.
The tale of overnight, complete victory will occur on TV or in the movies before it happens on the production floor.
References and Notes
1. In practice, a process will have more than one important output. Improving and optimizing processes with multiple outputs is beyond the scope of this article. For details on methods for optimizing multiple outputs, see Raymond H. Myers and Douglas C. Montgomery’s Response Surface Methodology: Process and Product Optimization Using Designed Experiments, second edition, John Wiley, 2002.
2. Linear regression and its variants are a common and powerful technique for building statistical models. For more details, see Norman R. Draper and Harry Smith’s Applied Regression Analysis, third edition, John Wiley, 1998.
3. If the process is in the early design or pilot stages, the source of the historical data may be a designed experiment. For more details, see Douglas C. Montgomery’s Design and Analysis of Experiments, sixth edition, John Wiley, 2005. If the process has been running for some time, the historical data may reside in the plant’s quality information system. In this case, the success of the model building exercise depends on how well quality practices (effective operator training, regular equipment maintenance, clearly written procedures) are carried out.
4. The data in Tables 1, 3 and 5 are coded.
5. The table is reformatted from the original layout for clarity of presentation.
6. For a precise definition and formula for standard errors of regression coefficients, see Draper and Smith’s Applied Regression Analysis (see note 2).
7. For more details on the t distribution, see Rudolph J. Freund and William J. Wilson’s Statistical Analysis, second edition, Academic Press, 2002.
8. The value of the cutoff should be decided in advance of estimating the statistical model.
9. The coefficient of an input variable in a statistical model measures the change to be expected in the output variable per unit change in the input, leaving all other input variables unchanged.
10. For examples, see Myers and Montgomery’s Response Surface Methodology: Process and Product Optimization Using Designed Experiments (see note 1).
Joseph D. Conklin is a mathematical statistician at the U.S. Department of Energy in Washington, D.C. He earned a master’s degree in statistics from Virginia Tech and is a senior member of ASQ. Conklin is an ASQ certified quality manager, quality engineer, quality auditor and reliability engineer.