The purpose of experimental design technology is to build knowledge
by Lynne B. Hare
Experimental design may not be what you think it is. Design is an amazing business tool. It is not a tool solely for finding settings to produce an optimal response. It was never invented to aid in the assessment of p-values associated with factors under study. Designing an experiment also is not the same as looking up standard designs in a catalog or downloading designs from software.
Questions such as "Which design should I use?" miss the point. Designs selected purposefully to enable experimenters to play the winner—forsaking all knowledge of modeling and inference—do more damage than good.
From the outset, experimental design, developed and formalized by Ronald A. Fisher,1 has been a technique—some might say strategy—for research and exploration. No one should expect that applying a specific design will answer all research questions. Instead, it is the sequential application of designs, each possibly increasing in sophistication, that provides a breakthrough. The breakthrough comes more from the resultant building of knowledge than from efforts devoted to finding the sweet spot or optimal response.
Starting in the late 1950s, Horace P. Andrews used his diagram, shown in Figure 1, to drive home the point of knowledge building.2 Design strategy begins with some basic knowledge and combines it with new ideas leading to conjecture and a tailor-made design. Next, the experiment is run, and data are analyzed and interpreted to elicit further hypotheses. Then, these steps are repeated.
This iterative shifting between induction and deduction builds knowledge, leading to the objective.3
Naturally, during the early practice of tailor-making designs, design catalogs were formed. Catalogs saved practitioners the trouble of generating designs all over again. One early, useful and comprehensive catalog is interspersed in the pages of Experimental Designs.4 Others followed, and without a doubt, design developers extended their designs from present practicalities to related potential situations, such as those involving more treatment combinations, more factor levels and additional constraints.
In more recent decades, statistical software made cataloged designs available for download, and computer-generated (algorithmic) designs, introduced by Robert W. Kennard and L.A. Stone5 as early as 1969, are now common offerings. This is good news all around, but there are pitfalls.
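The max-min distance idea behind Kennard and Stone's computer-aided approach can be sketched in a few lines of Python. This is only an illustration of the principle, not any particular software's implementation; the grid of candidate settings is hypothetical:

```python
import numpy as np

def kennard_stone(candidates, n_select):
    """Select n_select rows from a candidate matrix by the max-min
    distance rule: repeatedly add the candidate point farthest from
    the points already chosen."""
    X = np.asarray(candidates, dtype=float)
    # Pairwise squared Euclidean distances between all candidates.
    d = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    # Start with the two candidates farthest apart.
    i, j = np.unravel_index(np.argmax(d), d.shape)
    chosen = [i, j]
    while len(chosen) < n_select:
        # Distance from every candidate to its nearest chosen point...
        nearest = d[:, chosen].min(axis=1)
        nearest[chosen] = -1.0  # never re-pick a chosen point
        # ...then pick the candidate farthest from the current set.
        chosen.append(int(np.argmax(nearest)))
    return [int(c) for c in sorted(chosen)]

# A 5x5 grid of candidate settings for two factors (coded 0..4).
grid = np.array([(a, b) for a in range(5) for b in range(5)])
print(kennard_stone(grid, 4))  # → [0, 4, 20, 24], the four corners
```

With four runs allowed, the rule spreads the design to the corners of the candidate region, which is exactly the space-filling behavior that makes such algorithms attractive when no catalog design fits the constraints.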
Some experimenters seeking quick-and-easy solutions will select catalog designs or download algorithmic designs without enough thought to their features, advantages and disadvantages. Those who are uncertain should seek statistical assistance.
All of this raises the question of what constitutes a good design. Andrews assembled an eight-point list of design criteria6 to aid the sleepiest students, myself included, in improved design evaluation. Good experimental designs:
- Provide unbiased estimates of process variable and treatment effects.
- Provide the precision necessary to enable the experimenter to detect important differences.
- Include the determination of the plan for the analysis and reporting of the results.
- Generate results that are free from ambiguous interpretation.
- Generate data that permit the experimenter to estimate the effects of treatments and their uncertainty.
- Permit conclusions that have a wide range of applicability.
- Point the experimenter in the direction of improvement.
- Are as simple as possible while satisfying the objectives of the experiment.
In practice, the answer to what constitutes a good design is contextual. Following the guidance of Figure 1 leads us to understand that in the early stages of experimentation, much work lies ahead, and we would be wise to conserve resources by limiting initial investigations to "screening" studies to isolate the most important effects.7
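One common screening tool is the two-level fractional factorial, which studies many factors in few runs by deliberately aliasing some effects. As a sketch (the factor labels A through D are generic), a half fraction covers four factors in eight runs by setting the fourth factor equal to the three-way interaction of the other three:

```python
from itertools import product

# Half fraction 2^(4-1): run a full 2^3 design in factors A, B, C,
# then set the fourth factor D = A*B*C (defining relation I = ABCD).
runs = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]

for run in runs:
    print(run)
```

Eight runs instead of 16 is the resource saving; the price is that D's main effect is confounded with the ABC interaction, which is usually an acceptable trade at the screening stage.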
Still, experimenters must be prepared to answer the boss’ question: "If you were to do this again, would you get the same results?" A data-driven answer must be supported by some type of replication. Otherwise, there is no confirmation of results.
The next design phase could be more screening or characterizing, carried out to identify and quantify interactions or coupled effects in which the impact of one factor depends on the level of another factor. The choice of designs, naturally, depends on the outcome of the screening phase. How many factors show promise? Will the resources of time and money permit more screening?
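To make the idea of a coupled effect concrete: with factors in coded units, an interaction appears as a cross-product term in the fitted model. A minimal sketch with simulated, noise-free data (the coefficient values are invented for illustration):

```python
import numpy as np

# A 2x2 factorial in coded units, replicated twice.
x1 = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
x2 = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
# Simulated response: the effect of x1 depends on the level of x2
# through the interaction coefficient (here, 3).
y = 10 + 2 * x1 + 1 * x2 + 3 * x1 * x2

# Model matrix with an explicit interaction column.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(coef, 3))  # recovers [10, 2, 1, 3]
```

When the interaction coefficient is large relative to the main effects, reporting the factors separately misleads: at the low level of x2, raising x1 here actually lowers the response.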
So design, actually, is always a compromise. As knowledge builds, so does design sophistication, leading to the study of fewer factors at more levels each and to the use of more sophisticated displays of results. Ask your local friendly statistician about response surfaces if this concept is new to you.
One certainty is that design's purpose is always to build knowledge. During a recent design session, chemists considered the use of alternative solvents for the preparation step of moisture determination. After we spent some time defining the problem, the chemists imposed upper bounds on the proportions of two of the solvents.
When asked for an explanation of their choice of upper bounds, they said that in the real world, they would never go higher than those bounds—costs of routine use would be prohibitive. The resulting design8 had some unfavorable characteristics because high correlations among the solvents indicated that the experimental design candidate would not provide clear estimates of the effects of the solvents. What to do?
There is no chemical or physical reason why the solvent proportions should be tightly constrained, and as it turned out, redesigning to relax the constraints slightly increased the clarity of the effect estimates. That, of course, made the chemists nervous until they understood that modeling the resulting data would enable them to estimate solvent results at the previously desired upper bounds.
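Checking a candidate design for this kind of trouble before running it is straightforward: compute the pairwise correlations of the design columns. The sketch below uses invented proportions in which the two tightly bounded solvents are forced to trade off, so their columns are perfectly negatively correlated and their effects cannot be separated:

```python
import numpy as np

def check_collinearity(design, threshold=0.9):
    """Flag pairs of design columns whose correlation is so high that
    the corresponding effects cannot be estimated separately."""
    R = np.corrcoef(np.asarray(design, dtype=float), rowvar=False)
    k = R.shape[0]
    return [(i, j, round(float(R[i, j]), 2))
            for i in range(k)
            for j in range(i + 1, k)
            if abs(R[i, j]) >= threshold]

# Hypothetical proportions of the two tightly bounded solvents: the
# constraints force them to sum to 0.20, so one column is an exact
# mirror of the other.
tight = [[0.05, 0.15],
         [0.10, 0.10],
         [0.15, 0.05],
         [0.08, 0.12],
         [0.12, 0.08]]
print(check_collinearity(tight))  # → [(0, 1, -1.0)]
```

Relaxing the bounds breaks the lockstep movement of the columns, which is precisely why the redesigned experiment gave clearer effect estimates.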
The point that had gone missing in the chemists’ minds is that the design is created to build knowledge, not necessarily to isolate and identify a fixed solution to the problem by selecting one among many design points.
Modeling all of the data resulting from the design that meets the Andrews criteria noted earlier provides predictions throughout the design space. In addition, the estimate of the true response derived from the fitted model will have lower uncertainty than that associated with any given observation.
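The lower uncertainty of the modeled estimate can be seen directly from least squares theory: the variance of the fitted mean at run i is the leverage h_ii times sigma squared, while a single observation has variance sigma squared, and at every point of a sensible design h_ii is less than one. A sketch for a replicated two-by-two factorial (design chosen for illustration):

```python
import numpy as np

# Replicated 2x2 factorial in coded units: 8 runs, 4 model terms.
x1 = np.array([-1, 1, -1, 1, -1, 1, -1, 1], dtype=float)
x2 = np.array([-1, -1, 1, 1, -1, -1, 1, 1], dtype=float)
X = np.column_stack([np.ones(8), x1, x2, x1 * x2])

# Hat matrix H = X (X'X)^{-1} X'.  The variance of the fitted mean
# at run i is H[i, i] * sigma^2, versus sigma^2 for one observation.
H = X @ np.linalg.inv(X.T @ X) @ X.T
print(np.diag(H))  # every leverage is 0.5
```

Here every fitted mean carries half the variance of any single observation, so the model does not merely interpolate; it pools information across the whole design.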
Making design happen
Design cannot be forced, nor can experimental treatment combinations be eliminated, factors disregarded by prejudging, or results biased by the loudest voice in the room. Those who seek hasty solutions must be made to understand the scientific method of inquiry: Good design depends on good science—not office politics—so the organizational stage must be set for it. That stage requires an established culture of data-driven decisions.
A good start comes from organizationwide awareness education—with briefings all the way to the top of the organization chart so vice presidents, directors and managers will not stifle good scientific efforts by thinking design is a fad that will someday disappear. Instead, top brass must conduct periodic reviews to ensure healthy growth in the data-driven culture, keeping it alive by recognizing and rewarding progress.9
That’s just good business.
The author thanks J. Richard Trout for his helpful suggestions for this column.
References and Note
- Ronald A. Fisher, The Design of Experiments, Oliver and Boyd, 1935.
- Ronald D. Snee, Lynne B. Hare and J. Richard Trout, eds., Experiments in Industry, ASQ Quality Press, 1985.
- More detail about the strategy can be read in the opening chapter of George E.P. Box, J. Stuart Hunter and William G. Hunter’s Statistics for Experimenters (John Wiley & Sons, 2005). Andrews collaborated with these authors to spread design technology across the United States during the 1950s and 1960s.
- William G. Cochran and Gertrude M. Cox, Experimental Designs, second edition, John Wiley & Sons, 1957.
- Robert W. Kennard and L.A. Stone, "Computer Aided Design of Experiments," Technometrics, Vol. 11, No. 1, 1969, pp. 137-148.
- Snee, Experiments in Industry, see reference 2, p. 6.
- Lynne B. Hare and Mark Vandeven, "Drudgery to Strategy—A Statistical Metamorphosis," Quality Progress, August 2009, pp. 58-59.
- Lynne B. Hare, "Painting by the Numbers," Quality Progress, September 2016, pp. 50-52.
- Lynne B. Hare, "The Hidden Laboratory," Quality Progress, August 2006, pp. 82-83.
Lynne B. Hare is a statistical consultant. He holds a doctorate in statistics from Rutgers University in New Brunswick, NJ. He is past chairman of the ASQ Statistics Division and a fellow of ASQ and the American Statistical Association.