## 2019

STATISTICS ROUNDTABLE

# When Should You Consider A Split-Plot Design?

**by Christine M. Anderson-Cook**

If you conduct experiments, a good understanding of design of experiments (DoE) can be beneficial for maximizing the information you can obtain on a fixed budget.

A split-plot experiment is one option to consider, especially if you have a situation in which some factors have levels more difficult to change than others. In these experiments, some of the factors (the hard-to-change ones) are intentionally reset less often than the easy-to-change factors. In a completely randomized experiment, all factors are reset an equal number of times.

### Split-Plot Basics

Despite their utility, split-plot designs are frequently given only superficial coverage in an introductory statistics or DoE course. For that reason, some background is warranted.

Since split-plots originated in agricultural settings, much of the nomenclature refers to plots of land: The hard-to-change factors are called whole-plot factors, while the easier-to-change factors are called subplot factors.

Imagine an agricultural field experiment in which a number of adjacent plots of land all receive the same fertilizer (the whole-plot factor), and then the particular seed type (the subplot factor) is randomized and applied to plots within that larger grouping. The number of whole plots (#WP) corresponds to the number of times you reset the whole-plot factors, while the total number of experimental runs (N) is equal to the number of subplots, since you reset the subplot factors after each run.

In an industrial setting, resetting a hard-to-change (whole-plot) factor might involve changing the temperature of an oven (which then needs to reach an equilibrium temperature before being used), setting up complex equipment or mixing up batches of ingredients. Whether a factor counts as hard to change could be judged on budget or on time.

### More Trouble Than It’s Worth?

While these designs might seem much more complicated than a completely randomized design, there are a number of implementation and statistical advantages to running this type of experiment.

Consider this simple example: Suppose you’re interested in studying the effects of three factors (X1, X2 and X3) on a response through a $2^3$ full factorial screening experiment. This design involves eight runs, which are listed in standard order in the first three columns of Table 1.

Recall that for a completely randomized design (CRD), you randomize the order of the eight runs (a potential run order is shown in the fourth column of Table 1). Then between each run, the levels of each factor are reset.
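As a minimal sketch of the CRD randomization just described, the following Python builds the eight runs of the three-factor, two-level design and shuffles them. The function names and the `seed` parameter are illustrative assumptions, not part of the original article.

```python
import random
from itertools import product

def full_factorial(n_factors):
    """All 2^n combinations of -1/+1 levels, first factor changing fastest."""
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=n_factors)]

def randomize_crd(runs, seed=None):
    """Return the runs in a random order; every factor is reset before each run."""
    rng = random.Random(seed)  # seed is optional, only for reproducibility
    order = list(runs)
    rng.shuffle(order)
    return order

runs = full_factorial(3)
crd_order = randomize_crd(runs, seed=1)
for position, run in enumerate(crd_order, start=1):
    print(position, run)
```

The shuffle only fixes the run order. Treating the result as a CRD still assumes that all three factors are actually reset between consecutive runs.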

For practical purposes, the experimenter might sometimes fail to reset all of the factors between runs. For example, between runs 1 (+1, +1 and +1) and 2 (+1, -1 and -1), the experimenter might just keep X1 set at its current level since it does not change between the two runs. Between runs 7 (-1, -1 and +1) and 8 (-1, -1 and -1), the experimenter might not reset X1 and X2 since they remain at the same levels.

This is referred to as an inadvertent split-plot.^{1} In this case, a split-plot experiment has actually been run, but when it is analyzed it might be mistakenly treated as a CRD. This analysis would treat all the runs as independent, which is not appropriate and could lead to misleading conclusions about factor effects. The results could either erroneously declare a factor statistically significant when it is not, or fail to declare a factor significant when it actually is important.
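One way to see how an inadvertent split-plot arises is to look for consecutive runs in which a factor keeps the same level. The sketch below groups such runs for one factor. The run order is hypothetical (it is not the order in Table 1), and the helper name `implied_whole_plots` is an assumption made for illustration.

```python
from itertools import groupby

def implied_whole_plots(run_order, factor_index=0):
    """Group consecutive runs that share the same level of one factor.

    If that factor is not reset between such runs, each group behaves
    like a whole plot for that factor.
    """
    key = lambda run: run[factor_index]
    return [list(group) for _, group in groupby(run_order, key=key)]

# Hypothetical randomized order for the eight runs of a 2^3 design.
run_order = [
    (+1, +1, +1), (+1, -1, -1), (-1, +1, -1), (+1, -1, +1),
    (-1, +1, +1), (+1, +1, -1), (-1, -1, +1), (-1, -1, -1),
]

plots = implied_whole_plots(run_order, factor_index=0)
print([len(plot) for plot in plots])  # sizes of the groups sharing an X1 setting
```

In this order, X1 would be set six times rather than eight if it is left alone whenever its level does not change, so the first two and last two runs each share a single X1 setting.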

### Order in the Experiment

If the experimenter does not wish to reset all of the factors between runs, it makes more sense to intentionally choose a desirable split-plot experiment in which the same eight runs might be used, but their order has been carefully considered. This would not mean that you omit the randomization step, but rather that the form of the randomization would be more structured.

Of course, it will also be important to analyze the data as a split-plot experiment. This will lead to valid conclusions that give accurate information about the effects. The new releases of analytical packages from JMP and Design Expert make selecting a good split-plot experiment and performing the appropriate analysis much easier.

The correct analysis acknowledges that observations obtained without resetting the whole plot factor are correlated with each other, as you would expect them to be more similar than observations that involved resetting all the factors.

When determining the run order of the split-plot experiment, there are two separate randomizations: first, the order in which you run the whole plots; and second, the order in which you collect the observations within each whole plot.
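The two randomizations can be sketched in a few lines of Python. The grouping of runs into whole plots below is a hypothetical example with X1 as the hard-to-change factor, and the function name and `seed` parameter are assumptions.

```python
import random

def randomize_split_plot(whole_plots, seed=None):
    """Randomize whole-plot order, then run order within each whole plot."""
    rng = random.Random(seed)
    plots = [list(plot) for plot in whole_plots]  # copy so the input is unchanged
    rng.shuffle(plots)                            # stage 1: order of the whole plots
    for plot in plots:
        rng.shuffle(plot)                         # stage 2: order within a whole plot
    return plots

# Hypothetical grouping: X1 (first coordinate) is constant within each whole plot.
whole_plots = [
    [(-1, -1, -1), (-1, +1, +1)],
    [(-1, -1, +1), (-1, +1, -1)],
    [(+1, -1, -1), (+1, +1, +1)],
    [(+1, -1, +1), (+1, +1, -1)],
]

for plot in randomize_split_plot(whole_plots, seed=7):
    print(plot)
```

Runs never move between whole plots, which is the sense in which the randomization is more structured than a single shuffle of all eight runs.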

This leads to two error terms in the model, which affect not only how the analysis should be performed, but also how different designs compare depending on the relative size of the error terms. The good news is that many of the classical design choices are quite robust to differences in the relative size of the two error terms.
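The two error terms are commonly written as a linear mixed model. The notation below is an assumption introduced here for illustration (it is not taken from the article): $X$ is the model matrix, $Z$ is the $N \times \#\mathrm{WP}$ indicator matrix assigning each run to its whole plot, $\delta$ is the whole-plot error and $\varepsilon$ is the subplot error.

$$
y = X\beta + Z\delta + \varepsilon, \qquad \delta \sim N(0, \sigma_\delta^2 I), \qquad \varepsilon \sim N(0, \sigma_\varepsilon^2 I)
$$

$$
\operatorname{Var}(y) = \sigma_\delta^2 ZZ^\top + \sigma_\varepsilon^2 I
$$

Under this formulation, two runs in the same whole plot share the same element of $\delta$ and so have covariance $\sigma_\delta^2$, while runs in different whole plots are uncorrelated. The ratio $\sigma_\delta^2/\sigma_\varepsilon^2$ is what is meant by the relative size of the two error terms.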

Researchers considered the example discussed
earlier and examined how likely an experimenter would be to
obtain a reasonable design in terms of good statistical
properties using the inadvertent split-plot
approach.^{2}

Depending on whether estimation (D-optimality) or prediction (I- or G-optimality) is important, the inadvertent split-plot can yield a design that is only 50% as efficient as the best choice of groupings of observations into whole plots. Clearly, this arbitrary “leave it to chance” approach has high potential to disappoint.

Sometimes it is possible to outperform the estimation and
prediction performance of a CRD by choosing to do the right
split-plot experiment. In many situations, a split-plot
experiment can yield more precise model parameter estimates than
the same sized experiment run as a CRD.^{3}
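To make this kind of comparison concrete, here is a rough sketch that evaluates a D-criterion of the form |X'V⁻¹X| for a main-effects model, with V = I + d·ZZ' (subplot variance scaled to 1). Several things here are assumptions for illustration only: the variance ratio `d`, the main-effects model, the particular grouping into whole plots (which is not necessarily the design in Table 2), and the treatment of a CRD as eight whole plots of one run each.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def d_criterion(whole_plots, d):
    """|X' V^-1 X| for intercept + main effects, with V = I + d * Z Z'.

    whole_plots: list of whole plots, each a list of runs (tuples of levels).
    d: assumed ratio of whole-plot to subplot error variance.
    """
    p = len(whole_plots[0][0]) + 1
    info = [[0.0] * p for _ in range(p)]
    for plot in whole_plots:
        x = [[1.0, *map(float, run)] for run in plot]
        k = len(x)
        sums = [sum(row[j] for row in x) for j in range(p)]
        shrink = d / (1.0 + k * d)  # inverse of (I + d*J) is I - shrink*J
        for i in range(p):
            for j in range(p):
                cross = sum(row[i] * row[j] for row in x)
                info[i][j] += cross - shrink * sums[i] * sums[j]
    return determinant(info)

# Hypothetical split-plot grouping: X1 is constant within each whole plot.
split_plot = [
    [(-1, -1, -1), (-1, +1, +1)],
    [(-1, -1, +1), (-1, +1, -1)],
    [(+1, -1, -1), (+1, +1, +1)],
    [(+1, -1, +1), (+1, +1, -1)],
]
# The same eight runs as a CRD: every run is its own whole plot.
crd = [[run] for plot in split_plot for run in plot]

for d in (0.0, 0.5, 1.0):
    print(d, d_criterion(split_plot, d), d_criterion(crd, d))
```

With `d = 0` the two values coincide, since there is no whole-plot error. For the positive values of `d` tried here, this grouping gives a larger determinant than the CRD, which is one way the effect described above can show up. Other groupings or models may behave differently.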

For example, for the one hard-to-change and two easy-to-change factor experiments discussed earlier, the design with the highest D-efficiency is the one shown in Table 2. Again, it would be important to randomize the run order of the four whole plots, and then the order of the two runs within each whole plot.

### Weighing the Options

If you take the cost of the experiment into consideration, it might be even easier to justify this type of experiment. If the hard-to-change factors are expensive or time-consuming to change, then selecting a design that contains fewer whole plots could represent a potentially large savings.

There are approaches for quantifying the relative cost of experiments in which the total cost is assumed to be proportional to a weighted sum of the number of whole plots (#WP) and the total number of observations (N): $C \propto \#\mathrm{WP} + rN$, in which $r = C_{SP}/C_{WP}$ is the relative cost of changing a subplot factor compared to changing a whole plot factor.^{4,5} Included in the cost of the subplot factor is the cost of measuring the responses for each observation.

For example, an r value of 0.5 would mean that the cost of changing a subplot factor level would be half of the cost of changing a whole plot factor level. Often, it is difficult to estimate the actual cost of these two changes, but their relative costs might be more easily approximated.
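The weighted sum #WP + rN is simple enough to compute directly. The sketch below expresses cost in units of one whole-plot reset. The function name and the two designs being compared are illustrative assumptions.

```python
def relative_cost(n_whole_plots, n_runs, r):
    """Cost in units of one whole-plot reset: #WP + r * N."""
    if r < 0:
        raise ValueError("r must be non-negative")
    return n_whole_plots + r * n_runs

# Eight runs with every factor reset each time (eight whole plots of size one).
crd_cost = relative_cost(n_whole_plots=8, n_runs=8, r=0.5)
# The same eight runs arranged in four whole plots of two runs each.
split_plot_cost = relative_cost(n_whole_plots=4, n_runs=8, r=0.5)

print(crd_cost, split_plot_cost)
```

With r = 0.5, the four-whole-plot arrangement costs 8 units against 12 for the fully reset design. As r shrinks toward 0, the comparison is driven almost entirely by the number of whole plots.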

The cost-based metric described earlier can be used in combination with the quality of estimation (D-criterion) or prediction (IV- or G-criteria) to compare potential designs. Sometimes the hard-to-change factors are extremely difficult or expensive to change relative to resetting the subplot factors (here $r \approx 0$). In this case, adding more observations within each whole plot is relatively cheap, and the total cost of the experiment is driven almost entirely by the number of times that the hard-to-change factors need to be reset, namely the number of whole plots.

In other cases, the relative difference between the cost of changing the hard-to-change and easier-to-change factors is more moderate (for example, r between 0.1 and 1). In this case, the total cost of the experiment is a weighted function of the number of resets of both of these factors.

Clearly, different designs should be considered depending on the relative costs of changing the factors. While combining the cost and quality of the design into a single metric is not always appropriate, it frequently might allow for more realistic comparisons and help facilitate decision making.

### The Right Design

As with CRDs, there are a number of different aspects that should be considered when choosing a good design. These range from qualitative characteristics (such as balance for the number of observations per whole plot and the number of levels of each factor) to quantitative measures (such as good estimation of the whole plot and subplot error terms, and relative performance based on optimality criteria).^{6}

Since different experiments have different priorities, it is
important to consider what “optimal” means for an
experiment and to focus on the criteria most relevant to project
goals. There are classes of split-plot designs called equivalent
designs that allow for estimation of parameters using ordinary
least squares instead of restricted maximum likelihood
estimation.^{7}

One of the primary advantages of these designs is that the parameters of the mean model can be estimated separately from the whole-plot and subplot error terms, which are frequently not easy to estimate precisely.

Split-plot designs are an important, practical class of designs. When strategically chosen, split-plot designs can boost the amount of information a practitioner can extract from a designed experiment.

### REFERENCES

1. Jitendra Ganju and J.M. Lucas, “Detecting Randomization Restrictions Caused by Factors,” Journal of Statistical Planning and Inference, Vol. 81, 1999, pp. 129-140.
2. Li Liang, C.M. Anderson-Cook and T.J. Robinson, “Cost Penalized Estimation and Prediction Evaluation for Split-Plot Design,” Quality and Reliability Engineering International, Vol. 23, No. 5, 2007, pp. 577-596.
3. Peter Goos and Martina Vandebroek, “Outperforming Completely Randomized Designs,” Journal of Quality Technology, Vol. 36, No. 1, 2004, pp. 12-26.
4. Søren Bisgaard, “The Design and Analysis of $2^{k-p} \times 2^{q-r}$ Split-Plot Experiments,” Journal of Quality Technology, Vol. 32, No. 1, 2000, pp. 39-56.
5. Li Liang, C.M. Anderson-Cook and T.J. Robinson, “Cost Penalized Estimation and Prediction Evaluation for Split-Plot Design,” see reference 2.
6. P.A. Parker, C.M. Anderson-Cook, T.J. Robinson and Li Liang, “Robust Split-Plot Designs,” Quality and Reliability Engineering International, Vol. 23, 2007.
7. P.A. Parker, S.M. Kowalski and G.G. Vining, “Classes of Split-Plot Response Surface Designs for Equivalent Estimation,” Quality and Reliability Engineering International, Vol. 22, 2006, pp. 291-305.

### ADDITIONAL RESOURCES

Montgomery, D.C., Design and Analysis of Experiments, Wiley, 2006.

Goos, Peter, The Optimal Design of Blocked and Split-Plot Experiments, Springer, 2002.

**CHRISTINE M. ANDERSON-COOK** is a technical
staff member of Los Alamos National Laboratory in Los Alamos, NM.
She earned a doctorate in statistics from the University of
Waterloo in Ontario, Canada. Anderson-Cook is a senior member of
ASQ.
