A Comparison of Prediction Variance Criteria for Response Surface Designs - ASQ

A Comparison of Prediction Variance Criteria for Response Surface Designs

Among the numerous design evaluation criteria for response surface designs is the IV-criterion that is based on the average prediction variance (APV). In this paper the various APV measures generated by computing packages and their relationships to the IV-criterion are summarized and critiqued. The computer-generated APV criteria require averaging over either a fixed set or a random set of evaluation points. When a fixed set of points is used, the APV measure will be larger than the corresponding IV-criterion value and could lead to the selection of an inferior design based on the IV-criterion. A simple approach that provides exact evaluation of the IV-criterion value for response surface designs on the hypercube is discussed.

Key Words: Average Prediction Variance, Integrated Prediction Variance.

By JOHN J. BORKOWSKI, Montana State University, Bozeman, MT 59717


In many research projects, experiments are run to describe relationships between design variables x1, x2, ..., xk and the response y of interest. In many industrial situations, a response surface design is implemented that will enable the experimenter to fit the second-order model given by

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i<j} \beta_{ij} x_i x_j + \varepsilon. \tag{1}$$

An N-point response surface design can be represented by an N × k design matrix. Each of the N rows corresponds to an experimental run or point, and the k column entries correspond to the experimental settings of the k design variables. This design is expanded into an N × p expanded design matrix X, where p is the number of parameters in the model of interest. For the model in Equation (1), we have p = (k + 2)(k + 1)/2.
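The expansion from the N × k design matrix to the N × p matrix X can be sketched in a few lines. This is a minimal illustration, not code from the paper: the function names `model_terms` and `expand_design` and the 5-run, k = 2 design are hypothetical, and the column order (intercept, linear, pure quadratic, cross-product) is an assumption that does not affect any of the criteria.

```python
from itertools import combinations


def model_terms(x):
    """Return f(x) for the full second-order model.

    Assumed column order: 1, x_i, x_i^2, then x_i * x_j for i < j.
    """
    k = len(x)
    linear = list(x)
    squares = [xi * xi for xi in x]
    cross = [x[i] * x[j] for i, j in combinations(range(k), 2)]
    return [1] + linear + squares + cross


def expand_design(design):
    """Expand an N x k design (list of rows) into the N x p matrix X."""
    return [model_terms(row) for row in design]


# Hypothetical 5-run design in k = 2 factors: a 2^2 factorial plus one center point.
design = [(-1, -1), (1, -1), (-1, 1), (1, 1), (0, 0)]
X = expand_design(design)

k = len(design[0])
p = len(X[0])
print(len(X), p, (k + 2) * (k + 1) // 2)  # -> 5 6 6
```

The last line confirms that the number of columns produced matches p = (k + 2)(k + 1)/2 for this small case.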

Dr. Borkowski is an Associate Professor of Statistics in the Department of Mathematical Sciences.

After considering temporal, economic, and physical constraints, experimenters often use design optimality criteria to evaluate a design prior to running a proposed experiment. For a design having an expanded design matrix X, the D, A, E, G, and IV design evaluation criteria are based on properties of (X'X)^{-1}. For a detailed discussion of these five criteria, see Atkinson and Donev (1992, Chapter 10). In this paper, only the integrated average prediction variance (or IV) criterion is studied. The IV-criterion is based on the prediction variance

$$\mathrm{Var}[\hat{y}(x)] = \sigma^2 f'(x)(X'X)^{-1} f(x),$$

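where f(x) is a vector of p real-valued functions of (x1, ..., xk) based on the model terms. For the second-order model in Equation (1), $f'(x) = (1, x_1, \ldots, x_k, x_1^2, \ldots, x_k^2, x_1 x_2, \ldots, x_{k-1} x_k)$. To study the IV-criterion for an N-point design, we will average the scaled prediction variance

$$V(x) = \frac{N\,\mathrm{Var}[\hat{y}(x)]}{\sigma^2} = N f'(x)(X'X)^{-1} f(x)$$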

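over all points x in a space of interest $\mathcal{X}$. The average of V(x) is defined as the IV-criterion value, denoted by IV, where

$$IV = \frac{1}{A}\int_{\mathcal{X}} V(x)\,dx = \frac{N}{A}\int_{\mathcal{X}} f'(x)(X'X)^{-1} f(x)\,dx \tag{2}$$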

and A is the volume of the space $\mathcal{X}$. In this paper, $\mathcal{X}$ represents the design space where the minimum and maximum levels of each of the k design variables are scaled to be ±1. Thus, $\mathcal{X}$ is restricted to the k-dimensional hypercube [-1, 1]^k, for which A = 2^k. The IV-criterion in Equation (2) can also be written in the alternative form:

$$IV = N\,\mathrm{trace}\left\{ (X'X)^{-1} \left[ \frac{1}{A}\int_{\mathcal{X}} f(x) f'(x)\,dx \right] \right\}$$

See Box and Draper (1959), Myers (1971, Chapter 9), and Meyer and Nachtsheim (1995) for discussions of this criterion. A design that minimizes IV has been referred to as IV-optimal, as well as Q-optimal (Myers and Montgomery (1995)), V-optimal (Welch (1984), Atkinson (1988), and Atkinson and Donev (1992)), and I-optimal (Haines (1987), Nachtsheim (1987), and SAS (1995)).

While published academic research may give exact IV-criterion values, statistical computing packages do not. Many packages, however, do include an average prediction variance (APV) measure in the software output. Yet little or no documentation regarding their APV measures is provided. For designs in the hypercube that are used to fit the second-order model in Equation (1), the integration to generate an exact evaluation of the IV-criterion is not complicated because V(x) will be a quartic polynomial in k variables. It is important to note both the conspicuous absence of exact IV-criterion values and the variety of different APV measures adopted by different software packages.
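Because V(x) is a polynomial, its average over the hypercube reduces to averages of monomials, and those are available in closed form: the average of x^e over [-1, 1] is 0 for odd e and 1/(e + 1) for even e. The sketch below uses this to evaluate IV = N trace{(X'X)^{-1} M} exactly with rational arithmetic, where M is the matrix of averaged products of model terms. It is one possible implementation under stated assumptions, not the author's Matlab code; the helper names (`exponents`, `uniform_moment`, `solve`, `exact_iv`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def exponents(k):
    """Exponent vectors of the second-order model terms: 1, x_i, x_i^2, x_i*x_j."""
    zero = [0] * k
    terms = [tuple(zero)]
    for power in (1, 2):
        for i in range(k):
            e = zero.copy()
            e[i] = power
            terms.append(tuple(e))
    for i, j in combinations(range(k), 2):
        e = zero.copy()
        e[i] = e[j] = 1
        terms.append(tuple(e))
    return terms


def monomial(x, e):
    """Evaluate prod x_i^e_i exactly."""
    value = Fraction(1)
    for xi, ei in zip(x, e):
        value *= Fraction(xi) ** ei
    return value


def uniform_moment(e):
    """Average of prod x_i^e_i over [-1, 1]^k (zero if any exponent is odd)."""
    value = Fraction(1)
    for ei in e:
        if ei % 2:
            return Fraction(0)
        value *= Fraction(1, ei + 1)
    return value


def solve(A, B):
    """Solve A Z = B exactly by Gauss-Jordan elimination (A must be nonsingular)."""
    n = len(A)
    aug = [list(A[r]) + list(B[r]) for r in range(n)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [v / pv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def exact_iv(design):
    """Exact IV = N * trace((X'X)^{-1} M) for a design on [-1, 1]^k."""
    k, N = len(design[0]), len(design)
    terms = exponents(k)
    p = len(terms)
    X = [[monomial(x, e) for e in terms] for x in design]
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    M = [[uniform_moment(tuple(ea + eb for ea, eb in zip(terms[a], terms[b])))
          for b in range(p)] for a in range(p)]
    Z = solve(XtX, M)  # Z = (X'X)^{-1} M
    return N * sum(Z[a][a] for a in range(p))


# One-factor check: the 3-point design {-1, 0, 1} with the quadratic model.
print(exact_iv([(-1,), (0,), (1,)]))  # -> 12/5

# Two-factor face-centered design (the 3^2 factorial), printed as an exact fraction.
print(exact_iv(list(product([-1, 0, 1], repeat=2))))
```

The one-factor value can be confirmed by hand: for that saturated design V(x) = 3(1 - 1.5x^2 + 1.5x^4), whose average over [-1, 1] is 3(1 - 0.5 + 0.3) = 2.4.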

Although some statisticians might know that statistical packages provide APV measures, some researchers and many practitioners in industry may not be familiar with the specific criteria provided as output. For those who believe that the measures provided by statistical packages are approximations of the IV-criterion, it is shown in this paper how inaccurate these measures can be as approximations. It will be shown that (i) most of these measures will be greater than the IV-criterion value, (ii) the differences between these measures and the true IV-criterion value can remain very large even when the approximation is based on very large sets of points, and (iii) the measure provided is dependent on which software package is used.
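The gap between an APV measure and the IV-criterion can be illustrated numerically. The sketch below averages V(x) over a fixed, equally spaced grid and over uniformly random points; both evaluation schemes, the function names, and the choice of the 3^2 factorial as the example design are assumptions made for illustration and do not reproduce any particular package's APV definition.

```python
import random
from itertools import combinations, product


def f(x):
    """Second-order model terms: 1, x_i, x_i^2, x_i*x_j (i < j)."""
    k = len(x)
    return ([1.0] + list(x) + [xi * xi for xi in x]
            + [x[i] * x[j] for i, j in combinations(range(k), 2)])


def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (A must be nonsingular)."""
    n = len(A)
    aug = [[float(v) for v in A[r]] + [float(r == c) for c in range(n)]
           for r in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [v / pv for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def make_spv(design):
    """Return the function V(x) = N f'(x) (X'X)^{-1} f(x) for a design."""
    X = [f(row) for row in design]
    N, p = len(X), len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    inv = inverse(XtX)

    def V(x):
        fx = f(x)
        return N * sum(fx[a] * inv[a][b] * fx[b]
                       for a in range(p) for b in range(p))
    return V


def apv_fixed(V, k, levels):
    """Average V over a fixed grid with `levels` equally spaced values per factor."""
    grid = [-1 + 2 * i / (levels - 1) for i in range(levels)]
    points = list(product(grid, repeat=k))
    return sum(V(x) for x in points) / len(points)


def apv_random(V, k, n, seed=0):
    """Average V over n points drawn uniformly from [-1, 1]^k."""
    rng = random.Random(seed)
    return sum(V([rng.uniform(-1, 1) for _ in range(k)]) for _ in range(n)) / n


# Example: the 3^2 factorial (a two-factor face-centered design with one center point).
V = make_spv(list(product([-1, 0, 1], repeat=2)))
for levels in (3, 5, 11, 21):
    print("fixed grid, levels =", levels, "APV =", round(apv_fixed(V, 2, levels), 4))
print("random, n = 20000, APV =", round(apv_random(V, 2, 20000), 4))
```

For the one-factor design {-1, 0, 1}, where the exact IV is 2.4, the same functions give a fixed-grid APV of 3 with three levels and 2.6625 with five levels, while the random-point average settles near 2.4. A fixed grid places a disproportionate share of its points on the boundary, where V(x) tends to be largest.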

The common software approach is to provide an APV measure that is the average of V(x) over a subset of points in the design space. This is discussed in the next two sections, followed by the generation of exact IV values using MATLAB software (The MathWorks 2000). For comparison purposes, four different types of composite designs will be studied for 3, 4, and 5 design factors:

  1. The central composite designs (CCDs) were introduced by Box and Wilson (1951). The CCDs to be studied are the k-factor face-centered cube designs which consist of: (i) points from a 2^k factorial design for k = 3 or 4, or points from a 2^(5-1) Resolution V fractional factorial design for k = 5; (ii) n0 center points (0,..., 0); and (iii) 2k axial points (±1, 0,..., 0) ... (0,..., 0, ±1).

  2. The Plackett-Burman composite designs (PBCDs) (Draper (1985) and Draper and Lin (1990)) are similar to the k = 4 or 5 factor CCDs except that a 12-run Plackett-Burman design is used instead of a 2^4 or a 2^(5-1) design.

  3. The small composite designs (SCDs) (Hartley (1959)) are similar to the CCDs except that a resolution III* design is used when k = 3 or 4. In a resolution III* design, the shortest word in the defining relation is of length 3, but there are no words of length 4. Two possible defining relations are I = ABC and I = ABD for k = 3 and 4, respectively.

  4. The Notz designs (Notz (1982)) consist of a subset of the 2^k factorial design plus a subset of points containing one or more zeros. If no center points are added, the Notz designs are saturated (i.e., there are no degrees of freedom for estimating the error variance once the model is fit).

In this study, CCDs, PBCDs, and SCDs with one center point and the saturated Notz designs are considered. For a discussion of response surface methodology and these designs, see Box and Draper (1987, Chapter 7), Myers and Montgomery (1995, Chapters 7 and 8), and Khuri and Cornell (1996, Chapter 4).
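The face-centered CCDs in item 1 are straightforward to generate. The following is a minimal sketch: the function name `face_centered_ccd` and its parameters are hypothetical, and the half fraction is built with the generator that sets the last factor equal to the product of the others (defining relation I = ABCDE for k = 5), which is one common way to obtain a Resolution V 2^(5-1) design and is assumed here rather than taken from the paper.

```python
from itertools import product


def face_centered_ccd(k, n_center=1, half_fraction=False):
    """Return the runs of a k-factor face-centered CCD as a list of tuples.

    half_fraction=True replaces the 2^k factorial portion by a 2^(k-1)
    fraction whose last factor is the product of the first k - 1 factors.
    """
    if half_fraction:
        factorial = []
        for base in product([-1, 1], repeat=k - 1):
            last = 1
            for level in base:
                last *= level
            factorial.append(base + (last,))
    else:
        factorial = list(product([-1, 1], repeat=k))

    # 2k axial points on the faces of the cube: (+-1, 0, ..., 0), ..., (0, ..., 0, +-1).
    axial = []
    for i in range(k):
        for sign in (-1, 1):
            point = [0] * k
            point[i] = sign
            axial.append(tuple(point))

    center = [(0,) * k] * n_center
    return factorial + axial + center


ccd3 = face_centered_ccd(3)
ccd4 = face_centered_ccd(4)
ccd5 = face_centered_ccd(5, half_fraction=True)
print(len(ccd3), len(ccd4), len(ccd5))  # -> 15 25 27
```

The run counts follow from 2^k (or 2^(5-1)) factorial points plus 2k axial points plus one center point. The PBCDs, SCDs, and Notz designs differ in the factorial portion and are not constructed here.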
