Among the numerous design evaluation criteria for response surface designs is the IV-criterion that is based on the average prediction variance (APV). In this paper the various APV measures generated by computing packages and their relationships to the IV-criterion are summarized and critiqued. The computer-generated APV criteria require averaging over either a fixed set or a random set of evaluation points. When a fixed set of points is used, the APV measure will be larger than the corresponding IV-criterion value and could lead to the selection of an inferior design based on the IV-criterion. A simple approach that provides exact evaluation of the IV-criterion value for response surface designs on the hypercube is discussed.

Key Words: Average Prediction Variance, Integrated Prediction Variance.

*By* **JOHN J. BORKOWSKI, Montana State University,
Bozeman, MT 59717**

**Introduction**

In many research projects, experiments are run to describe
relationships between design variables *x*_{1},
*x*_{2}, ..., *x*_{k} and the response

Dr. Borkowski is an Associate Professor of Statistics in the Department of Mathematical Sciences.

After considering temporal, economic,
and physical constraints, experimenters often use design optimality
criteria to evaluate a design prior to running a proposed
experiment. For a design having an expanded design matrix
X, the *D, A, E, G,* and *IV* design evaluation
criteria are based on properties of (X'X)^{-1}. For a detailed discussion
of these five criteria, see Atkinson and Donev (1992, Chapter
10). In this paper, only the integrated average prediction
variance (or *IV*) criterion is studied. The *IV*-criterion
is based on the prediction variance

See
Box and Draper (1959), Myers (1971, Chapter 9), and Meyer
and Nachtsheim (1995) for discussions of this criterion. A
design which minimizes *IV* has been referred to as *IV*-optimal,
as well as *Q*-optimal (Myers and Montgomery (1995)),
*V*-optimal (Welch (1984), Atkinson (1988), and Atkinson
and Donev (1992)), and *I*-optimal (Haines (1987), Nachtsheim
(1987), and *SAS* (1995)).

While published academic research
may give exact *IV*-criterion values, statistical computing
packages do not. Many packages, however, do include an average
prediction variance (APV) measure in the software output.
Yet little or no documentation regarding their APV measures
is provided. For designs in the hypercube that are used to
fit the second-order model in Equation (1), the integration
to generate an exact evaluation of the *IV*-criterion
is not complicated because *V*(x) will be a quartic polynomial
in *k* variables. It is important to note both the conspicuous
absence of exact *IV*-criterion values and the variety
of different APV measures adopted by different software packages.
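The remark that the integration is uncomplicated can be illustrated with a minimal sketch. It assumes the unscaled prediction variance *V*(x) = f(x)'(X'X)^{-1}f(x), in units of the error variance, and a uniform average over the hypercube [-1, 1]^{k}; some authors multiply *V*(x) by the number of runs, which simply rescales the result. Under these assumptions the average of the quartic polynomial reduces to trace((X'X)^{-1}M), where M holds the uniform moments of the model terms. The function names below are illustrative and are not taken from any package.

```python
import itertools
import numpy as np

def second_order_exponents(k):
    """Exponent vectors for the terms of a full second-order model in k factors."""
    exps = [[0] * k]                                   # intercept
    for i in range(k):                                 # linear terms
        e = [0] * k
        e[i] = 1
        exps.append(e)
    for i, j in itertools.combinations(range(k), 2):   # two-factor interactions
        e = [0] * k
        e[i] = e[j] = 1
        exps.append(e)
    for i in range(k):                                 # pure quadratic terms
        e = [0] * k
        e[i] = 2
        exps.append(e)
    return np.array(exps)

def expand(points, exps):
    """Expanded design matrix: one column per model term."""
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    return np.prod(pts[:, None, :] ** exps[None, :, :], axis=2)

def hypercube_moment_matrix(exps):
    """Uniform moments on [-1, 1]^k: E[x^p] is 1/(p+1) for even p, 0 for odd p."""
    total = exps[:, None, :] + exps[None, :, :]
    one_dim = np.where(total % 2 == 0, 1.0 / (total + 1), 0.0)
    return one_dim.prod(axis=2)

def exact_iv(design, exps):
    """Average of f(x)'(X'X)^{-1}f(x) over the hypercube, with no sampling."""
    X = expand(design, exps)
    return np.trace(np.linalg.solve(X.T @ X, hypercube_moment_matrix(exps)))

def face_centered_ccd(k, n_center=1):
    """Full 2^k factorial, 2k axial points at distance 1, and center points."""
    cube = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.vstack([np.eye(k), -np.eye(k)])
    return np.vstack([cube, axial, np.zeros((n_center, k))])

exps = second_order_exponents(3)
design = face_centered_ccd(3)
iv = exact_iv(design, exps)
print(f"exact IV (unscaled), k = 3 face-centered CCD: {iv:.6f}")
```

Because the moment matrix depends only on the model and the region, it can be computed once and reused for every candidate design on the same hypercube.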

Although some statisticians might
know that statistical packages provide APV measures, some
researchers and many practitioners in industry may not be
familiar with the specific criteria provided as output. For
those who believe that the measures provided by statistical
packages are approximations of the *IV*-criterion, it
is shown in this paper how inaccurate these measures can be
as approximations. It will be shown that (i) most of these
measures will be greater than the *IV*-criterion value,
(ii) the differences between these measures and the true *IV*-criterion
value can remain very large even when the approximation is
based on very large sets of points, and (iii) the measure
provided is dependent on which software package is used.
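The gap between an averaged measure and the integrated value can be seen in a small numerical sketch. It makes several assumptions that are not taken from any particular package: *V*(x) is the unscaled quantity f(x)'(X'X)^{-1}f(x), the fixed evaluation set is the 3^{k} grid with levels -1, 0, and 1, and the random set is drawn uniformly from the hypercube. The design is a three-factor face-centered composite design with one center point, and all names are illustrative.

```python
import itertools
import numpy as np

def model_matrix(points):
    """Expand points into the terms of a full second-order model (assumed form)."""
    P = np.atleast_2d(np.asarray(points, dtype=float))
    k = P.shape[1]
    cols = [np.ones(len(P))]
    cols += [P[:, i] for i in range(k)]
    cols += [P[:, i] * P[:, j] for i, j in itertools.combinations(range(k), 2)]
    cols += [P[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)

def prediction_variance(design, points):
    """Unscaled V(x) = f(x)'(X'X)^{-1}f(x) at each evaluation point."""
    X = model_matrix(design)
    F = model_matrix(points)
    xtx_inv = np.linalg.inv(X.T @ X)
    return np.einsum("ij,jk,ik->i", F, xtx_inv, F)

k = 3
cube = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
axial = np.vstack([np.eye(k), -np.eye(k)])
design = np.vstack([cube, axial, np.zeros((1, k))])

# Fixed evaluation set: every point of the 3^k grid, boundary points included.
fixed_points = np.array(list(itertools.product([-1.0, 0.0, 1.0], repeat=k)))
apv_fixed = prediction_variance(design, fixed_points).mean()

# Random evaluation set: uniform draws from the hypercube (seed chosen arbitrarily).
rng = np.random.default_rng(0)
random_points = rng.uniform(-1.0, 1.0, size=(200_000, k))
apv_random = prediction_variance(design, random_points).mean()

print(f"APV over the fixed 3^k grid:   {apv_fixed:.4f}")
print(f"APV over uniform random draws: {apv_random:.4f}")
```

In this sketch the fixed grid places most of its points on the boundary of the region, where the prediction variance tends to be largest, so its average sits well above the value obtained from uniform sampling.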

The common software approach is to
provide an APV measure that is the average of *V*(x)
over a subset of points in the design space. This is discussed
in the next two sections followed by the generation of exact
*IV* values using Matlab software (The Mathworks 2000).
For comparison purposes, four different types of composite
designs will be studied for 3, 4, and 5 design factors:

- The central composite designs (CCDs) were introduced
by Box and Wilson (1951). The CCDs to be studied are the
*k*-factor face-centered cube designs, which consist of: (i)
points from a 2^{k} factorial design for *k* = 3 or 4, or points from a 2^{5-1} Resolution *V* fractional factorial design for *k* = 5; (ii) *n*_{0} center points (0,..., 0); and (iii) 2*k* axial points (±1, 0,..., 0) ... (0,..., 0, ±1).

- The Plackett-Burman composite designs (PBCDs) (Draper
(1985) and Draper and Lin (1990)) are similar to the
*k* = 4 or 5 factor CCDs except that a 12-run Plackett-Burman design is used instead of a 2^{4} or a 2^{5-1} design.

- The small composite designs (SCDs) (Hartley (1959)) are
similar to the CCDs except that a resolution
*III** design is used when *k* = 3 or 4. In a resolution *III** design, the shortest word in the defining relation is of length 3, but there are no words of length 4. Two possible defining relations are *I* = *ABC* and *I* = *ABD* for *k* = 3 and 4, respectively.

- The Notz designs (Notz (1982)) consist of a subset of
the 2^{k} factorial design plus a subset of points containing one or more zeros. If no center points are added, the Notz designs are saturated (i.e., there are no degrees of freedom for estimating the error variance once the model is fit).

In this study, one-center-point CCDs, PBCDs, and SCDs and the saturated Notz designs are considered. For a discussion of response surface methodology and these designs, see Box and Draper (1987, Chapter 7), Myers and Montgomery (1995, Chapters 7 and 8), and Khuri and Cornell (1996, Chapter 4).
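As a rough illustration of how the composite designs described above are assembled, the sketch below builds the face-centered CCDs for *k* = 3, 4, and 5 and the *k* = 3 SCD, each with one center point. The choice of fraction is an assumption: the half fractions are generated by setting the last factor equal to the product of the others, which gives *I* = *ABCDE* for *k* = 5 and *I* = *ABC* for *k* = 3. The PBCDs and Notz designs are omitted because their point sets are not specified here. Function names are illustrative.

```python
import itertools
import numpy as np

def two_level_factorial(k):
    """All 2^k combinations of -1 and +1."""
    return np.array(list(itertools.product([-1.0, 1.0], repeat=k)))

def half_fraction(k):
    """2^(k-1) fraction with the last factor set to the product of the others."""
    base = two_level_factorial(k - 1)
    last = base.prod(axis=1, keepdims=True)
    return np.hstack([base, last])

def composite_design(factorial_part, n_center=1):
    """Factorial portion, 2k face-centered axial points, and center points."""
    k = factorial_part.shape[1]
    axial = np.vstack([np.eye(k), -np.eye(k)])
    center = np.zeros((n_center, k))
    return np.vstack([factorial_part, axial, center])

designs = {
    "CCD, k = 3": composite_design(two_level_factorial(3)),
    "CCD, k = 4": composite_design(two_level_factorial(4)),
    "CCD, k = 5": composite_design(half_fraction(5)),   # assumed I = ABCDE
    "SCD, k = 3": composite_design(half_fraction(3)),   # assumed I = ABC
}

for name, d in designs.items():
    print(f"{name}: {d.shape[0]} runs in {d.shape[1]} factors")
```

Each array has one row per run, so it can be passed directly to whatever routine expands the design into model terms.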
