Separating Location and Dispersion Effects in Unreplicated Fractional Factorial Designs - ASQ

Fractional factorial designs are often used in industry to screen potentially important factors in terms of their impact on the mean response, that is, their location effect. To save resources, these experiments are often unreplicated, in that only one observation is made at each design point. In addition to studying location effects, these designs may also be used to study dispersion effects. One of the difficulties in studying dispersion in these unreplicated designs is that dispersion effects are confounded with pairs of location effects. In this paper, we investigate the situation where a single dispersion effect is identified. We develop a procedure to determine the minimum number of additional runs needed to totally remove the confounding between this dispersion effect and two location effects. For situations where replication is not feasible, we develop a test for a pair of unidentified location effects conditioned on a single dispersion effect being detected. An example from the literature is used to demonstrate these two procedures.

Key Words: Confounding, Dispersion, Fractional Factorial Designs, Variance.

By RICHARD N. McGRATH, Bowling Green State University, Bowling Green, OH 43403

Introduction

Screening experiments are used in industry to identify the few factors that have the largest impact on a response. An assumption of effect sparsity is often made: only a few factors produce substantial effects, and the rest are null or of negligible comparative magnitude. In this paper, we analyze unreplicated $2^{k-p}$ screening designs, where k factors, each at two levels (and possibly interactions among them), are studied in $n = 2^{k-p}$ carefully selected design points. See, for example, Box, Hunter, and Hunter (1978) and Montgomery (2001) for discussions of the construction of these designs. To identify both location and dispersion effects in these designs, the usual practice is to identify location effects first, then study the residuals from this model to identify dispersion effects. However, as McGrath and Lin (2001a) showed, a pair of unidentified location effects can affect the size of a dispersion effect estimate. Therefore, there is location-dispersion confounding. In a later section, we develop an algorithm to determine the minimum number of additional runs needed to remove this location-dispersion confounding. We also develop a formal procedure to test for the presence of unidentified location effects when a dispersion effect has been detected.
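As a minimal sketch of the kind of design discussed here, the following builds a hypothetical $2^{4-1}$ design (eight runs, with the fourth factor generated as D = ABC), forms the n orthogonal ±1 effect columns, and computes least squares coefficients. The function names, the choice of generator, and the noiseless response are assumptions made for illustration only; they do not come from the paper.

```python
from itertools import combinations, product

def fractional_factorial(base, generators):
    """Full factorial in the base factors; each generated factor is a product of base columns."""
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        run = dict(zip(base, levels))
        for name, word in generators.items():
            value = 1
            for factor in word:
                value *= run[factor]
            run[name] = value
        runs.append(run)
    return runs

def effect_columns(runs, base):
    """Intercept plus every product of base-factor columns: n columns for n runs."""
    labels = ["I"] + ["".join(c) for r in range(1, len(base) + 1)
                      for c in combinations(base, r)]
    columns = {}
    for label in labels:
        col = []
        for run in runs:
            value = 1
            for factor in label:
                if factor != "I":
                    value *= run[factor]
            col.append(value)
        columns[label] = col
    return columns

def effect_estimates(columns, y):
    """Least squares coefficients; with orthogonal +/-1 columns this is x_j'y / n."""
    n = len(y)
    return {label: sum(x * yi for x, yi in zip(col, y)) / n
            for label, col in columns.items()}

# Hypothetical 2^(4-1) design: A, B, C are the base factors and D = ABC.
design = fractional_factorial("ABC", {"D": "ABC"})
columns = effect_columns(design, "ABC")

# Made-up noiseless response: intercept 10 and a coefficient of 3 on factor A.
y = [10 + 3 * run["A"] for run in design]
estimates = effect_estimates(columns, y)
```

Because D is aliased with ABC in this hypothetical fraction, the column labelled `ABC` is also the column for factor D; with real data its estimate could not be attributed to one or the other without further assumptions.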

Assume that $y = X\beta + \varepsilon$, or, equivalently, $y_i = \sum_{j=0}^{n-1} x_{ij}\beta_j + \varepsilon_i$, $i = 1, \ldots, n$. The $n \times n$ effect matrix X has columns $x_j = (x_{1j}, x_{2j}, \ldots, x_{nj})'$, $j = 0, \ldots, n-1$, representing $n - 1$ effects (k main effects and possibly some interactions among the k factors) in addition to the intercept. Here, $\beta$ is an $n \times 1$ vector of unknown location parameters. The $x_{ij}$ are coded such that the two levels are denoted by -1 and +1. Typically, ordinary least squares analysis is used to obtain the location effect estimates $\hat{\beta} = (X'X)^{-1}X'y$, where, at least temporarily, it is assumed that $\mathrm{Var}(\varepsilon_i) = \sigma^2$, $i = 1, \ldots, n$. Under the effect sparsity assumption, most of the $\beta_j$'s are 0 or negligible compared to the few "active" ones. From the $n - 1$ estimates (excluding the intercept, $\hat{\beta}_0$), the active ones are identified using methods such as normal probability plotting (Daniel (1959, 1978)) or the method of Lenth (1989), both based on normality, i.e., $\varepsilon_i \sim N(0, \sigma^2)$. See Hamada and Balakrishnan (1998) for a summary and comparison of related techniques. Let $X = (L, U)$, where L represents the set of columns in X that are identified as location active, plus $1 = (1, 1, \ldots, 1)'$ for the intercept. Here, U represents the set of u columns that are not in the model, i.e., the unidentified location effect columns. The fitted values are then $\hat{y} = L(L'L)^{-1}L'y$, and the observed residuals are $e = y - \hat{y}$. To study dispersion effects, we let

where and the j are unknown parameters. This is a reparametrization of a model used by Cook and Weisberg (1983) and Davidian and Carroll (1987), among others. Defining = =   As such, d is the measure of the true dispersion effect for column d. Throughout the rest of this paper, we use the index j (as opposed to xj) to refer to a generic column in X, and we call a column suspected of producing a dispersion effect column d.

There are n/2 pairs of columns whose interaction column (component-wise product) appears in column d. For $j \in U$, there are $0 \le g \le u/2$ pairs of columns $(j, j')$ such that the interaction of columns j and j' occurs in column d, i.e., $x_{ij}x_{ij'} = x_{id}$, $i = 1, \ldots, n$. Also, there are $0 \le t \le u$ single columns, $j \in U$, such that there does not exist a column $j' \in U$ with $x_{ij}x_{ij'} = x_{id}$, $i = 1, \ldots, n$. These definitions imply that $2g + t = u$. Indexing the interaction pairs in U by the superscript (f) and the singles in U by the subscript q, McGrath and Lin (2001a) showed that

where $s^2(d+)$ is the sample variance of the $e_i \mid x_{id} = +1$. Similarly, we have

From Equations (1) and (2) we see that the $\hat{\beta}_q$'s cannot create spurious dispersion effects, as they appear in both $s^2(d+)$ and $s^2(d-)$. These location effect estimators do, however, dampen the estimate of a real dispersion effect. Additionally, subtracting Equation (2) from Equation (1) leads to

and if a single pair of these location effects, $(\beta_j, \beta_{j'})$, is active, then

Thus, an observed significant dispersion effect may be the result of one or more pairs of unidentified location effects.
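The pairing structure and its effect on the two residual variances can be checked numerically. The sketch below is an illustration under stated assumptions, not a reproduction of Equations (1) through (4): it uses a hypothetical saturated 16-run design in factors A to D, takes column A as the suspected dispersion column d, puts the intercept, A, and B in L, and simulates a response with constant error variance and an unidentified active pair (C, AC). Writing $s^2(d+)$ and $s^2(d-)$ for the sample variances of the residuals at the +1 and -1 levels of column d, orthogonality of the columns gives, when the intercept and column d are both in L, $s^2(d\pm) = \frac{n}{n-2}\left[\sum_f (\hat{\beta}_j^{(f)} \pm \hat{\beta}_{j'}^{(f)})^2 + \sum_q \hat{\beta}_q^2\right]$, so the difference is $\frac{4n}{n-2}\sum_f \hat{\beta}_j^{(f)}\hat{\beta}_{j'}^{(f)}$. That identity is derived here for the sketch; the paper's own equations may be stated differently.

```python
import random
from itertools import combinations, product
from statistics import variance

FACTORS = "ABCD"
runs = list(product((-1, 1), repeat=len(FACTORS)))   # 16 design points
n = len(runs)
labels = ["I"] + ["".join(c) for r in range(1, len(FACTORS) + 1)
                  for c in combinations(FACTORS, r)]

def column(label):
    """+/-1 column for an effect label such as 'AC'; 'I' is the intercept."""
    col = []
    for run in runs:
        value = 1
        for factor in label:
            if factor != "I":
                value *= run[FACTORS.index(factor)]
        col.append(value)
    return col

def partner(label, d):
    """Label j' such that column j times column j' equals column d."""
    mixed = (set(label) - {"I"}) ^ set(d)
    return "".join(f for f in FACTORS if f in mixed) or "I"

X = {label: column(label) for label in labels}

d = "A"                  # suspected dispersion column (hypothetical choice)
L = ["I", "A", "B"]      # identified location columns (hypothetical choice)
U = [label for label in labels if label not in L]

# Pairs (j, j') in U whose product is column d, and singles whose partner is in L.
pairs = [(j, partner(j, d)) for j in U
         if partner(j, d) in U and j < partner(j, d)]
singles = [j for j in U if partner(j, d) not in U]

# Simulated data: constant variance, identified effect B, unidentified pair C and AC.
random.seed(0)
y = [10 + 2 * X["B"][i] + 1.5 * X["C"][i] + 1.5 * X["AC"][i] + random.gauss(0, 1)
     for i in range(n)]

beta = {j: sum(x * yi for x, yi in zip(X[j], y)) / n for j in labels}
resid = [y[i] - sum(beta[j] * X[j][i] for j in L) for i in range(n)]

s2_plus = variance([e for e, x in zip(resid, X[d]) if x == 1])
s2_minus = variance([e for e, x in zip(resid, X[d]) if x == -1])

scale = n / (n - 2)
pair_plus = sum((beta[j] + beta[k]) ** 2 for j, k in pairs)
pair_minus = sum((beta[j] - beta[k]) ** 2 for j, k in pairs)
single_part = sum(beta[q] ** 2 for q in singles)
cross = sum(beta[j] * beta[k] for j, k in pairs)

print("g =", len(pairs), "t =", len(singles), "u =", len(U))
print("s2(d+) =", s2_plus, "s2(d-) =", s2_minus)
```

In this setup the only single is AB (its partner B is in L), and it contributes equally to both variances. The pair (C, AC) contributes through the sum at the +1 level and through the difference at the -1 level, which is how two location effects of similar size can mimic a dispersion effect in column A even though the simulated error variance is constant.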

Box and Meyer (1986) and Montgomery (1990), among others, have studied the sample variances defined by Equations (1) and (2) in order to detect dispersion effects. Others, such as Bergman and Hynén (1997) and McGrath and Lin (2002), adapted L by also including the q columns, so that the residuals from the +1 level of the suspected dispersion column are uncorrelated with those at the -1 level. With this adaptation, under a normality assumption and the null hypothesis of equal variances, $s^2(d+)$ and $s^2(d-)$ are independent, and Bergman and Hynén's statistic $D^{BH}$ has an F distribution. We refer to a location model that is used for this test as a BH model. When a BH model is fitted, there are g pairs of location effects that are not included in the location model used for the dispersion effect test ($u = 2g$ and $t = 0$). If multiple dispersion effects are suspected, then the recently developed test by McGrath and Lin (2001b) may be applied instead of, or in conjunction with, $D^{BH}$ to test for dispersion. Our goal is to develop methodologies that remove the confounding between a pair of location effects and a dispersion effect. In the remainder of this paper, we assume that "obvious" location effects have been identified (by normal probability plotting or Lenth's method, for example) and that a single (apparent) dispersion effect is detected in column d. Additionally, we assume that the apparent dispersion effect is either a real dispersion effect or a spurious dispersion effect caused by two unidentified location effects. As we show later, the procedure we develop is robust to this assumption.
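A BH model and the associated ratio statistic can be sketched as follows. This is an illustration under assumptions rather than the procedure as published: the design is a hypothetical saturated 16-run layout in factors A to D, the helper names `bh_model` and `d_bh` are invented here, the statistic is taken to be the ratio of residual sums of squares at the +1 and -1 levels of column d, and the reference distribution is assumed to be F with (g, g) degrees of freedom, which follows from the residual structure when u = 2g. No p-value is computed, so only the standard library is needed.

```python
import random
from itertools import combinations, product

FACTORS = "ABCD"
runs = list(product((-1, 1), repeat=len(FACTORS)))   # 16 design points
n = len(runs)
labels = ["I"] + ["".join(c) for r in range(1, len(FACTORS) + 1)
                  for c in combinations(FACTORS, r)]

def column(label):
    """+/-1 column for an effect label such as 'AB'; 'I' is the intercept."""
    col = []
    for run in runs:
        value = 1
        for factor in label:
            if factor != "I":
                value *= run[FACTORS.index(factor)]
        col.append(value)
    return col

def partner(label, d):
    """Label j' such that column j times column j' equals column d."""
    mixed = (set(label) - {"I"}) ^ set(d)
    return "".join(f for f in FACTORS if f in mixed) or "I"

def bh_model(location, d):
    """Add to the location model the partner (with respect to d) of each of its columns."""
    expanded = set(location)
    for label in location:
        expanded.add(partner(label, d))
    return [label for label in labels if label in expanded]

def d_bh(y, location, d):
    """Ratio of residual sums of squares at the +1 and -1 levels of column d."""
    X = {label: column(label) for label in labels}
    model = bh_model(location, d)
    beta = {j: sum(x * yi for x, yi in zip(X[j], y)) / n for j in model}
    resid = [y[i] - sum(beta[j] * X[j][i] for j in model) for i in range(n)]
    ss_plus = sum(e * e for e, x in zip(resid, X[d]) if x == 1)
    ss_minus = sum(e * e for e, x in zip(resid, X[d]) if x == -1)
    g = (n - len(model)) // 2          # pairs left out of the BH model
    return ss_plus / ss_minus, g, model

# Simulated data with a real dispersion effect in A: error sd 2 at A = +1, 0.5 at A = -1.
random.seed(0)
A, B = column("A"), column("B")
y = [10 + 2 * B[i] + random.gauss(0, 2.0 if A[i] == 1 else 0.5) for i in range(n)]

statistic, g, model = d_bh(y, ["I", "B"], "A")
print("BH model columns:", model)
print("D_BH =", statistic, "with assumed reference distribution F(%d, %d)" % (g, g))
```

Starting from the identified columns I and B with d = A, the expanded model contains I, A, B, and AB, so no single columns remain among the unidentified ones and the twelve leftover columns form g = 6 pairs.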

The rest of this paper is organized as follows. We first discuss an example from the literature where there is the location-dispersion confounding described above. In the next section, the structure of the columns of interest, i.e., those of the identified location effects, the suspected dispersion effect, and the pair of suspected location effects, is analyzed. This structure determines the minimum number of design points that must be replicated to remove the location-dispersion confounding. We then apply the procedure to the example and show how four additional runs could be used to remove the location-dispersion confounding. As replication is not always feasible, we then develop a formal test that is performed using the original unreplicated data. Since the test is designed to detect two location effects conditioned on detecting what appears to be a dispersion effect, we also assess its performance under more general conditions. Finally, in the last section we summarize the proposed procedures and briefly discuss the difficulties involved in developing alternative (e.g., likelihood and Bayesian) approaches.

