Confidence Interval Coverage for Designed Experiments Analyzed With GLMs - ASQ

# Confidence Interval Coverage for Designed Experiments Analyzed With GLMs

In many industrial experiments the response variable is nonnormal. Traditionally, a variance-stabilizing transformation is applied to such a response in order to obtain the properties needed to use ordinary least squares and analysis of variance. Generalized linear models (GLMs) offer a powerful alternative to data transformation. Specifically, the performance of a GLM in response estimation and prediction is often superior to that of a model built using data transformations. The confidence interval constructed around the estimate of the mean for each experimental run provides the experimenter with critical information about model quality. In generalized linear models, confidence intervals are based on asymptotic theory. As such, they are regarded as statistically valid only for large samples. Therefore, in order to use confidence intervals to compare models, it is essential to evaluate these asymptotic intervals in terms of coverage for sample sizes typically encountered in designed industrial experiments. This paper uses Monte Carlo methods to investigate the coverage of confidence intervals for the GLM for factorial experiments with 8, 16, and 32 runs.

Key Words: Design of Experiments, Generalized Linear Models, Nonconstant Variance, Nonnormality.

By Sharon L. Lewis and Douglas C. Montgomery, Arizona State University, Tempe, AZ 85287 and Raymond H. Myers, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061

## Introduction

Designed experiments are used to build a model that describes the relationship of a response variable of interest to one or more independent (or design) variables. Such a model is often used to predict the response, thereby allowing the experimenter to resolve important issues about the system under study. Consequently, the performance of the model in response estimation and prediction is often of foremost importance. It is the confidence interval, constructed around the estimate of the mean for each experimental run, and its associated length or precision that provides the experimenter with critical information about the uncertainty associated with estimating the response variable at points of interest in the experimental region.

Ordinary linear least squares is the most widely used technique for building models based on data from designed experiments. The method of least squares produces best linear (i.e., minimum variance) unbiased estimators of the regression coefficients β_j (Montgomery and Peck (1992), Myers (1990)). The optimality properties of ordinary least squares depend on the assumption of constant variance. The widespread growth in the application of designed experiments has led to many situations where the response is nonnormal, such as counts of defects, proportions defective, or times to failure. These responses may follow Poisson, binomial, and gamma distributions, respectively. In each case, the variance is not a constant, but is rather a function of the mean. If the fitted linear model is correct, the least squares estimators will still be unbiased, but will no longer possess the minimum variance property.
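For reference, the standard mean-variance relationships for these three distributions can be written as follows. The notation is an assumption introduced here for illustration and does not come from the paper: μ denotes the mean response, m the number of trials behind each observed proportion, and ν the gamma shape parameter.

$$
\begin{aligned}
\text{Poisson (counts):}\quad & \operatorname{Var}(y) = \mu \\
\text{Binomial (proportion of } m \text{ trials):}\quad & \operatorname{Var}(y) = \frac{\mu(1-\mu)}{m} \\
\text{Gamma (shape } \nu\text{):}\quad & \operatorname{Var}(y) = \frac{\mu^{2}}{\nu}
\end{aligned}
$$

In each case the variance changes whenever a factor effect changes the mean, which is why the constant-variance assumption fails across the runs of the design.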

One option for these nonnormal response situations is to simply ignore the problem and use normal theory ordinary least squares with the assumptions violated. Many researchers and practitioners assume that factorials and fractional factorials are so robust that such an analysis will still be useful. However, the more traditional approach is to apply a variance-stabilizing transformation to the response variable to bend the data into shape. This allows application of classical least squares to the transformed data. A third approach is the use of the generalized linear model (GLM). With the GLM, normality and constant variance are no longer required.
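As a small illustration of what a variance-stabilizing transformation does, the sketch below simulates Poisson counts at several means and compares the variance of the raw counts with the variance of their square roots. The square root is the usual stabilizing transformation for Poisson data, and the delta method gives a variance of roughly 1/4 for moderately large means. The function name, the chosen means, and the number of draws are assumptions made for this example only, and NumPy is assumed to be available.

```python
import numpy as np


def sqrt_transform_variances(means, n_draws=200_000, seed=1):
    """Return {mean: (Var(y), Var(sqrt(y)))} for simulated Poisson counts."""
    rng = np.random.default_rng(seed)
    out = {}
    for mu in means:
        y = rng.poisson(mu, size=n_draws)
        # Raw variance grows with the mean; the sqrt scale is roughly constant.
        out[mu] = (y.var(), np.sqrt(y).var())
    return out


results = sqrt_transform_variances([25, 100, 400])
for mu, (raw_var, sqrt_var) in results.items():
    print(f"mean={mu:>4}  Var(y)={raw_var:8.2f}  Var(sqrt(y))={sqrt_var:.3f}")
```

The transformation stabilizes the variance only approximately, and it changes the scale on which the model is fitted. The GLM avoids both issues by modeling the mean-variance relationship directly.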

Myers and Montgomery (1997) give a tutorial on GLMs. In their tutorial, they show several examples comparing models built with ordinary least squares and data transformations and those built with a GLM approach. In each case, they find that a better model is possible with the GLM, where “better” is measured in terms of the model performance in response estimation and prediction. Myers and Montgomery use the length of the confidence interval on the mean response for comparing models (see also Lewis, Montgomery, and Myers (2001)).

In generalized linear models, confidence intervals are typically based on (asymptotic) Wald inference; therefore, these intervals are strictly justified only for a large sample size n. (Exact confidence intervals are available for logistic regression, a special case of the GLM, but we do not consider these intervals in this paper.) Designed experiments usually involve small n. Therefore, in order to use confidence intervals in comparing model performance, it is essential to evaluate these asymptotic intervals in terms of coverage and precision for small-sample applications. The purpose of this paper is to evaluate the properties of two different types of Wald-inference-based confidence intervals for the GLM for small sample sizes such as are typically encountered in industrial experiments. This is accomplished with a Monte Carlo simulation study.
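The general idea of such a coverage study can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the simulation design used in this paper: it assumes a Poisson response with a log link, an unreplicated 8-run 2^3 factorial with a main-effects model, hypothetical true coefficients, and a single kind of Wald interval (computed on the linear-predictor scale and back-transformed), which is not necessarily either of the two interval types studied here. The model is fitted with a hand-written iteratively reweighted least squares routine, so only NumPy is required.

```python
from itertools import product

import numpy as np


def fit_poisson_glm(X, y, n_iter=25, tol=1e-8):
    """Fit a Poisson GLM with log link by IRLS; return (beta, cov_beta)."""
    # Start from a least squares fit to the log of the shifted counts.
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)   # guard against overflow in exp
        mu = np.exp(eta)
        z = eta + (y - mu) / mu            # working response
        XtW = X.T * mu                     # IRLS weights are mu for the log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    mu = np.exp(np.clip(X @ beta, -30, 30))
    cov = np.linalg.inv((X.T * mu) @ X)    # inverse Fisher information
    return beta, cov


def wald_coverage(X, beta_true, n_sim=2000, z_crit=1.959964, seed=0):
    """Estimate coverage of nominal 95% Wald intervals on each run's mean."""
    rng = np.random.default_rng(seed)
    mu_true = np.exp(X @ beta_true)
    hits = np.zeros(len(mu_true))
    for _ in range(n_sim):
        y = rng.poisson(mu_true)
        beta_hat, cov = fit_poisson_glm(X, y)
        eta_hat = X @ beta_hat
        # Standard error of each linear predictor: sqrt(x_i' cov x_i).
        se = np.sqrt(np.einsum("ij,jk,ik->i", X, cov, X))
        lower = np.exp(eta_hat - z_crit * se)
        upper = np.exp(eta_hat + z_crit * se)
        hits += (lower <= mu_true) & (mu_true <= upper)
    return hits / n_sim


# 2^3 full factorial in coded (-1, +1) units with an intercept column.
levels = np.array(list(product([-1.0, 1.0], repeat=3)))
X = np.column_stack([np.ones(len(levels)), levels])
beta_true = np.array([2.5, 0.3, -0.2, 0.1])  # hypothetical coefficients

coverage = wald_coverage(X, beta_true)
print("Estimated coverage per run:", np.round(coverage, 3))
```

Estimated coverage close to the nominal 0.95 at every run would suggest the asymptotic interval is behaving acceptably for that configuration; the result depends on the assumed means, the model, and the number of runs, so a single configuration should not be read as a general conclusion.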