In this paper we investigate the use of the joint estimation (JE) outlier detection method as a statistical process control method for short-run autocorrelated data. Because JE is able to differentiate between four different outlier (out-of-control observation) types, performance is reported with respect to its ability to locate the out-of-control observation and identify the associated type. This is of particular interest to practitioners because the four different types may indicate different problems in the process. The results show that, when the out-of-control observation is the last observation, JE performs better for AR(1) models than for MA(1) models. However, JE is better able to distinguish between the four outlier types for MA(1) models than for AR(1) models. An example using real data is also provided.

Key Words: Autocorrelated Observations, Statistical Process Control

*By* **C.M. Wright, Western Carolina University, Cullowhee, NC 28723** and **D.E. Booth** and **M.Y. Hu, Kent State University, Kent, OH 44242**

**Introduction**

One basic assumption of traditional statistical process control (SPC) methods is the existence of sufficient historical data for proper analysis. At least twenty-five or thirty samples of size four or five each are assumed to be necessary for the purposes of establishing control limits for process monitoring (Montgomery (1996) and Quesenberry (1991a)). This is a concern for the practitioner, because in SPC applications the process often does not yield enough observations for effective use of traditional SPC methods. Such short-run data can result from situations that do not allow for frequent process measurement, such as short production runs, process start-ups, new equipment, major tool changes, different raw materials, and different production processes.
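To make the data requirement concrete, the following is a minimal sketch of how trial control limits for a mean chart are commonly computed from m subgroups of size n. The function name `xbar_limits` is hypothetical, and the constants are the standard tabulated A2 values for subgroup sizes four and five; the sketch illustrates the general textbook calculation, not a procedure taken from this paper.

```python
# Sketch: trial limits for a Shewhart mean (X-bar) chart from subgroup data.
# A2 holds the standard tabulated constants for subgroup sizes 4 and 5.
A2 = {4: 0.729, 5: 0.577}


def xbar_limits(subgroups):
    """Return (LCL, center line, UCL) using the mean range of the subgroups."""
    n = len(subgroups[0])
    if n not in A2 or any(len(s) != n for s in subgroups):
        raise ValueError("expected equal-sized subgroups of size 4 or 5")
    means = [sum(s) / n for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    grand_mean = sum(means) / len(means)
    mean_range = sum(ranges) / len(ranges)
    half_width = A2[n] * mean_range
    return grand_mean - half_width, grand_mean, grand_mean + half_width


# Usage: 25 subgroups of size 5 (illustrative values, not real process data).
example = [[9.0, 10.0, 11.0, 10.0, 10.0] for _ in range(25)]
lcl, center, ucl = xbar_limits(example)
```

With only a handful of subgroups, the grand mean and mean range are themselves poorly estimated, which is why short runs make limits of this kind unreliable.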

Traditional SPC methods in Phase 2 are often based on the assumption that the process data are both independent and identically distributed (i.i.d.). This assumption is violated in many SPC data situations. This is an important consideration for practitioners, since autocorrelated data are prevalent in the chemical and continuous process industries as well as in computer integrated manufacturing environments (Alwan and Radson (1992), Baxley (1990), Berthouex et al. (1978), Ermer et al. (1979), Harris and Ross (1991), Hunter (1990), and Montgomery and Friedman (1989)). Thus, other alternatives must be investigated for detection of out-of-control observations.
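The effect of violating the independence assumption can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not an analysis from this paper: it generates an in-control AR(1) series with standard normal innovations, sets three-sigma limits on an individuals chart from the average moving range (divided by the standard constant d2 = 1.128), and reports the fraction of points flagged. The function names, the seed, and the choice of an autoregressive parameter of 0.8 are all hypothetical.

```python
import random


def simulate_ar1(phi, n, seed, burn_in=200):
    """Simulate x_t = phi * x_{t-1} + a_t with standard normal a_t."""
    rng = random.Random(seed)
    x, series = 0.0, []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, 1.0)
        if t >= burn_in:  # discard start-up values
            series.append(x)
    return series


def individuals_chart_alarm_rate(series):
    """Fraction of points outside 3-sigma limits based on the moving range."""
    center = sum(series) / len(series)
    moving_ranges = [abs(b - a) for a, b in zip(series, series[1:])]
    sigma_hat = (sum(moving_ranges) / len(moving_ranges)) / 1.128  # d2 for n=2
    limit = 3.0 * sigma_hat
    alarms = sum(1 for x in series if abs(x - center) > limit)
    return alarms / len(series)


# Both series are in control; only the autocorrelation differs.
rate_iid = individuals_chart_alarm_rate(simulate_ar1(0.0, 5000, seed=1))
rate_ar1 = individuals_chart_alarm_rate(simulate_ar1(0.8, 5000, seed=1))
print(f"alarm rate, independent data: {rate_iid:.4f}")
print(f"alarm rate, AR(1) data:       {rate_ar1:.4f}")
```

Positive autocorrelation shrinks successive differences, so the moving range understates the process spread and the limits become too narrow, producing many more false alarms.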

Research has been conducted regarding the use of SPC for short-run data. Hillier (1969) investigates mean and range charts for a small number of subgroups. Quesenberry (1991a) proposes mean and variance charts for short-run and long-run data where the mean and variance of the process may or may not be known. Quesenberry (1991b) investigates both short-run and long-run process data for percentage defective charts where the random variable is binomial. Quesenberry's focus in both articles is the use of Q charts (i.e., standardized normal charts) which permit the user to plot different statistics on the same chart. Wasserman and Sudjianto (1993) critique Quesenberry's analysis by suggesting that Q-charts do not perform well if there are special causes present when the process begins. They suggest a dynamic linear model where the model and control chart parameters are updated using a Bayesian estimation framework. Wasserman (1994) considers a generalization of the exponentially weighted moving average (EWMA) chart where the smoothing constant has an adaptive weighting factor. Del Castillo and Montgomery (1996) recommend an optimal economic design of the mean chart for short and long-run processes. Pyzdek (1993) discusses various methods to deal with short-run data.

Research has also been conducted regarding the violation of the i.i.d. assumption for SPC. Faltin et al. (1997) review this literature in detail. Several model-based approaches exist for non-i.i.d. data. One approach is to modify the control limits when the data are autocorrelated. Various tactics for accomplishing this are discussed by Alwan (1992), Vasilopoulos and Stamboulis (1978), Yashchin (1993), and Wardell, Moskowitz, and Plante (1994). Another approach is to plot the residuals of a time series model on traditional control charts as in Alwan and Roberts (1988), Montgomery and Friedman (1989), Notohardjono and Ermer (1986), Yourstone and Montgomery (1989), and Wardell, Moskowitz, and Plante (1992). Runger, Willemain, and Prabhu (1995) use CUSUM charts with the residuals. Alwan (1991) and Montgomery and Mastrangelo (1991) use one control chart with varying control limits. Mastrangelo and Montgomery (1995) plot the residuals using an EWMA. MacGregor (1991) cautions that some traditional SPC methods, such as charts of residuals, are highly inefficient in detecting an out-of-control observation in autocorrelated data. Moreover, out-of-control observations are often masked by the artifacts of autocorrelation. Alwan and Radson (1992) and Runger and Willemain (1996) use model-free methods to map the problem to uncorrelated data and then apply traditional control charts. Reynolds et al. (1996) consider a sampling scheme for the mean chart where the interval between samples is variable rather than fixed.

However, very little research has been conducted with regard to the violation of both the i.i.d. and long-run data assumptions simultaneously. Del Castillo (1996) considers a multivariate short-run case for autocorrelated data with shifts and trends for semiconductor manufacturing. This is a special case for manufacturing systems with automatic proportional-integral-derivative-type controllers. All other known research focuses on the univariate case. Crowder (1992) addresses a cost strategy to decide when to overhaul or adjust a piece of equipment for short production runs using an IMA(1,1) model. He associates a quadratic loss with any deviation from target and assumes that a fixed cost is associated with any process adjustment. Booth and Isenhour (1985) consider a procedure, based on Denby and Martin (1979), for AR(1) processes for detecting two types of out-of-control data, which are defined in the outlier detection literature as additive outliers (AO) and innovational outliers (IO). Prasad, Booth, Hu, and Deligonul (1995) use the joint estimation method of Chen and Liu (1993) to detect nuclear materials losses in autoregressive moving average (ARMA) time series of short length. However, they investigate only six short-run data sets. Likewise, Prasad, Booth, and Hu (1995) show how joint estimation can be used to monitor processes through the use of one real data set with 45 observations. Wright et al. (1999) investigate the use of joint estimation to detect the outlier when it is the last observation in short-run autocorrelated data. They find that joint estimation is more effective than the EWMA chart at detecting the outlier in the period when it first occurs, for both real and simulated autocorrelated short-run data.

In this paper we investigate the use of the joint estimation (JE) outlier detection method of Chen and Liu (1993a, 1993b) as an SPC technique for short-run autocorrelated data. This paper is significantly different from those of Chen and Liu (1993a, 1993b); in both of their papers, Chen and Liu are concerned only with sample sizes of

The JE method identifies four different outlier types. Each outlier type may indicate a different problem in the process. Specifically, the ability of JE to detect the outlier (hereafter used interchangeably with out-of-control observation) when it occurs is investigated for different lengths of short-run autocorrelated data. The next objective is to determine how well JE is able to identify both the location and type of out-of-control observations. The false alarm rate is also considered.
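The four outlier types are commonly denoted additive outlier (AO), innovational outlier (IO), temporary change (TC), and level shift (LS). The sketch below shows the deterministic effect that each type adds to an AR(1) series under the usual formulation in the outlier detection literature; it is an illustration, not the JE procedure itself. The function name `outlier_effect`, the symbol names `omega` (effect size), `phi` (AR(1) parameter), and `delta` (TC decay rate), and the default `delta = 0.7` are assumptions introduced here. The values 0.2 and 5 mirror the parameter value and effect size given in the Figure 1 caption.

```python
def outlier_effect(kind, omega, n, T, phi, delta=0.7):
    """Effect added to an AR(1) series of length n by one outlier at index T.

    AO: a single spike at T.
    IO: a shock to the innovation, propagated with weights phi**j.
    TC: a jump at T that decays geometrically at rate delta (assumed 0.7).
    LS: a permanent step from T onward.
    """
    effect = [0.0] * n
    for t in range(T, n):
        j = t - T  # periods elapsed since the outlier occurred
        if kind == "AO":
            effect[t] = omega if j == 0 else 0.0
        elif kind == "IO":
            effect[t] = omega * phi ** j
        elif kind == "TC":
            effect[t] = omega * delta ** j
        elif kind == "LS":
            effect[t] = omega
        else:
            raise ValueError(f"unknown outlier type: {kind}")
    return effect


# Usage: effect size 5 at index 2 of a 6-point series, AR(1) parameter 0.2.
for kind in ("AO", "IO", "TC", "LS"):
    print(kind, [round(e, 3) for e in outlier_effect(kind, 5.0, 6, 2, 0.2)])
```

All four effects are identical in the period in which the outlier occurs and differ only afterward, which suggests why the type is harder to identify when the out-of-control observation is the last one in the series.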

In the next section we discuss the four outlier types and the joint estimation outlier detection method. Because it is necessary to identify the autoregressive integrated moving average (ARIMA) model type prior to the use of JE, a review of the literature related to model identification for short-run data is necessary. Then we investigate the use of JE to identify the out-of-control location and its type through a simulation study. Finally, we provide an example using real short-run autocorrelated data and provide conclusions.

FIGURE 1. AO, TC, LS, and IO for an AR(1) Model (φ = .2). Effect size is 5.
