Developing Control Charts and Illustrating Type I and Type II Errors

October 2000
Volume 7 • Number 4


Focus On The Classroom

This article describes an entertaining class exercise that illustrates the development of control charts, the impact of various levels of significance, and the two types of errors that can be made when using control charts. Actual sample data drawn from a sampling bowl in an in-class simulation are used to construct control charts for proportion defective. Subsequent draws from the sampling bowl then illustrate how the data can result in either a correct decision that the process remains in control or a Type I error. Next, the proportion defective in the bowl is increased to simulate a process mean shift, and new draws illustrate how the data can result in either a correct decision that the process is out of control or a Type II error. The exercise also generates valuable insights about the trade-offs between significance levels and Type I and Type II errors.

Key words: confidence intervals, sampling, statistical process control

by Elisabeth J. Umble, Texas A&M University and M. Michael Umble, Baylor University


Do you eagerly anticipate teaching students or explaining to practitioners how to develop and use control charts? If not, you might consider adopting the approach we use to make this a stimulating exercise that helps people understand and internalize key statistical process control concepts. Our approach is based on the axiom that the best way to generate interest and internalize learning is to make sure the target audience actively participates in the exercise. In this paper, we outline the basic process we have successfully used in numerous classes to introduce control chart concepts. The process is illustrated with data and analyses drawn from an actual class exercise.

The only required equipment is a sampling bowl containing a sufficiently large number of small beads of two different colors and a sampling device that can quickly select and display individual samples. We use a sampling bowl consisting of 3000 beads–mostly white with a small number of red–and a sampling paddle with 100 small pockets that selects 100 beads per sample. The red beads denote “defective” items; the white beads represent “good” items. Such sampling bowls are available from many sources, such as Lightning Calculator (248-641-7030).


In our exercise, inspired by W. Edwards Deming’s well-known red bead experiment (Deming 1982), the class is first asked to visualize a process that produces outcomes that can be classified as either acceptable or unacceptable. Appropriate examples include picking a customer order, repairing an automobile engine, or producing a computer chip. The class is then asked whether managers should be interested in the likelihood of the process generating a successful outcome. After receiving agreement, we introduce our sampling bowl, making the connection between selecting red or white beads from the bowl and the visualized process producing defective or good outcomes.

In the first phase of the exercise, the sampling bowl has a population consisting of 2760 white and 240 red beads–an 8 percent “defective” population. At this point, however, the class is not told the percentage of red and white beads in the population. The class is told that we want to estimate the proportion of defective items generated by the process. This is accomplished by taking repeated samples from the bowl.
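For readers without a physical sampling bowl, the draws can be imitated in software. The following is a minimal sketch, assuming that the paddle selects its 100 beads without replacement and that the beads are returned to the bowl before the next person draws. The function name `draw_sample` and the seed are illustrative choices, not part of the original exercise.

```python
import random

def draw_sample(red, white, paddle_size, rng):
    """Return the number of red (defective) beads in one paddle draw."""
    bowl = [1] * red + [0] * white          # 1 = red bead, 0 = white bead
    return sum(rng.sample(bowl, paddle_size))  # sampling without replacement

rng = random.Random(2000)                   # arbitrary seed, for repeatability only
counts = [draw_sample(240, 2760, 100, rng) for _ in range(30)]
print(counts)
```

Because the simulated counts depend on the seed, they will not match the class data reported in this article.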

To get the class involved in the exercise, we divide the class into two teams and announce that the teams will compete to see which team can produce the lowest proportion of defective items. Alternating turns, each team sends a team member to select a sample of 100 beads using the sampling paddle until every person has drawn a sample. The number of red beads in each sample represents the number of process defects generated by that team member. An unbiased “inspector” verifies the number of red beads selected by each participant and announces the number of defectives. The instructor keeps a running tally of the number of defectives generated by each team. If there is a small number of members for each team, individuals may be asked to sample more than once in order to generate a sample large enough to be used to construct the control chart. To help generate enthusiasm, we promise a reward (such as a piece of candy) to each member of the team that generates the lowest percentage of defectives. It is amazing the amount of cheering, jeering, and taunting that adults are capable of–all for a small piece of candy. However, to avoid a letdown, after the “winning” team has selected its candy treat, each member of the losing team also receives a consolation sweet.

Once the sampling is completed, ask the class to derive its best point estimate of the population proportion defective, $p$. After a few minutes, introduce the formula for the estimate, $\bar{p}$,

$$\bar{p} = \frac{\sum_{i=1}^{m} D_i}{mn} = \frac{\sum_{i=1}^{m} \hat{p}_i}{m},$$

where $\hat{p}_i = D_i / n$ denotes the fraction defective in sample $i$, $D_i$ represents the number of defective units in sample $i$, $m$ denotes the number of samples, and $n$ is the number of observations per sample. As a general rule, $m$ should be at least 20 (Montgomery 1991, 149).

In our class exercise, two teams of 15 screaming adults drew 30 samples of size 100 that resulted in the following numbers of defectives: 8, 10, 9, 4, 10, 8, 5, 3, 13, 6, 11, 10, 7, 6, 8, 4, 8, 7, 12, 11, 7, 11, 11, 7, 10, 8, 9, 6, 11, 6. Using the formula, the class calculated the sample estimate $\bar{p} = 246 / (30 \times 100) = .082$. When asked if they knew for sure what the true proportion of defectives was, students admitted they did not, but they all agreed that they had a pretty fair idea.
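The class calculation can be checked with a few lines of code. This sketch uses the 30 defective counts reported above; the variable names are illustrative.

```python
# Defective counts from the 30 in-class samples reported in the article.
defectives = [8, 10, 9, 4, 10, 8, 5, 3, 13, 6,
              11, 10, 7, 6, 8, 4, 8, 7, 12, 11,
              7, 11, 11, 7, 10, 8, 9, 6, 11, 6]
n = 100                  # beads per sample
m = len(defectives)      # number of samples

# Pooled estimate: total defectives over total beads inspected.
p_bar = sum(defectives) / (m * n)

# Equivalent form: the average of the per-sample fractions defective.
p_hats = [d / n for d in defectives]
p_bar_alt = sum(p_hats) / m

print(m, sum(defectives), round(p_bar, 3))   # 30 246 0.082
```

The two forms agree because every sample has the same size.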


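We review the concept that an interval estimate of a population proportion, $p$, has the form

$$\hat{p} \pm z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.$$

The resulting interval,

$$\left( \hat{p} - z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\; \hat{p} + z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \right),$$

is a 100(1 - $\alpha$)% confidence interval for the unknown population proportion and (1 - $\alpha$) is the confidence coefficient. The interval has probability (1 - $\alpha$) of including the true population proportion.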
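As a rough illustration of the interval estimate, the sketch below computes a large-sample (normal approximation) confidence interval for a proportion. Using a single sample of 100 beads with 8 red beads is an assumption made only for this example, as is the function name `proportion_ci`.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(defectives, n, confidence=0.95):
    """Large-sample (normal approximation) interval for a proportion."""
    p_hat = defectives / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # z value for alpha/2
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Illustrative input: 8 red beads in one sample of 100.
lower, upper = proportion_ci(8, 100)
print(f"95% interval: ({lower:.4f}, {upper:.4f})")
```

Raising the confidence coefficient increases the z value and therefore widens the interval.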

We then extend the notion of a confidence interval for $p$ to control charts. A control chart consists of three lines representing the estimated process mean, an upper control limit (U or UCL), and a lower control limit (L or LCL). The three lines are computed as follows:

Process mean estimate = $\bar{p}$,

$$\mathrm{UCL} = \bar{p} + z \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}, \qquad \mathrm{LCL} = \bar{p} - z \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}.$$

The parameter $z$ represents the number of standard deviation units that the limits fall from the process mean estimate, $\bar{p}$. For three-sigma limits (that is, $z = 3$), the interval

$$\bar{p} \pm 3 \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$

should contain almost all (99.74%) of the sample proportions, $\hat{p}_i$, in repeated sampling from the original population. For two-sigma control limits, approximately 95.44% of the sample proportions in repeated sampling should be in the interval

$$\bar{p} \pm 2 \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}.$$

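Continuing our example where $\bar{p} = .082$, we ask the class to construct the upper and lower control limits for the proportion defective using both three-sigma and two-sigma limits. For the three-sigma control chart,

$$\mathrm{UCL} = .082 + 3 \sqrt{\frac{(.082)(.918)}{100}} = .1643, \qquad \mathrm{LCL} = .082 - 3 \sqrt{\frac{(.082)(.918)}{100}} = -.0003.$$

(Since proportion defective cannot be negative, the LCL actually used is 0.)

For the two-sigma control chart,

$$\mathrm{UCL} = .082 + 2 \sqrt{\frac{(.082)(.918)}{100}} = .1369, \qquad \mathrm{LCL} = .082 - 2 \sqrt{\frac{(.082)(.918)}{100}} = .0271.$$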

The three- and two-sigma control charts appear in Figure 1 along with the sample proportions from the 30 samples.
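The control limits for both charts can be reproduced with a short helper. This is a minimal sketch: the function name `p_chart_limits` is illustrative, and clipping the limits to the range 0 to 1 follows the remark that a negative lower limit is replaced by 0.

```python
from math import sqrt

def p_chart_limits(p_bar, n, z):
    """Return (LCL, UCL) for a proportion-defective chart with z-sigma limits."""
    half_width = z * sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - half_width)   # a proportion cannot be negative
    ucl = min(1.0, p_bar + half_width)   # nor can it exceed 1
    return lcl, ucl

lcl3, ucl3 = p_chart_limits(0.082, 100, 3)
lcl2, ucl2 = p_chart_limits(0.082, 100, 2)
print(f"three-sigma: LCL = {lcl3:.4f}, UCL = {ucl3:.4f}")   # LCL = 0.0000, UCL = 0.1643
print(f"two-sigma:   LCL = {lcl2:.4f}, UCL = {ucl2:.4f}")   # LCL = 0.0271, UCL = 0.1369
```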

While three-sigma limits are widely used in practice, the choice of the z value used should be dictated by economic considerations. For example, if the losses associated with allowing the process to operate in an out-of-control state are large relative to the cost of investigating and possibly correcting problems, then a smaller z value, such as 2 or 2.5, may be more appropriate.

Historically, some statisticians (Montgomery 1991, 110) have also suggested using two sets of limits on control charts. The three-sigma limits are the usual limits used to indicate when a process is out of control. An additional set of two-sigma limits may be used as warning limits. Two of three consecutive points falling outside the warning limits might arouse suspicion that the process may be out of control. This might cause a response, such as selecting additional samples, to attempt to determine whether or not the process is in control.


Control charts are used to monitor processes by selecting samples over time and sequentially plotting the sample statistics. Control limits are normally chosen so that if the process is in control, nearly all of the sample points will fall between the limits. As illustrated by the generic control chart in Figure 2, if a single sample proportion falls between the upper and lower control limits, it is concluded that, on the basis of the sample, the process is in control. That is, there is insufficient statistical evidence to indicate that the process mean has shifted and that the process is producing either more or fewer defectives than is typical. However, if a sample proportion falls above the upper control limit or below the lower control limit, the process is judged to be out of control. In reality, if a sample proportion falls outside the interval, we have either observed an unlikely event or process quality has changed, and $\bar{p}$ is no longer an accurate measure of the current process mean.

The power of control charts is that they provide a graphical representation of how the process is behaving over time. Even when all points fall inside the control limits, control charts can help identify systematic, nonrandom patterns that indicate that the process may be out of control. Thus, supplementary criteria are typically used to increase the sensitivity of control charts to small process shifts (Montgomery 1991, 117). For example, patterns such as two of three consecutive points outside the two-sigma limits, four of five consecutive points more than one standard deviation from the center line, or seven consecutive points above or below the center line may prompt further investigation.
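Two of the supplementary criteria mentioned above can be sketched in code. This is an illustration under stated assumptions rather than a definitive rule set: the function names are hypothetical, and the two-of-three rule is interpreted here as requiring both points to lie on the same side of the center line, which the text does not specify. The data are the 30 class samples used to build the charts.

```python
from math import sqrt

def two_of_three_beyond(points, center, sigma, k=2.0):
    """0-based indices where at least two of the last three points lie
    more than k sigma from the center line on the same side (assumption)."""
    signals = []
    for i in range(2, len(points)):
        window = points[i - 2:i + 1]
        high = sum(p > center + k * sigma for p in window)
        low = sum(p < center - k * sigma for p in window)
        if high >= 2 or low >= 2:
            signals.append(i)
    return signals

def run_on_one_side(points, center, length=7):
    """0-based indices at which a run of `length` or more consecutive
    points on one side of the center line is in progress."""
    signals, run, side = [], 0, 0
    for i, p in enumerate(points):
        s = (p > center) - (p < center)   # +1 above, -1 below, 0 on the line
        if s != 0 and s == side:
            run += 1
        else:
            run = 1 if s != 0 else 0
            side = s
        if run >= length:
            signals.append(i)
    return signals

counts = [8, 10, 9, 4, 10, 8, 5, 3, 13, 6, 11, 10, 7, 6, 8,
          4, 8, 7, 12, 11, 7, 11, 11, 7, 10, 8, 9, 6, 11, 6]
n, p_bar = 100, 0.082
sigma = sqrt(p_bar * (1 - p_bar) / n)
proportions = [d / n for d in counts]

print(two_of_three_beyond(proportions, p_bar, sigma))   # []
print(run_on_one_side(proportions, p_bar))              # []
```

Neither rule fires on these data. The longest run on one side of the center line is six points (samples 13 through 18), one short of the seven needed.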

Type I and Type II Errors

As shown in Figure 3, four general cases are possible when making conclusions about the process.

  1. Case 1. The process is in control and the sample proportion falls within the control limits. The conclusion that the process is in control is correct. We refer to this as correct conclusion A.
  2. Case 2. The process is in control but the sample proportion falls outside the control limits. The conclusion that the process is out of control is incorrect–also called a Type I error.
  3. Case 3. The process is out of control but the sample proportion falls within the control limits. The conclusion that the process is in control is incorrect–also called a Type II error.
  4. Case 4. The process is out of control and the sample proportion falls outside the control limits. The conclusion that the process is out of control is correct. We call this correct conclusion B.

Next we explain that

P(Type I error) = $\alpha$;

P(correct conclusion A) = 1 - $\alpha$;

P(Type II error) = $\beta$; and

P(correct conclusion B) = 1 - $\beta$.

Moreover, $\alpha$ is the level of significance of the test, and is the complement of the confidence coefficient 1 - $\alpha$ that determines the appropriate z value. The value of $\beta$ depends on how much the process mean has shifted. When the actual value is close to the assumed process mean, it is more difficult to detect the shift. The ability of a testing procedure to correctly detect the shift is equal to 1 - $\beta$, and is referred to as the power of the test.
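The dependence of power on the size of the shift can be made concrete with a small calculation. The sketch below assumes a binomial model for the number of defectives in a sample of 100 (treating the bowl as effectively infinite) and uses the three-sigma limits computed from the estimate .082. The shifted proportions in the loop are illustrative values, and the function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(d, n, p):
    """Binomial probability of exactly d defectives in n items."""
    return comb(n, d) * p**d * (1 - p)**(n - d)

def signal_probability(p_true, n, lcl, ucl):
    """P(sample proportion falls outside [lcl, ucl]) when the true
    proportion defective is p_true (binomial model, an assumption)."""
    return sum(binom_pmf(d, n, p_true)
               for d in range(n + 1)
               if d / n < lcl or d / n > ucl)

# Normal-theory significance levels for two- and three-sigma limits.
alpha_normal = {z: 2 * (1 - NormalDist().cdf(z)) for z in (2, 3)}
print({z: round(a, 4) for z, a in alpha_normal.items()})

# Three-sigma limits based on the class estimate.
p_bar, n = 0.082, 100
half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
lcl, ucl = max(0.0, p_bar - half_width), p_bar + half_width

# With no shift the signal probability is the exact alpha;
# with a shift it is the power, 1 - beta.
results = {p: signal_probability(p, n, lcl, ucl)
           for p in (0.082, 0.10, 0.12, 0.15, 0.20)}
for p_true, prob in results.items():
    print(f"true p = {p_true:.3f}: P(signal) = {prob:.4f}")
```

The normal-theory values print as about .0455 and .0027. The .0456 and .0026 quoted in this article are presumably the same quantities read from a four-place normal table.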

The Right Sample Size

Now that the class understands the control chart format, Type I errors, Type II errors, and correct conclusions A and B, we ask class members how we would monitor the process. When someone answers, “by taking samples,” we show them two additional sampling paddles (one that selects 50 beads and one that selects 25 beads) and ask them how large the sample size should be. Can we save some time by simply taking a sample of 50 units? We allow students time to justify and internalize that the control chart limits are based on n = 100, and are therefore valid only for that sample size.
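To see why the limits depend on the paddle size, the three-sigma limits can be recomputed for each paddle. This is a sketch under the assumption that the same estimate, .082, is used for all three sample sizes; the helper name is illustrative.

```python
from math import sqrt

def p_chart_limits(p_bar, n, z=3):
    """Return (LCL, UCL) for a proportion-defective chart, with LCL floored at 0."""
    half_width = z * sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - half_width), p_bar + half_width

limits = {n: p_chart_limits(0.082, n) for n in (25, 50, 100)}
for n, (lcl, ucl) in limits.items():
    print(f"n = {n:3d}: LCL = {lcl:.4f}, UCL = {ucl:.4f}")
```

Smaller samples produce more variable sample proportions and therefore need wider limits, so a proportion from the 50-bead paddle cannot be judged against limits built for n = 100.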

Sampling With an Unchanged Process Mean

We continue the exercise by resampling and plotting the sample proportions on three-sigma and two-sigma control charts based on the previously calculated process mean and control limits. We ask one person to select a sample from the population (which still contains 8 percent defective), and plot the sample proportion on both control charts. In the class exercise reported here, the first sample yielded 11 defectives. So we ask the class for its conclusion about the process: Is it in control or out of control? Students typically answer, “based on that sample, we would conclude the process is in control.” We then refer to Figure 3 and emphasize that since the proportion defective for the population has not changed, this is correct conclusion A.

Let the class continue to draw samples until a sample yields a proportion defective that falls outside the control limits. Most likely, this sample proportion will fall outside the limits of the two-sigma chart but inside the limits of the three-sigma chart. In our exercise, the first eight samples from the population resulted in the following numbers of defectives: 11, 11, 8, 12, 6, 7, 6, 8. The ninth sample yielded 15 red beads for a proportion defective of .15. Ask the class for the appropriate conclusion. Clearly, if a two-sigma chart (where UCL = .1369) is used, the erroneous conclusion is that the process is producing more than the expected proportion of defectives and is out of control. Make sure the class understands that this is an example of a Type I error because the population proportion defective has not changed. It is still 8 percent. If the three-sigma chart (where UCL = .1643) is used, then the conclusion is still that the process is in control–the correct conclusion. Emphasize that the Type I error is more likely to occur when using two-sigma control charts ($\alpha$ = .0456) than when using three-sigma charts ($\alpha$ = .0026).

Continuing to sample to get a feel for the process, sample numbers 10 through 30 yielded the following numbers of defectives: 11, 12, 10, 7, 7, 4, 2, 6, 5, 5, 5, 6, 11, 13, 4, 9, 3, 6, 8, 8, 7. The proportion of defective items for all 30 samples is illustrated in Figure 4. On the three-sigma control chart, all observations fall within the limits. On the two-sigma control chart, note that the 9th and the 16th samples fall outside the limits. Thus, 6.67 percent of the observations fall outside the two-sigma control limits. This is not atypical since we expect approximately 4.5 percent of the observations to fall outside the limits of a two-sigma chart due simply to random variation. (Note that the 16th sample, which falls below the lower two-sigma control limit, may not cause any action since a low number of defectives is desirable. However, if management believes this decrease represents a true shift in the proportion of defectives produced, an analysis may be warranted to try to discover an underlying cause for the improvement in an effort to replicate the result.)
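The 30 monitoring samples can be checked against both sets of limits in code. This sketch uses the counts reported above and limits built from the estimate .082 with n = 100; the helper name `outside` is illustrative.

```python
from math import sqrt

p_bar, n = 0.082, 100
sigma = sqrt(p_bar * (1 - p_bar) / n)

# Samples 1-9 followed by samples 10-30, drawn with the bowl unchanged.
monitoring = [11, 11, 8, 12, 6, 7, 6, 8, 15,
              11, 12, 10, 7, 7, 4, 2, 6, 5, 5, 5, 6,
              11, 13, 4, 9, 3, 6, 8, 8, 7]

def outside(counts, z):
    """Return the 1-based sample numbers whose proportion falls outside z-sigma limits."""
    lcl = max(0.0, p_bar - z * sigma)
    ucl = p_bar + z * sigma
    return [i for i, d in enumerate(counts, start=1)
            if d / n < lcl or d / n > ucl]

two_sigma_signals = outside(monitoring, 2)
three_sigma_signals = outside(monitoring, 3)
print(two_sigma_signals, three_sigma_signals)            # [9, 16] []
print(round(100 * len(two_sigma_signals) / len(monitoring), 2))   # 6.67
```

Since the bowl was unchanged during these draws, both two-sigma signals are false alarms.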

Sampling With a Shifted Process Mean

In the next phase of the exercise, remind the class that the original proportion defective was .08. Then explain that the process is now using materials supplied by a new vendor or, alternatively, that an inexperienced temporary worker is now being used in the process. Then change the proportion of red and white beads to 300 red and 2700 white so that the true proportion defective jumps to 10 percent. Continue having individuals take samples as before. From our exercise, the numbers of defectives for the next 30 samples (samples 31 through 60) are: 9, 15, 8, 13, 9, 18, 10, 12, 7, 15, 10, 14, 8, 11, 10, 6, 7, 14, 17, 7, 14, 12, 8, 9, 12, 13, 9, 11, 10, 13. These data are illustrated in Figure 5.

We ask the class what it can conclude after sample number 31 is selected. Students all agree that the sample data leave no choice but to conclude that the process is in control. Referring back to Figure 3, the class should be able to identify this obviously incorrect conclusion as a Type II error. Continuing with the sampling, sample 32 yields 15 percent defective. This falls outside the control limits for the two-sigma chart. Using this chart leads to the correct conclusion that the process is out of control. However, .15 falls within the limits of the three-sigma chart, leading to the incorrect conclusion that the process is in control. This is another Type II error. We ask the class to explain which of the two charts has the higher probability of the Type II error.

Note that the three-sigma control chart eventually will pick up the shift in the process mean proportion defective (at sample 36 with 18 percent defective and again at sample 49 with 17 percent defective); it just takes longer for the three-sigma chart than for the two-sigma chart. We also make sure the class notes the evident shift in the plot of points on the control charts.
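The difference in detection speed between the two charts can be verified from the reported data. The sketch below scans samples 31 through 60 against limits built from the original estimate .082; the helper name `signals` is illustrative.

```python
from math import sqrt

p_bar, n = 0.082, 100
sigma = sqrt(p_bar * (1 - p_bar) / n)

# Samples 31-60, drawn after the bowl was changed to 10 percent red beads.
shifted = [9, 15, 8, 13, 9, 18, 10, 12, 7, 15,
           10, 14, 8, 11, 10, 6, 7, 14, 17, 7,
           14, 12, 8, 9, 12, 13, 9, 11, 10, 13]

def signals(counts, z, first_sample=31):
    """Return the sample numbers whose proportion falls outside z-sigma limits."""
    lcl = max(0.0, p_bar - z * sigma)
    ucl = p_bar + z * sigma
    return [i for i, d in enumerate(counts, start=first_sample)
            if d / n < lcl or d / n > ucl]

two_sigma = signals(shifted, 2)
three_sigma = signals(shifted, 3)
print("two-sigma signals:  ", two_sigma)     # [32, 36, 40, 42, 48, 49, 51]
print("three-sigma signals:", three_sigma)   # [36, 49]
```

Every sample that does not signal is a Type II error on that chart. The two-sigma chart commits 23 of them in these 30 samples, and the three-sigma chart commits 28.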


Specifying the control limits is one of the critical decisions that must be made in designing a control chart. Clearly, there is a trade-off between committing a Type I and a Type II error. By moving the control limits further from the estimated process mean, we decrease the risk of a Type I error–the risk of a point falling beyond the control limits, indicating an out-of-control condition, when no assignable (or specific “findable”) cause is present. However, widening the control limits also increases the risk of a Type II error–the risk of a point falling between the control limits when the process is really out of control. If we use tighter control limits, the opposite effect occurs: The risk of Type I error increases while the risk of Type II error decreases.

Three-sigma control limits are often chosen in practice because they provide a good balance between committing Type I and Type II errors. However, the relative costs associated with making Type I and Type II errors may be a critical factor in determining an appropriate sigma level.
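The trade-off can be quantified for this exercise. The sketch below assumes a binomial model for the number of defectives in a sample of 100, an in-control proportion equal to the estimate .082, and a shifted proportion of .10. It reports the exact probabilities of a Type I and a Type II error for several z values; the function names are hypothetical.

```python
from math import comb, sqrt

def outside_probability(p_true, n, lcl, ucl):
    """P(sample proportion falls outside [lcl, ucl]) under a binomial model."""
    return sum(comb(n, d) * p_true**d * (1 - p_true)**(n - d)
               for d in range(n + 1)
               if d / n < lcl or d / n > ucl)

def error_rates(p_bar, p_shifted, n, z):
    """Return (alpha, beta) for z-sigma limits centered on p_bar."""
    half_width = z * sqrt(p_bar * (1 - p_bar) / n)
    lcl, ucl = max(0.0, p_bar - half_width), p_bar + half_width
    alpha = outside_probability(p_bar, n, lcl, ucl)          # false alarm
    beta = 1 - outside_probability(p_shifted, n, lcl, ucl)   # missed shift
    return alpha, beta

rates = {z: error_rates(0.082, 0.10, 100, z) for z in (2.0, 2.5, 3.0)}
for z, (alpha, beta) in rates.items():
    print(f"z = {z}: alpha = {alpha:.4f}, beta = {beta:.4f}")
```

As z increases, alpha falls and beta rises, which is the trade-off described above. Because the number of defectives is discrete, the exact alpha values differ somewhat from the normal-theory figures.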


References

Deming, W. Edwards. 1982. Quality, productivity, and competitive position. Cambridge, Mass.: Massachusetts Institute of Technology, Center for Advanced Engineering Study.

Montgomery, Douglas C. 1991. Introduction to statistical quality control. 2d ed. New York: John Wiley & Sons.


Elisabeth J. Umble is a visiting professor in the department of statistics at Texas A&M University. She has published numerous articles in both statistics and manufacturing journals, including “Quality: The Implications of W. Edwards Deming’s Approach,” in the Encyclopedia of Production and Manufacturing Management (2000) and “A Distribution-Free Bayesian Approach for Determining the Joint Probability of Failure of Materials Subject to Multiple Proof Loads,” in Technometrics (1999).

Her research interests include manufacturing and management control systems, quality management reliability, and Bayesian statistics. Umble’s papers have been presented at meetings of the American Statistical Association, the Decision Sciences Institute, and the Project Management Institute. In addition to ASA and DSI, she is also a member of ASQ and the Institute of Industrial Engineers.

Umble earned a Ph.D. in statistics from Baylor University. She may be contacted as follows: Department of Statistics; 437 Blocker Building; Texas A&M University; College Station, TX 77843-3143; 979-845-3141; Fax: 979-845-3143.

M. Michael Umble is a professor in the department of management and former director of the Center for Manufacturing Excellence at Baylor University. He is a Certified Quality Engineer and a Certified Fellow in Production and Inventory Management with the American Production and Inventory Control Society.

Umble has published more than 60 articles and written three books on synchronous manufacturing and synchronous management concepts. He is a Jonah’s Jonah with the Avraham Y. Goldratt Institute, and has extensive consulting and education experience with manufacturing systems. He is also a past president of the Heart of Texas chapter of APICS. He is a member of the Decision Sciences Institute, the American Production and Inventory Control Society, the Project Management Institute, and the National Association of Purchasing Managers.

Umble earned a Ph.D. in quantitative methods from Louisiana State University. He may be contacted as follows: Department of Management; P.O. Box 98006; Baylor University; Waco, TX 76798-8006; 254-710-6239; Fax: 254-710-1093.
