Ensuring R&R

Is your decision process trustworthy?

by Scott Force

Most quality professionals are familiar with the traditional variable measurement systems analysis (MSA) tool—also known as the gage repeatability and reproducibility (R&R) study. When I was a quality engineer in the automotive industry, I spent a lot of time ensuring there were gage R&R studies for the critical dimensions that were checked by production quality inspectors.

MSAs also are a critical part of the measure phase of the define, measure, analyze, improve and control (DMAIC) process (see Online Figure 1). They ensure any detailed statistical analysis that follows in the analyze phase is based on sound, trustworthy data. When considering attribute data, however, I always felt my MSA process was weak. After learning a more formal process during my Six Sigma Black Belt training and using Minitab’s attribute agreement analysis and kappa statistic, I realized how many opportunities there are to better measure and improve attribute measurements.

Online Figure 1

Aside from the basic go/no go gages widely used in manufacturing, attribute measurement systems are widespread in transactional (nonmanufacturing) processes and, in my experience, are rarely assessed for R&R. Examples include deciding:

  • What defect code to assign to a customer complaint.
  • What insurance rejection code to assign to a medical bill.
  • Which standard operating procedure to use.
  • Whether to follow the “yes” or the “no” branch at a decision step in a process flow.

In many cases, these decisions feed a data collection system that management ultimately uses to create a Pareto chart of the defect categories and decide where to assign process improvement efforts. If the decision-making process is inconsistent, these Pareto charts may point a team toward the wrong focus area, derailing its problem-solving process.

The overall attribute agreement assessment math is straightforward, with the exception of the kappa statistic. Minitab’s output provides three basic statistics, as well as the kappa value (see Online Table 1). To understand the consistency of an individual appraiser, there is a “within appraiser” score, which is expressed on a 0% to 100% scale. The higher the value, the more consistently the individual makes the same decision when assessing the same item repeatedly. This is the repeatability metric.
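To make the repeatability arithmetic concrete, here is a minimal Python sketch. It assumes the within-appraiser score is the share of items on which one appraiser gives the same answer in every trial. The data and the function name `within_appraiser` are hypothetical, made up for illustration, and are not taken from Minitab or from any study.

```python
# Hypothetical study data: two appraisers each judge the same 5 items twice.
# ratings[appraiser][item] holds that appraiser's decisions across trials.
ratings = {
    "A": [["go", "go"], ["no-go", "no-go"], ["go", "no-go"],
          ["go", "go"], ["no-go", "no-go"]],
    "B": [["go", "go"], ["no-go", "no-go"], ["go", "go"],
          ["go", "go"], ["no-go", "no-go"]],
}


def within_appraiser(trials_per_item):
    """Percent of items on which every trial by one appraiser matches."""
    matched = sum(1 for trials in trials_per_item if len(set(trials)) == 1)
    return 100.0 * matched / len(trials_per_item)


for name, trials_per_item in ratings.items():
    print(name, within_appraiser(trials_per_item))
```

In this made-up data, appraiser A changes their answer on one of the five items, so A scores lower than B, who never contradicts their own earlier decision.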

Online Table 1

The “between appraiser” score compares all appraisers in the study. The higher the value, the more consistently the appraisers make the same decisions as one another on the same items. This is the reproducibility metric.

The “appraiser vs. standard” score compares all appraisers to the standard, which provides a most telling metric—how often all appraisers make the same correct decisions repeatedly. This is the accuracy metric.
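The reproducibility and accuracy scores can be sketched the same way. The example below assumes an item counts as agreed only when every trial by every appraiser matches and, for the accuracy score, also matches the standard. The data and the function names are hypothetical and purely illustrative.

```python
# Hypothetical study: 2 appraisers, 4 items, 2 trials each, plus a known standard.
standard = ["pass", "fail", "pass", "fail"]
ratings = {
    "A": [["pass", "pass"], ["fail", "fail"], ["pass", "fail"], ["pass", "pass"]],
    "B": [["pass", "pass"], ["fail", "fail"], ["pass", "pass"], ["pass", "pass"]],
}


def between_appraisers(ratings):
    """Percent of items on which all trials of all appraisers agree."""
    n_items = len(next(iter(ratings.values())))
    agreed = 0
    for i in range(n_items):
        decisions = {d for trials in ratings.values() for d in trials[i]}
        if len(decisions) == 1:
            agreed += 1
    return 100.0 * agreed / n_items


def all_vs_standard(ratings, standard):
    """Percent of items on which all trials of all appraisers match the standard."""
    agreed = 0
    for i, truth in enumerate(standard):
        if all(d == truth for trials in ratings.values() for d in trials[i]):
            agreed += 1
    return 100.0 * agreed / len(standard)


print(between_appraisers(ratings))
print(all_vs_standard(ratings, standard))
```

The last item in this made-up data shows why the comparison to the standard is so telling: both appraisers consistently answer “pass,” so the item counts toward reproducibility, but the standard says “fail,” so it does not count toward accuracy.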

Kappa is the final metric and is a little more complex. It compares the observed agreement with the agreement that would be expected by chance alone. Kappa usually is reported as a coefficient with a maximum of 1 (perfect agreement), where 0 means agreement no better than chance; like the other metrics, it also can be read as a percentage. It quantifies how much better than random chance the decision-making process performed in the study.
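As an illustration of the observed vs. expected idea, the sketch below computes Cohen’s kappa for a single appraiser against the standard, using kappa = (observed agreement - expected agreement) / (1 - expected agreement). This is the simplest two-rater form and is an assumption made for illustration; statistical packages may report a multi-rater variant such as Fleiss’ kappa, so the numbers will not necessarily match their output. The data are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater, standard):
    """Cohen's kappa for one set of decisions against a reference set."""
    n = len(rater)
    # Observed agreement: share of items where the two decisions match.
    p_observed = sum(r == s for r, s in zip(rater, standard)) / n
    # Expected agreement: chance of a match if each side assigned categories
    # at random in the same proportions it actually used.
    rater_counts = Counter(rater)
    standard_counts = Counter(standard)
    categories = set(rater) | set(standard)
    p_expected = sum(
        (rater_counts[c] / n) * (standard_counts[c] / n) for c in categories
    )
    # Undefined (division by zero) when both sides use a single identical category.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: 10 items, the appraiser matches the standard on 8 of them.
standard = ["pass"] * 5 + ["fail"] * 5
appraiser = ["pass"] * 4 + ["fail"] + ["fail"] * 4 + ["pass"]

print(round(cohen_kappa(appraiser, standard), 2))
```

Here the raw agreement is 80%, but half of the matches would be expected by chance, so kappa comes out at about 0.6. That gap is the reason kappa is reported alongside the simple percentage scores.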

The three percentage metrics generally are considered acceptable if they are greater than 70%.

Consider how many decisions are made in engineering, accounting and doctors’ offices every day, for example. If they were all assessed, how consistent would they be?

Used as a strategic planning tool, a series of attribute agreement assessments on decision-making processes may indicate a need for future Six Sigma projects or kaizen events, much as value stream mapping highlights waste and steers teams to the largest areas of opportunity.

Scott Force is a continuous improvement leader in Greensboro, NC. He earned a bachelor’s degree in manufacturing engineering from Miami University in Oxford, OH. A senior member of ASQ, Force is an ASQ-certified quality technician, engineer and Six Sigma Black Belt.
