ONE GOOD IDEA
Taoism and Statistics
Reach greater heights of statistical thinking
by Scott A. Rutherford
Taoism is a philosophical, ethical and religious tradition that emphasizes living in harmony with the Tao, a Chinese concept for the way, path or route. When you understand the Tao of building statistical formulas, your ability to develop meaningful performance measures is enhanced.
Statistics has two branches: The first describes objects (descriptive statistics) and the second predicts outcomes (inferential statistics). The second can be described as the study of variation. Minimizing variation improves the accuracy and precision of predictions. There is a simple pattern to the study of variation that, if it were taught, would ease the pain and suffering of new quality practitioners.
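Many statistical formulas relating to the study of variation are based on a simple premise: Compare the difference we observe with the variation that already exists, as a ratio of the first to the second.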
For example, the t statistic has the following equation: (x – μ) / s, where x is our data point of interest, μ is our hypothesized population mean and s is our standard deviation. (When x is the mean of n observations, s is replaced by the standard error, s / √n.) In this equation, we are interested in how plausible it is that our data point came from a population with mean μ. So, we compare the actual value (x) to what we perceive the population mean (μ) to be, and then compare that difference to the variation (s) that exists. The common formulas for hypothesis testing and building confidence intervals are built this way.
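As a rough illustration of the ratio described above, the following Python sketch computes the difference between an observed value and a hypothesized mean in units of the standard deviation, and then the one-sample t statistic for a sample mean. The function names and the measurements are hypothetical, invented only for this example.

```python
import math
import statistics


def standardized_difference(x, mu, s):
    """Observed difference (x - mu) expressed in units of the variation s."""
    return (x - mu) / s


def one_sample_t(sample, mu):
    """t statistic for a sample mean: (x_bar - mu) / (s / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (x_bar - mu) / (s / math.sqrt(n))


# Hypothetical measurements, not taken from the article.
sample = [10.2, 9.8, 10.5, 10.1, 9.9]

print(standardized_difference(10.5, 10.0, 0.25))  # 2.0
print(round(one_sample_t(sample, 10.0), 3))       # roughly 0.816
```

The larger the result in absolute value, the harder it is to attribute the difference to the variation that already exists.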
The study of variation becomes more complicated when we discuss the relationship between two or more things. The primary tool to study this relationship is analysis of variance (ANOVA). Though the formulas are more complicated, they are based on the basic premise of:
variation between groups / variation within groups
In ANOVA, the numerator is best described as explained variation and the denominator as unexplained variation. Hypothesis testing compares the differences between groups with the differences among the observations inside each group, otherwise known as within-group variation. When the groups truly differ, the variation between groups is usually much greater than the variation within them.
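To make the explained-over-unexplained idea concrete, here is a minimal Python sketch of the one-way ANOVA F ratio, computed as the between-group mean square divided by the within-group mean square. The function name `f_ratio` and the three small groups are assumptions made up for illustration.

```python
import statistics


def f_ratio(groups):
    """One-way ANOVA F ratio: between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Explained variation: how far each group mean sits from the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Unexplained variation: scatter of observations around their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )

    ms_between = ss_between / (k - 1)       # degrees of freedom: k - 1
    ms_within = ss_within / (n_total - k)   # degrees of freedom: N - k
    return ms_between / ms_within


# Hypothetical data: the third group sits well above the other two.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(round(f_ratio(groups), 2))  # about 13.0
```

A ratio near 1 suggests the groups differ no more than the scatter inside them would predict; a large ratio suggests a real difference between groups.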
For most new quality practitioners, their first exposure to variation is through the terms common cause and special cause variation. Common cause variation is the expected variation inherent in any process. Special cause variation is usually dramatic and can be attributed to one or more root causes. We can tie these two types of variation to statistics with the following relationship:
special cause variation / common cause variation, where common cause variation = expected variation = existing variation.
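One way to picture this relationship is to measure how far a new observation sits from the process center in units of the common cause variation. The Python sketch below does that under stated assumptions: the threshold of three standard deviations is a conventional choice that does not come from the article, the standard deviation is estimated from a stable baseline sample, and the function name and data are hypothetical. Formal control charts typically estimate the variation differently.

```python
import statistics


def flag_special_causes(baseline, new_points, k=3.0):
    """Return the new points more than k baseline standard deviations from the baseline mean.

    baseline   -- observations from a period assumed to show only common cause variation
    new_points -- later observations to screen
    k          -- assumed threshold (3.0 is conventional, not taken from the article)
    """
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)  # stand-in for common cause variation
    return [x for x in new_points if abs(x - center) / sigma > k]


# Hypothetical measurements, invented for illustration.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
new_points = [10.1, 9.7, 11.5]

print(flag_special_causes(baseline, new_points))  # [11.5]
```

Points that are not flagged are treated as ordinary common cause variation; a flagged point is a candidate for root cause investigation.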
Process capability is based on the same statistical relationship. The basic equation for process capability is:
(upper specification limit – lower specification limit) / 6σ
When stable, a process should perform between the upper and lower specification limits, which represent the amount of variation that the customer expects or allows from the process. Sigma (σ) is the standard deviation of the population of things being produced by the process, so 6σ describes the spread of the process itself. Generally, quality professionals aim to minimize process variation (6σ) so that products and services are within specification limits and meet customer requirements. We can restate process capability as:
variation the customer allows / common cause variation.
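The process capability ratio can be sketched in a few lines of Python, assuming the usual form in which the width of the specification range is divided by 6σ. The function names and the numbers are hypothetical, and estimating σ with the sample standard deviation is a simplification that only makes sense for a stable process.

```python
import statistics


def process_capability(usl, lsl, sigma):
    """Width of the specification range divided by the process spread (6 sigma)."""
    return (usl - lsl) / (6 * sigma)


def process_capability_from_sample(data, usl, lsl):
    """Same ratio, with sigma estimated by the sample standard deviation."""
    return process_capability(usl, lsl, statistics.stdev(data))


# Hypothetical specification limits and process standard deviation.
print(round(process_capability(usl=12.0, lsl=8.0, sigma=0.5), 2))  # about 1.33
```

A value above 1 means the variation the customer allows is wider than the natural spread of the process; a value below 1 means the process varies more than the customer allows.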
As quality practitioners become more experienced, they are often asked to develop performance measures for their processes. Developing performance measures is a difficult task. It becomes easier after you understand how the smart people before you developed the formulas we use today. Understanding the Tao advances you on the road to thinking statistically.
Scott A. Rutherford is the director of the U.S. Navy Mid-Atlantic Regional Calibration Center in Norfolk, VA. Rutherford holds a master’s degree in operations research from the University of Delaware in Newark. A senior member of ASQ, Rutherford is past chair of the ASQ Hampton Roads Section in Virginia and has been an ASQ-certified Six Sigma Black Belt. He also blogs as part of the ASQ Influential Voices program.