3.4 PER MILLION
Knowing the Drill
Examining data more closely to find answers
by Joseph D. Conklin
In the quest to solve quality problems, drilling down into data is a time-honored strategy that serves Six Sigma practitioners well. A common form of this strategy is to divide data into smaller and smaller groups, repeating the same analysis at each division until the results suggest possible causes worth investigating.
Before illustrating this form of drilling down and getting to an example that involves sales data, let’s cover some preliminary topics:
Analysis of variance (ANOVA). The highest level of data in the example is territory. We show results from an ANOVA that strongly suggests the average sales differ by territory.1, 2
Multiple comparison procedure (MCP). If an ANOVA concludes that groups of data are different, we naturally want to know where the differences lie. When more than two groups of data are involved, MCPs are useful.3 There are many such procedures. The general idea is to compare the averages for each group of data with one another. The MCP tests which groups can be considered different at a predetermined level of risk.
Some procedures test all possible pairs of averages. Other MCPs are designed to test only a subset of the possible pairs. Restricting the test to a subset requires special prior knowledge.
For simplicity and generality, we select a procedure from the all-possible-pairs category: the Tukey honestly significant difference (HSD) procedure. It is a good, general-purpose procedure.
Comments about risk in statistical analysis. The Tukey HSD procedure belongs to the universe of statistical tests. Conclusions based on statistical tests carry some risk of being wrong. Before performing the test, the analyst acknowledges this by setting a risk level he or she can live with. For the sales data, we use a 5% risk level.
This is a popular level in explaining statistical tests, and it’s useful in many contexts. Depending on the application, the risk level may be set by law or regulation. If the present test is part of a continuing series performed under similar circumstances, a risk level that worked well in the past may serve as a precedent.
Each risk level implies a certain sample size. If law, regulation or precedent provides no guidance, the analyst can consider a range of risk levels, assess the corresponding sample sizes, and select the level that leads to the largest sample size he or she can afford. Statistical risk is not the only one to consider in planning studies. The analyst should avoid choosing a sample size beyond his or her ability to manage it.
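To make the link between risk level and sample size concrete, here is a minimal sketch based on the textbook normal-approximation formula for comparing two group averages. The formula choice, the 80% power target, and the values of sigma (within-group standard deviation) and delta (smallest difference worth detecting) are assumptions for illustration only; none of them come from the sales example.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(alpha, power, sigma, delta):
    """Approximate observations needed per group for a two-sided
    comparison of two averages (normal approximation)."""
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the risk level
    z_beta = z.inv_cdf(power)           # value tied to the chance of detection
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Assumed planning values, chosen only to show the trend.
for alpha in (0.10, 0.05, 0.01):
    n = sample_size_per_group(alpha, power=0.80, sigma=30.0, delta=20.0)
    print(f"risk level {alpha:.2f}: about {n} observations per group")
```

Under these assumptions, a stricter risk level calls for more observations per group, which is the trade-off the analyst weighs against what he or she can afford and manage.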
Assumptions of the Tukey HSD procedure. All statistical tests come with assumptions as well as risks. For the Tukey HSD procedure, there are three important assumptions: independence, similar variation within groups and normality. In practice, these assumptions should be tested. Space limitations prevent a complete explanation in this column, but here are some brief comments.
When the data are organized by level of geography, they can be considered independent as long as each observation is a member of one and only one level. That is the case here. With respect to the normality assumption, the Tukey HSD procedure is somewhat robust to departures from this assumption.
A rough-and-ready way of picturing the similar variation assumption is to think of each group of data and its corresponding histogram. If the histograms can be shifted so that they all almost perfectly overlap, the assumption is met. If the assumption is not met, the Tukey HSD procedure can be adjusted.4
Sales data example
Sales data appear in Online Table 1. The units are thousands of dollars per customer. The data are organized at three levels of geography:
- East and West territories.
- Echo, Lima, Juliett and Sierra counties.
- Towns of Ash, Beech, Hickory, Maple, Oak, Pine, Sycamore and Walnut.
There are 12 observations per town, 24 per county and 48 per territory. The average sales are 39.000 for the East territory and 57.188 for the West. The ANOVA in Table 1 supports the conclusion that the higher average in the West is real and not merely apparent. Let’s drill down to the county level for more detailed insight.
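For readers who want to see the arithmetic behind a one-way ANOVA, the following is a minimal sketch of the F statistic computed from raw groups. The function name and the toy numbers are assumptions made for illustration; the toy numbers are not the sales data from Online Table 1, and the result is not the ANOVA reported in Table 1.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    group_means = [mean(g) for g in groups]
    grand_mean = mean(x for g in groups for x in g)
    n_total = sum(len(g) for g in groups)

    # Variation of the group averages around the overall average.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group average.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up values purely to exercise the function; NOT the article's data.
toy_group_a = [30.0, 40.0, 50.0]
toy_group_b = [50.0, 60.0, 70.0]
f_stat, df_b, df_w = one_way_anova([toy_group_a, toy_group_b])
print(f"F = {f_stat:.2f} on {df_b} and {df_w} degrees of freedom")
```

A large F relative to the tabled value for the chosen risk level is what supports the conclusion that the group averages differ. In practice, a statistical computing program would also report the corresponding p-value.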
Tukey’s HSD procedure for the four counties
The steps for Tukey’s HSD procedure are given in Table 2 (see note 5). To apply it to the county level, we first need the average sales for the four counties. These can be found in Table 3. Another necessary quantity, the standard error of the mean for sales, also appears in Table 3. For an idea of the calculations behind the standard error of the mean, see the example in Table 4. The standard error of the mean is generally available as an output of statistical computing programs.
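Table 4 is not reproduced here, but with equal group sizes the standard error used in Tukey's procedure is commonly computed as the square root of the within-group mean square from the ANOVA divided by the number of observations per group. The sketch below assumes that formula; the function name and the mean square value in the usage line are placeholders, not figures from the article's tables.

```python
from math import sqrt


def standard_error_of_mean(ms_within, n_per_group):
    """Standard error of a group average, assuming equal group sizes.

    ms_within   -- within-group (error) mean square from the ANOVA
    n_per_group -- observations in each group (24 per county in the example)
    """
    return sqrt(ms_within / n_per_group)


# Placeholder mean square, chosen only to show the call; NOT from Table 3 or 4.
print(f"{standard_error_of_mean(ms_within=900.0, n_per_group=24):.3f}")
```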
With four counties and their average sales, there are six pairs to compare:
- Echo and Lima.
- Echo and Juliett.
- Lima and Juliett.
- Sierra and Echo.
- Sierra and Lima.
- Sierra and Juliett.
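The list of pairs can also be generated rather than written out by hand, which becomes useful as the number of groups grows. A short sketch using Python's standard library (the variable names are our own):

```python
from itertools import combinations

counties = ["Echo", "Lima", "Juliett", "Sierra"]

# Every unordered pair of counties, each pair listed once.
pairs = list(combinations(counties, 2))

for first, second in pairs:
    print(f"{first} and {second}")
print(f"{len(pairs)} pairs in total")
```

The pairs come out in a different order than in the list above, but the six comparisons are the same.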
When we take the standard error of the mean from Table 3, 6.093, and multiply it by the appropriate factor from the Tukey HSD table in Table 5, 3.71, we find the value for the HSD to be 22.605.
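The HSD arithmetic is a single multiplication. The sketch below reproduces it with the two figures quoted in the text; the function name is our own.

```python
def honestly_significant_difference(standard_error, table_factor):
    """HSD = standard error of the mean times the factor from the HSD table."""
    return standard_error * table_factor


# 6.093 is the standard error from Table 3; 3.71 is the factor from Table 5.
county_hsd = honestly_significant_difference(6.093, 3.71)
print(f"County HSD: {county_hsd:.3f}")
```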
When we take the six pairs of counties and compute the difference in average sales within each pair, only one pair exceeds the HSD: Echo and Lima. The difference in their average sales, 35.625, can be considered real. This suggests we should focus our attention on these two counties. What if the county level is not detailed enough to suggest useful insight? We can drill down further to the eight towns.
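The comparison step can be sketched as a small function that flags every pair whose difference in averages exceeds the HSD. The function name is our own, and the group labels and averages in the first usage example are placeholders, not the county averages from Table 3.

```python
from itertools import combinations


def pairs_exceeding_hsd(averages, hsd):
    """Return (group_a, group_b, difference) for each pair whose absolute
    difference in averages is larger than the HSD."""
    flagged = []
    for (name_a, avg_a), (name_b, avg_b) in combinations(averages.items(), 2):
        difference = abs(avg_a - avg_b)
        if difference > hsd:
            flagged.append((name_a, name_b, difference))
    return flagged


# Placeholder averages for three unnamed groups; NOT the values in Table 3.
placeholder_averages = {"A": 10.0, "B": 50.0, "C": 25.0}
flagged = pairs_exceeding_hsd(placeholder_averages, hsd=22.605)
for name_a, name_b, difference in flagged:
    print(f"{name_a} and {name_b} differ by {difference:.3f}")

# The one comparison quoted in the text: Echo and Lima differ by 35.625.
print(35.625 > 22.605)
```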
Tukey’s HSD procedure for the eight towns
When we repeat Tukey’s HSD procedure for the eight towns, things grow more complicated. Instead of six pairs of counties, we have 28 pairs of towns. The average sales by town appear in Online Table 2. The standard error of the mean is 8.586, and the appropriate factor from the HSD table is 4.40. The HSD for the towns is 37.778.
Of the 28 pairs of towns, only two test as really different at the 5% level of risk: Sycamore and Maple, and Walnut and Maple. Maple is the town to focus on in Lima County. For Echo County, both of its towns are worth our attention. If sales can be broken out below the town level, we might drill down a third time to see what that reveals.
Some of the possible factors that might be in play behind the difference in sales are variations in income, customer proximity to stores, the timing and implementation of promotions, and the mix of inventory available for sale.
The results of Tukey’s HSD procedure suggest some practical limits to the drilling-down strategy. The more levels we drill down, the larger the number of groups there is to manage.
At some point, the number of observations in each group is so small that we lose the ability to detect true differences between them. In general, when dealing with more than a handful of groups, we need a computer to carry out the strategy so we can obtain answers in a reasonable period of time.
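A short sketch shows how quickly these limits arrive. It assumes the 96 observations in the sales example are split evenly among the groups at each level. The fourth level, with 16 groups, is hypothetical and is included only to extend the trend.

```python
from math import comb

TOTAL_OBSERVATIONS = 96  # 8 towns x 12 observations each


def drill_down_summary(groups_per_level):
    """For each level, return (level, groups, observations per group, pairs),
    assuming the observations are split evenly among the groups."""
    return [
        (level, k, TOTAL_OBSERVATIONS // k, comb(k, 2))
        for level, k in groups_per_level.items()
    ]


levels = {
    "territory": 2,
    "county": 4,
    "town": 8,
    "next level (hypothetical)": 16,
}
for level, k, n_per_group, n_pairs in drill_down_summary(levels):
    print(f"{level}: {k} groups, {n_per_group} observations each, {n_pairs} pairs")
```

Each doubling of the number of groups halves the observations available per group while roughly quadrupling the number of pairs to compare.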
When we reach the limits of what drilling down can show, a useful next step is to collect additional data about the groups that test as really different and to apply other analytical strategies for insight.
References and Note
1. For more information about ANOVA, see Joseph D. Conklin’s "DOE and Six Sigma," Quality Progress, March 2004, pp. 66-69, and reference 2.
2. George Box, William Hunter and J. Stuart Hunter, Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building, John Wiley & Sons, 1978.
3. A good reference for multiple comparison procedures is Larry E. Toothaker’s Multiple Comparison Procedures (Series: Quantitative Applications in the Social Sciences), Sage Publications Inc., 1993.
4. Ibid. For more information about the Tukey HSD assumptions and how to adjust the procedure, see chapter four of Toothaker’s Multiple Comparison Procedures (Series: Quantitative Applications in the Social Sciences).
5. In drilling down to the counties and towns, I do not mention performing an ANOVA, the third step of Tukey’s HSD procedure listed in Table 2. This omission was to save space in the body of the article.
After the ANOVA for the territories strongly suggested real differences in average sales, an ANOVA at the next few lower levels of geography would be expected to show a similar result.
For completeness, the ANOVAs for counties and towns are shown in Tables 6 and 7, respectively. They confirm the results of the ANOVA for territories.
Joseph D. Conklin is a mathematical statistician in Washington, D.C. He earned a master’s degree in statistics from Virginia Tech in Blacksburg and is a senior member of ASQ. Conklin is also an ASQ-certified quality manager, quality engineer, quality auditor, reliability engineer and Six Sigma Black Belt.