ONE GOOD IDEA

## Ishikawa diagrams help select the correct statistical test

by Matthew Barsalou

Selecting the correct statistical test is a critical element of statistical analysis: plugging numbers into the wrong test will deliver the wrong answer. The consequence of an incorrect conclusion could be as simple as mistaking process A for the better of two processes, leading to a financial loss. Or it could be as critical as concluding that a medicine is better than a placebo, resulting in loss of life because a medicine that doesn’t work was used in place of one that does.

Assistance in test selection is available in the form of decision trees and reference books with descriptions of various tests. The Certified Six Sigma Black Belt Handbook¹ uses flowcharts as a decision tree for statistical test selection. Other books, such as Engineering Statistics,² provide tables for statistical test selection. Quality Engineering Statistics³ simply describes the statistical tests and their uses within the text.

Another way to determine which test to use is the Ishikawa diagram, also known as a cause-and-effect diagram or fishbone diagram (see Figure 1). This quality tool displays the relationship between an effect and its possible causes. The same concept can also be used to display the statistical tests that can achieve the desired effect, in this case a correct hypothesis test.

The Ishikawa diagram in Figure 1 divides statistical tests into three categories, or branches, based on the type of data that will be analyzed: attribute data, parametric continuous data and nonparametric continuous data. Attribute data, also called discrete data, have "categories that can take on only certain values."⁴ Such data include the number of parts found to be defective or the number of employees absent on a specific day. There can be eight defective parts, but there can’t be 8.5 defective parts.

Continuous data, also called variable data, are data that "can take on any one of an infinite number of values within a given range."⁵ Examples include the length of a part or the weight of a bag of material, expressed in measurements such as 124 kilograms, 14 degrees or 37 millimeters. Continuous data can be parametric or nonparametric.

Parametric data depend on assumptions regarding the parameter estimation, such as the shape of the distribution. A parametric statistical test can lead to false conclusions if its assumptions are violated. Nonparametric tests are distribution free⁶ and do not assume that the data have a normal distribution.

Each of the three branches on the diagram has lower-level branches based on the type of test needed. This is determined by the number of samples and the property being evaluated, such as variance, mean, median or proportions. Below the lower-level branches are more branches listing the actual tests that can be used for the given conditions.
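The branch structure described above can be sketched as a nested lookup. This is only an illustration: the names `BRANCHES` and `select_tests` are hypothetical, and the tests listed are common textbook choices assumed for the example, not a transcription of Figure 1.

```python
from typing import Dict, List

# Top level: data type. Second level: property being evaluated.
# Third level: number of samples. Leaves: candidate tests.
# ASSUMPTION: these entries are common textbook choices used for
# illustration only; they are not copied from the article's Figure 1.
BRANCHES: Dict[str, Dict[str, Dict[str, List[str]]]] = {
    "attribute": {
        "proportion": {
            "one": ["one-proportion z-test"],
            "two": ["two-proportion z-test"],
        },
    },
    "parametric": {
        "mean": {
            "one": ["one-sample t-test"],
            "two": ["two-sample t-test"],
        },
        "variance": {
            "two": ["F-test"],
        },
    },
    "nonparametric": {
        "median": {
            "one": ["Wilcoxon signed-rank test"],
            "two": ["Mann-Whitney U test"],
        },
    },
}


def select_tests(data_type: str, prop: str, samples: str) -> List[str]:
    """Walk the branches in order: data type, property, number of samples."""
    try:
        return BRANCHES[data_type.lower()][prop.lower()][samples.lower()]
    except KeyError:
        # No such branch in this illustrative table.
        raise KeyError(
            f"no branch for {data_type!r} / {prop!r} / {samples!r}"
        ) from None


if __name__ == "__main__":
    print(select_tests("parametric", "mean", "two"))
```

Each lookup follows one path from a main branch down to the tests, which mirrors how a reader would trace the diagram by hand. A combination that has no branch raises an error rather than suggesting a test.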

An advantage of the Ishikawa diagram approach to selecting statistical tests is the speed and ease with which the correct test can be selected. The tester can quickly zero in on the correct test once the data type, distribution, property being evaluated and number of samples are known. The entire concept can be displayed on one page, and the tester no longer needs to go fishing through reference books to identify the correct procedure.

### References

1. Tom M. Kubiak and Donald W. Benbow, The Certified Six Sigma Black Belt Handbook, second edition, ASQ Quality Press, 2009.
2. Douglas C. Montgomery, George C. Runger and Norma F. Hubele, Engineering Statistics, second edition, John Wiley and Sons, 2001.
3. Robert A. Dovich, Quality Engineering Statistics, ASQ Quality Press, 1992.
4. Allan G. Johnson, Statistics, Harcourt Brace Jovanovich, Publishers, 1988, p. 22.
5. Ibid.
6. John F. Early and Brian A. Stockhoff, "Accurate and Reliable Measurement Systems and Advanced Tools," in Joseph M. Juran and Joseph A. De Feo, Juran’s Quality Handbook, sixth edition, McGraw-Hill, 2010.

Matthew Barsalou is a statistical problem resolution master black belt in the global engineering excellence department at BorgWarner Turbo Systems Engineering GmbH in Kirchheimbolanden, Germany. He has a master’s degree in business administration and engineering from Wilhelm Büchner Hochschule in Darmstadt, Germany, and a master’s degree in liberal studies from Fort Hays State University. He is an ASQ-certified quality technician, Six Sigma Black Belt and quality engineer. He is also an ASQ senior member, a technical reviewer for Quality Progress, editor of the Statistics Division’s Statistics Digest and the ASQ country counselor for Germany.

A trivial point: While I appreciate the organization of the statistical tests by data and statistic type, this really is a tree diagram and not a fishbone diagram. Just because it is shaped like a fishbone doesn't make it an Ishikawa diagram. In fact, if it were put into a standard tree diagram it might be easier to read.
A couple of other points: A paired t-test is NOT used for a two-sample test of means with unequal variances. A paired t-test is used when a matched-pair study design is used. There is a formula for a t-test when unequal variances exist between the groups...
For homogeneity of variance, both Bartlett's and Levene's tests work for continuous data. Bartlett's test is very sensitive to non-normality of the data and Levene's is insensitive. For example, if the data fail Bartlett's test for homogeneity, it may simply be that the data are non-normal even though the variances are homogeneous. The two tests are not necessarily segregated into use for parametric or nonparametric analysis: they can indicate which analysis type should be used based on their results...
--Bev Daniels, 03-14-2015
