# Computer Search for the Ideal Histogram


Jamieson, Archibald   (1989, ASQC)   Jamieson Management Services, St. Catharines, Ontario, Canada

Annual Quality Congress, Toronto, Ontario, Canada    Vol. 43    No. 0
QICID: 3580    May 1989    pp. 277-283


## Article Abstract

As a teacher of quality control I have, in the past, given students an assignment that required drawing a frequency histogram of some data. However, the responses varied so widely that I had to lay down some very subjective rules in order to achieve consistency. The reasons for the large number of variations were:

1. Number of classes, or cells, chosen.
2. Size of class interval. This is not always directly related to the number of classes.
3. The placing of the lowest value:
   1. on the lowest class boundary, or,
   2. in the center of the lowest class.
4. Counting the values in each class:
   1. equal to or greater than the low boundary, or,
   2. equal to or less than the high boundary.

Changing any one of these six factors could change the shape of the histogram, so there is an extremely large number of possible combinations, with no way of knowing which one is "correct". Nevertheless, most textbooks give little or no attention to the subject. Some suggest a range within which to select the number of classes, and a few refer to Sturges' formula. None were found that pointed out the significance of the placement of the lowest value or of the counting within each cell.
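
To make the effect of these choices concrete, here is a minimal Python sketch. The data set, the function names `sturges_classes` and `count_classes`, and the `closed` parameter are all hypothetical and are not taken from the article. Sturges' formula is taken in its usual form, k = 1 + log2(n), rounded up here as one common convention.

```python
import math

def sturges_classes(n):
    """Sturges' formula for the number of classes: k = 1 + log2(n), rounded up."""
    return math.ceil(1 + math.log2(n))

def count_classes(values, lowest_boundary, width, k, closed="left"):
    """Count values in k equal-width classes starting at lowest_boundary.

    closed="left":  a value on a boundary is counted in the class above it
                    (equal to or greater than the low boundary).
    closed="right": a value on a boundary is counted in the class below it
                    (equal to or less than the high boundary).
    """
    counts = [0] * k
    for v in values:
        pos = (v - lowest_boundary) / width
        idx = math.floor(pos) if closed == "left" else math.ceil(pos) - 1
        if 0 <= idx < k:          # values outside the k classes are ignored
            counts[idx] += 1
    return counts

# Hypothetical data, chosen so that several values fall exactly on boundaries.
data = [10, 12, 12, 14, 14, 14, 16, 16, 18, 20]

print(sturges_classes(len(data)))                   # 5
# Lowest value (10) placed in the center of the lowest class: boundary at 8.
print(count_classes(data, 8, 4, 4, closed="left"))   # [1, 5, 3, 1]
print(count_classes(data, 8, 4, 4, closed="right"))  # [3, 5, 2, 0]
```

With the same data, class width, and lowest boundary, the two counting conventions alone give differently shaped histograms.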

Drawing histograms by hand is somewhat time-consuming, so the subject was examined with the aid of a computer. Each of the above variables was examined, and then a new approach was adopted. Under the older methodology, the median value would quite often fall in a class other than the median class. This did not appear to be logical, so the median was instead used as the starting point, working backwards from it to establish the lowest class boundary. This eliminated the subjective selection of both the place for the lowest value and the method used for counting.
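
The abstract does not spell out how the boundaries are derived from the median, so the following is only one plausible reading, sketched in Python. It assumes that the median sits at the center of a class and that boundaries are stepped downwards by one class width at a time until the smallest value is covered. The function name `median_anchored_lowest_boundary` and the data are hypothetical.

```python
import math
import statistics

def median_anchored_lowest_boundary(values, width):
    """Work backwards from the median to find the lowest class boundary.

    Assumption (not from the article): the median is the mid-point of its
    class, so that class starts at median - width / 2.
    """
    median = statistics.median(values)
    start = median - width / 2
    # Number of whole class widths needed to reach or pass the minimum value.
    steps = max(math.ceil((start - min(values)) / width), 0)
    return start - steps * width

data = [10, 12, 12, 14, 14, 14, 16, 16, 18, 20]   # hypothetical data, median 14
print(median_anchored_lowest_boundary(data, 4))   # 8.0
print(median_anchored_lowest_boundary(data, 3))   # 9.5
```

Under this reading the lowest boundary follows from the data and the class width alone, rather than from a choice about where to place the lowest value.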

Knowing the standard deviation of the data, the next step was to compare the frequency at each class mid-point with the theoretical Normal frequency. The total of these differences over all classes was the deviation from Normal of the histogram as a whole. Repeating the process for a number of different class counts then made it possible to select the "best possible" histogram, the one with the least deviation from Normal. This has been called the optimum histogram. Such a method of selection, while quite practical with the aid of a computer, would have been unthinkable in the days of manual calculations.
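
A minimal Python sketch of this kind of search follows. Several details are assumptions rather than the article's stated method: the theoretical frequency of a class is taken as n × width × Normal density at the class mid-point, the differences are summed as absolute values, and for brevity the classes simply span the minimum to the maximum value instead of being anchored on the median. The names `normal_deviation` and `optimum_classes` and the data are hypothetical.

```python
import statistics

def normal_deviation(values, k):
    """Total absolute difference between observed and Normal class frequencies."""
    n = len(values)
    normal = statistics.NormalDist(statistics.fmean(values),
                                   statistics.stdev(values))
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    counts = [0] * k
    for v in values:
        # Left-closed classes; the maximum value goes into the last class.
        idx = min(int((v - lo) / width), k - 1)
        counts[idx] += 1
    total = 0.0
    for i, observed in enumerate(counts):
        midpoint = lo + (i + 0.5) * width
        expected = n * width * normal.pdf(midpoint)   # assumed definition
        total += abs(observed - expected)
    return total

def optimum_classes(values, candidates):
    """Return the candidate number of classes with the least deviation."""
    return min(candidates, key=lambda k: normal_deviation(values, k))

# Hypothetical measurements.
data = [9.1, 9.8, 10.0, 10.2, 10.3, 10.5, 10.6, 10.9, 11.0, 11.2,
        11.3, 11.5, 11.8, 12.0, 12.4, 12.9]
for k in range(3, 9):
    print(k, round(normal_deviation(data, k), 3))
print("optimum number of classes:", optimum_classes(data, range(3, 9)))
```

A squared-difference or chi-square total could be substituted for the absolute difference without changing the structure of the search.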

The assumption of Normality is a basic prerequisite for the interpretation of mean and range charts, and the Optimum-Histogram method simply extends this assumption to the analysis of the frequency distribution of values. It provides a quick means by which samples may be compared on the basis of their proximity to Normality when displayed as a grouped-frequency histogram. It reduces the risk of a sample from a Normal population being regarded as non-Normal.

## Keywords

Automobile industry
