What a Bubble Plot Can Do for You

by Tony Gojanovic

As an industrial statistician, I believe the majority of insights into manufacturing problems can be gleaned from simple graphical displays. One exploratory graphical tool with high potential for revealing patterns in data is the bubble plot.

The bubble plot is similar to the scatter plot, in which data are plotted on two-dimensional x and y axes. The difference is that a third data factor (z) controls the size of the scatter points: bubble plots use circles of varying sizes, with the radius (r) of each circle proportional to the data value (z). For example, if I use a scatter plot to illustrate the relationship between gas mileage (y) and weight (x) for various automobiles, the size of the scatter points might be determined by the cost (z) of the automobiles.
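The radius rule can be sketched in a few lines of Python. This is only an illustration: the function names, the `max_radius` parameter and the automobile values are made up for the example, and the drawing step assumes matplotlib is installed and treats its `scatter` size argument (given in points squared) as the marker diameter squared.

```python
def bubble_radii(z_values, max_radius=20.0):
    """Scale radii so the largest z gets max_radius (radius proportional to z)."""
    z_max = max(z_values)
    if z_max <= 0:
        raise ValueError("need at least one positive z value")
    return [max_radius * z / z_max for z in z_values]


def draw_bubble_plot(x, y, z, max_radius=20.0):
    """Draw the bubbles; matplotlib is imported here so bubble_radii works without it."""
    import matplotlib.pyplot as plt

    radii = bubble_radii(z, max_radius)
    # Assumption: scatter's `s` (points squared) is taken as diameter squared.
    sizes = [(2 * r) ** 2 for r in radii]
    fig, ax = plt.subplots()
    ax.scatter(x, y, s=sizes, alpha=0.5, edgecolors="black")
    ax.set_xlabel("weight (x)")
    ax.set_ylabel("gas mileage (y)")
    return fig


# Made-up illustrative values, not data from the article.
weight = [2200, 2900, 3500, 4100]     # x
mileage = [38, 30, 24, 18]            # y
cost = [15000, 22000, 30000, 45000]   # z

radii = bubble_radii(cost)
```

Calling `draw_bubble_plot(weight, mileage, cost)` would then return a figure in which the most expensive automobile has the largest circle.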

In the beverage container lid manufacturing industry, a key element is metal exposure. When a lid or pop top is formed by running a coated aluminum sheet through a stamping process, bare metal may be exposed through a fracturing of the protective coating on the lid. Metal exposure is destructively tested by placing the finished lid on a special holder with a saline solution touching the lid surface that will be exposed to the beverage and inducing an electrical current through the solution. A probe on the other side of the lid detects any current.

If there is no metal exposure, the lid will not conduct electricity, and the reading will be near zero. A specification limit exists for the conductivity reading, that is, for metal exposure; exceeding it may produce an off-tasting beverage.

Factors that affect metal exposure are the integrity of the coated metal and the quality of the forming tools used in the stamping process.

To analyze metal exposure data, a natural log transformation is applied to make the distribution more symmetric, because the lower bound of zero induces skewness. Outliers are not uncommon, so the median, $\tilde{X}$, and the median absolute deviation (MAD) are used instead of the mean and standard deviation as robust measures of location and spread, respectively. The median is the middle point for a set of ordered data, and the MAD is the median of the absolute deviations of the data from their median:

$$\mathrm{MAD} = \operatorname{median}_i \left| x_i - \tilde{X} \right|$$
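A minimal sketch of these two summaries, assuming the usual definition of the MAD as the median of the absolute deviations from the median, with no scaling constant. The function name and the readings are hypothetical. The sketch also assumes strictly positive readings, since the natural log of zero is undefined; how zero readings should be handled is not covered here.

```python
import math
import statistics


def log_median_and_mad(readings):
    """Return (median, MAD) of the natural-log-transformed readings."""
    if any(r <= 0 for r in readings):
        raise ValueError("readings must be positive to take natural logs")
    logs = [math.log(r) for r in readings]
    med = statistics.median(logs)
    # MAD: median of the absolute deviations from the median.
    mad = statistics.median([abs(v - med) for v in logs])
    return med, mad


# Made-up readings with one outlier (not data from the article).
readings = [0.5, 0.8, 1.1, 1.4, 25.0]
med, mad = log_median_and_mad(readings)
```

The single large reading barely moves either summary, which is the reason for preferring them to the mean and standard deviation when outliers are common.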

Figure 1 shows an exploratory analysis using the bubble plot for median metal exposure based on natural log transformed data. The graph was created with S-Plus statistical software but could also have been created with the SAS/GRAPH module. The size of each bubble corresponds to the median of the process checks for a lid-forming machine in a particular week.

A run chart might have been constructed for each machine by week but would look too busy on one sheet of paper. The bubble plot provides a simpler interpretation, in which the magnitude of each data value is incorporated into the bubble itself. Shading that correlates with the size of the bubbles further enhances the user’s ability to find patterns. For example, the darker and larger the circle, the higher the median.
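The data preparation behind such a plot might look like the sketch below. It groups process checks by machine, lane and week, takes the median of the log readings, and maps each median to a bubble radius and a gray level. Everything named here is an assumption made for illustration: the record layout, the function names, and the min-max rescaling (used because log medians can be negative, so a radius cannot be made directly proportional to them). The gray scale follows the common convention of 1.0 for white and 0.0 for black.

```python
import math
import statistics
from collections import defaultdict


def weekly_log_medians(checks):
    """checks: iterable of (machine, lane, week, reading) with reading > 0.

    Returns {(machine, lane, week): median of ln(reading)}.
    """
    groups = defaultdict(list)
    for machine, lane, week, reading in checks:
        groups[(machine, lane, week)].append(math.log(reading))
    return {key: statistics.median(vals) for key, vals in groups.items()}


def bubble_style(medians, max_radius=12.0):
    """Map each median to (radius, gray): larger medians are larger and darker."""
    lo, hi = min(medians.values()), max(medians.values())
    span = (hi - lo) or 1.0  # avoid dividing by zero when all medians are equal
    style = {}
    for key, m in medians.items():
        frac = (m - lo) / span                     # 0 = smallest, 1 = largest
        radius = max_radius * (0.2 + 0.8 * frac)   # keep the smallest bubble visible
        gray = 1.0 - frac                          # 1.0 = white, 0.0 = black
        style[key] = (radius, gray)
    return style


# Made-up process checks (not data from the article).
checks = [
    (1, "A", 14, 1.0), (1, "A", 14, 2.0), (1, "A", 14, 4.0),
    (1, "A", 15, 8.0), (1, "A", 15, 16.0), (1, "A", 15, 32.0),
]
medians = weekly_log_medians(checks)
style = bubble_style(medians)
```

Each `(radius, gray)` pair can then be drawn at the position given by week on one axis and machine lane on the other.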

Each machine has three outputs or lanes (A, B and C). The red box delineates the two machines and their subsequent lanes to indicate that a correlation structure exists between outputs. If something happens to the machine that makes the forming tools press harder into the metal, it may affect all three lanes, as in the case with machine one in week 15.

On the other hand, lane B of machine two indicates a tooling change—lower readings following higher readings caused by replacement tools with a more appropriate surface finish. If all machines and all subsequent lanes simultaneously exhibit an increase, however, there’s likely a problem with the incoming material.
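The reading rules above can be written down as a rough screening heuristic. The sketch below is not a method from the article: the function name, the data layout and the `threshold` on the week-to-week change in median log reading are all hypothetical, and any flag it raises is only a prompt to look more closely.

```python
def classify_shift(prev, curr, threshold=0.5):
    """prev, curr: {(machine, lane): median ln(reading)} for consecutive weeks.

    Returns a list of notes describing which increases were seen.
    """
    # A lane "rose" if its median increased by more than the (hypothetical) threshold.
    rose = {key: (curr[key] - prev[key]) > threshold
            for key in curr if key in prev}
    if not rose:
        return []
    if all(rose.values()):
        return ["all machines and lanes rose: check incoming material"]
    notes = []
    for machine in sorted({m for m, _ in rose}):
        lanes = {lane: up for (m, lane), up in rose.items() if m == machine}
        if all(lanes.values()):
            notes.append(f"machine {machine}: all lanes rose, check the machine")
        else:
            for lane in sorted(lanes):
                if lanes[lane]:
                    notes.append(
                        f"machine {machine} lane {lane}: rose alone, check tooling"
                    )
    return notes


# Made-up medians for two machines with lanes A, B and C.
week_14 = {(m, lane): 0.0 for m in (1, 2) for lane in "ABC"}
week_15 = dict(week_14)
for lane in "ABC":
    week_15[(1, lane)] = 1.0  # every lane of machine one rises

notes = classify_shift(week_14, week_15)
```

With a single machine in the data, a rise in all of its lanes is reported as plantwide, so the machine and material explanations cannot be told apart without more machines.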

Bubble plots can be used to identify machine and tooling problems and to manage suppliers by showing systematic plantwide trends caused by incoming materials in one condensed graphical format. However, the bubble plot interpretation is simply exploratory. Once patterns of interest are noted, a confirming approach, such as a control chart, can then be pursued.



TONY GOJANOVIC is the performance metrics and statistics manager at Coors Brewing Co. in Golden, CO. He earned a master’s degree in statistics from the University of Colorado-Denver. Gojanovic is a member of the American Statistical Association and ASQ.
