BACK TO BASICS
Is There a Relationship Here?
A scatter diagram can help you determine if there's a correlation between two variables
by ASQ's Quality Management Division
A scatter diagram is a chart in which one variable is plotted against another to determine whether there is a correlation between the two. These diagrams are used to plot the distribution of information in two dimensions and are useful in rapidly screening for a relationship between two variables.
A scatter diagram shows the pattern of relationship between two variables that are thought to be related. For example, is there a relationship between outside temperature and cases of the common cold? As temperatures drop, do colds increase? The more closely the points hug a diagonal line, the stronger the linear relationship between the two variables.
The purpose of a scatter diagram is to display what happens to one variable when another variable changes. The slope of the pattern formed by the points indicates the type of relationship that exists.
Figure 1 shows a plot of two variables: predicted values vs. observed values. As the predicted value increases, so does the actual measured value. These variables are said to be positively correlated; that is, if one increases, so does the other. The line plotted is a regression line, which shows the average linear relationship between the variables.
If the line in a scatter diagram has a negative slope, the variables are negatively correlated; that is, when one increases, the other decreases, and vice versa. When the points show no linear trend and the scatter diagram appears to be simply a diffuse ball of points, the variables are said to be uncorrelated.
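To make the idea of a regression line concrete, here is a minimal Python sketch of an ordinary least-squares fit, one common way to compute such a line. The column does not say how the line in Figure 1 was calculated, so the method, the function names (fit_line, direction), the tolerance and the sample values are all illustrative assumptions.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def direction(slope, tolerance=1e-9):
    """Describe the direction of the relationship from the sign of the slope.

    The tolerance is a hypothetical cutoff chosen for this sketch only.
    """
    if slope > tolerance:
        return "positive"
    if slope < -tolerance:
        return "negative"
    return "no linear trend"


if __name__ == "__main__":
    # Made-up predicted vs. observed values, for illustration only.
    predicted = [1, 2, 3, 4, 5]
    observed = [1.2, 1.9, 3.2, 3.8, 5.1]
    slope, intercept = fit_line(predicted, observed)
    print(f"slope={slope:.2f}, intercept={intercept:.2f}, direction={direction(slope)}")
```

The sign of the slope gives only the direction of the relationship. It says nothing about how tightly the points cluster around the line, which is a separate question of strength.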
The utility of a scatter diagram for quality assessment lies in showing whether any two variables measured in a process are correlated or uncorrelated. Finding correlations helps suggest possible causal relationships among variables, which can then be investigated to find the root causes of problems.
The basic steps involved in constructing a scatter diagram are as follows:
1. Define the x variable on a scatter diagram form or sheet of graph paper. This variable is often thought of as the cause variable and is typically plotted on the horizontal axis.
2. Define the y variable on the diagram. This variable is often thought of as the effect variable and is typically plotted on the vertical axis.
3. Number the pairs of x and y variable measurements consecutively. Record each pair of measures for x and y in the appropriate columns. Make sure the x measures and corresponding y measures remain paired so the data are accurate.
4. Plot the x and y data pairs on the diagram. Locate the x value on the horizontal axis; then locate the y value on the vertical axis. Place a point on the graph where these two intersect.
5. Study the shape formed by the series of data points plotted. In general, conclusions can be made about the association between the two variables (x and y) based on the shape of the scatter diagram. Scatter diagrams that display associations between two variables tend to look like ellipses or even straight lines.
Scatter diagrams on which the plotted points appear in a circular fashion show little or no correlation between x and y.
Scatter diagrams on which the points form a pattern of increasing values for both variables show a positive correlation; as values of x increase, so do values of y. The more tightly the points are clustered in a linear fashion, the stronger the positive correlation, or the association between the two variables.
Scatter diagrams on which one variable increases in value while the second variable decreases in value show a negative correlation between x and y. Again, the more tightly the points are clustered in a linear fashion, the stronger the association between the two variables.
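The construction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (record_pairs, text_scatter), the character-grid rendering and the temperature and cold counts are all made up for the example, and a charting tool would normally replace the text grid.

```python
def record_pairs(x_measures, y_measures):
    """Number the measurements consecutively and keep each x with its y."""
    if len(x_measures) != len(y_measures):
        raise ValueError("every x measure needs a corresponding y measure")
    return [(n, x, y) for n, (x, y) in enumerate(zip(x_measures, y_measures), start=1)]


def text_scatter(records, width=40, height=12):
    """Render numbered (n, x, y) records as a character grid.

    x runs left to right along the horizontal axis,
    y runs bottom to top along the vertical axis.
    """
    if not records:
        raise ValueError("no data pairs to plot")
    xs = [x for _, x, _ in records]
    ys = [y for _, _, y in records]
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1  # avoid dividing by zero if all x are equal
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for _, x, y in records:
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the chart
    lines = ["|" + "".join(cells) for cells in grid]
    lines.append("+" + "-" * width)  # horizontal axis
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up data: outside temperature (x) and weekly cases of the common cold (y).
    temperatures = [30, 35, 40, 45, 50, 55, 60, 65]
    colds = [42, 40, 33, 30, 24, 20, 15, 12]
    print(text_scatter(record_pairs(temperatures, colds)))
```

With these invented numbers the points fall from upper left to lower right, the shape described above for a negative correlation.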
If there appears to be a relationship between two variables, they are said to be correlated. Both negative and positive correlations are useful for continuous process improvement.
Scatter diagrams show only that a relationship exists, not that one variable causes the other. Further analysis using advanced statistical techniques can quantify how strong the relationship is between two variables.
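One widely used measure of the strength of a linear relationship is the Pearson correlation coefficient, r, which ranges from -1 (perfect negative) through 0 (no linear relationship) to +1 (perfect positive). The column does not name a specific technique, so the choice of r here is an assumption, and the function name pearson_r and the sample data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products of deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable does not vary")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up data: outside temperature (x) and weekly cases of the common cold (y).
    temperatures = [30, 35, 40, 45, 50, 55, 60, 65]
    colds = [42, 40, 33, 30, 24, 20, 15, 12]
    print(f"r = {pearson_r(temperatures, colds):.3f}")
```

A value of r close to -1 or +1 means the points cluster tightly around a straight line. Like the diagram itself, r describes association only and does not show that one variable causes the other.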
NOTE: This column is adapted from The Quality Improvement Handbook by ASQ's Quality Management Division and edited by John E. Bauer, Grace L. Duffy and Russell T. Westcott (Milwaukee: ASQ Quality Press, 2002).