Q: My organization manufactures soft ferrite cores, which involves a typical process consisting of variables in many departments, such as pressing, sintering and grinding. The rejection rate is high—about 10%—and some of the variables are not in control. Then, if we control one, another is disturbed.
Because there are so many variables to control, it’s difficult to analyze the root cause of the problem. We can correlate two or three variables, but not all. Please help me find a way to overcome this.
New Delhi, India
A: Pardon the cliché, but this is a classic good news/bad news problem. First, the bad news: This is not a simple problem, so simple solutions may not work. But you already know that. The good news is the failure rate is 10%. How can that be good news? You have a significant opportunity, so it will be easier to make a significant difference.
Ferrites are ceramic compounds that include iron as the principal ingredient and other materials, such as zinc, nickel or manganese. The materials are milled into fine powders, mixed, pressed into a mold and sintered (baked).
The sintering oven usually has an atmosphere capable of controlling the chemical reaction and preventing unwanted contaminants. The sintered core may be ground to finalize the dimensions and perhaps adjust the electrical properties.
The finished product can be used in transformers, solenoids, power supplies and even loudspeakers. Ferrites have dozens of magnetic and electrical characteristics, but the critical performance measures depend on the application.
You mentioned that some of the variables are not in control, and as soon as you control one, another is disturbed. My guess is that you have multiple failure modes. If the failures are intermittent, it may seem as though you are chasing a moving target.
Sometimes, "intermittent" failures are quite predictable. I used to work on a spot-welding process with intermittent "cold" welds. Most of the time, all of the welds were good. But every so often, a high percentage of welds would fail. It turned out there were too many machines drawing from the same power source, and on the rare occasion when all of the equipment demanded power at the same time, the voltage would drop, and cold welds would result.
I suggest you start by collecting as much historical data as you can. Analyze the data, and look for patterns between inputs and failures. There is no guarantee this method will work, but you might get lucky and find some opportunities. The key is to use the right tools for this analysis.
Correlation can be helpful, but it works well only when both metrics are measured on a continuous scale (variable data) rather than recorded as pass/fail. For example, if parts are rejected for insufficient flux density, try correlating the flux density to the raw material particle size, mixing time, oven temperature and other variable inputs.
Correlation is subject to pitfalls, and correlation coefficients can be inflated by outliers (see Figure 1). In addition, you may miss important relationships if you rely solely on correlation coefficients to identify problems (see Figure 2). For these reasons, correlation analysis always should be supplemented by scatter plots.
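Both pitfalls are easy to reproduce with a few lines of Python. The sketch below is only an illustration: the data are made up, the helper name pearson_r is hypothetical, and the examples are not taken from Figure 1 or Figure 2. The first data set has no relationship until one extreme point is added, and the second has a strong curved relationship that the coefficient misses.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: eight runs with no linear relationship between x and y.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 3, 6, 4, 5, 3, 6, 4]
r_without_outlier = pearson_r(x, y)

# The same eight runs plus one extreme point far from the rest.
r_with_outlier = pearson_r(x + [30], y + [20])

# Made-up data: y depends entirely on x, but along a curve (y = x squared).
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [v ** 2 for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)

print(f"no outlier:   r = {r_without_outlier:.2f}")
print(f"one outlier:  r = {r_with_outlier:.2f}")
print(f"curved trend: r = {r_curve:.2f}")
```

With these numbers, a single outlier moves the coefficient from zero to above 0.9, while the curved relationship scores zero even though x fully determines y. A scatter plot would expose both situations at a glance.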
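Another limitation of correlation is that it cannot detect interactions. The process output may be fine if any single process variable drifts to the specification limit, as long as the other process factors stay at the nominal setting. Failures may appear only when two variables drift at the same time, and a coefficient calculated for one input at a time will not reveal that.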
But processes can be complex. Imagine a two-by-two matrix with two key process variables. If your process has an interaction, then three of the four quadrants in the matrix will have low failure rates, but one of the quadrants will generate the majority of the failures. With a little manipulation of the historical process data, you could create categories for each variable (either low or high) and then compare the failure rates for each combination of factor settings (low-low, low-high, high-low and high-high).
To minimize the failures, you will need to restrict the operating range of at least one of the process variables. If all four quadrants have a similar failure rate, try some other process factors.
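The two-by-two comparison can be done in a spreadsheet, but here is a minimal Python sketch of the same idea. Everything in it is assumed for illustration: the function name, the choice of oven temperature and mixing time as the two variables, the split points and the data, which are fabricated to show what an interaction looks like.

```python
from collections import defaultdict

def quadrant_failure_rates(records, split_a, split_b):
    """Failure rate for each low/high combination of two process variables.

    records: iterable of (value_a, value_b, failed) tuples.
    split_a, split_b: cut points; values above the cut count as "high".
    """
    counts = defaultdict(lambda: [0, 0])  # quadrant -> [failures, parts]
    for value_a, value_b, failed in records:
        quadrant = (
            "high" if value_a > split_a else "low",
            "high" if value_b > split_b else "low",
        )
        counts[quadrant][0] += int(failed)
        counts[quadrant][1] += 1
    return {q: failures / parts for q, (failures, parts) in counts.items()}

def make_parts(oven_temp, mix_time, n_parts, n_failed):
    """Fabricated history: n_parts parts, the first n_failed of them rejected."""
    return [(oven_temp, mix_time, i < n_failed) for i in range(n_parts)]

# Fabricated data in arbitrary units: only the high-high corner misbehaves.
records = (
    make_parts(1190, 40, 20, 1)
    + make_parts(1190, 80, 20, 1)
    + make_parts(1250, 40, 20, 1)
    + make_parts(1250, 80, 20, 8)
)

rates = quadrant_failure_rates(records, split_a=1220, split_b=60)
for (temp_level, mix_level), rate in sorted(rates.items()):
    print(f"oven temp {temp_level:<4} / mixing time {mix_level:<4}: {rate:.0%}")
```

In this fabricated example, three quadrants fail at 5% and the high-high quadrant fails at 40%, which is the pattern an interaction produces. The median of each variable is one reasonable choice for the split points, but that is a judgment call.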
Charts and tools
For failures related to attributes such as cracks, voids and burrs, correlation may not work. In this case, try bar charts. Compare the failure rates by shift (day vs. night), mold number or position in the oven (top, middle or bottom). Large differences in the failure rate will help you isolate the problem. Try this technique with variables you have not yet considered, such as lot changes in the raw materials or the number of production runs since the most recent mold refurbishment.
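If the inspection records are available electronically, the comparison behind such a bar chart takes only a few lines. The following Python sketch assumes each part is stored as a dictionary with hypothetical field names (oven_position, shift, failed), and the data are fabricated so that oven position matters and shift does not.

```python
from collections import Counter

def failure_rate_by(parts, factor):
    """Failure rate for each level of one categorical factor."""
    totals = Counter()
    failures = Counter()
    for part in parts:
        level = part[factor]
        totals[level] += 1
        failures[level] += int(part["failed"])
    return {level: failures[level] / totals[level] for level in totals}

# Fabricated records: 40 parts per oven position, alternating shifts.
parts = []
for position, n_failed in [("top", 2), ("middle", 2), ("bottom", 8)]:
    for i in range(40):
        parts.append({
            "oven_position": position,
            "shift": "day" if i % 2 == 0 else "night",
            "failed": i < n_failed,
        })

by_position = failure_rate_by(parts, "oven_position")
by_shift = failure_rate_by(parts, "shift")

# Crude text bar chart: one '#' per percentage point of failures.
for title, rates in [("oven position", by_position), ("shift", by_shift)]:
    print(title)
    for level, rate in rates.items():
        print(f"  {level:<8}{'#' * round(rate * 100)} {rate:.0%}")
```

Here the bottom of the oven fails at 20% against 5% elsewhere, while day and night shifts are both at 10%. The same function can be pointed at any other categorical field, such as mold number or raw material lot.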
Also consider trend charts. Some people limit investigations to the specific lot that failed. But a trend chart may show that a critical process average shifted a week before the failure. Failures become more likely after such a shift, but they may not appear until a combination of factors occurs simultaneously.
Don’t forget standard tools such as fishbone diagrams. Structured brainstorming also can generate new insights into potential root causes. A process failure mode and effects analysis can identify the system’s potential weaknesses, such as poor detection.
It is always better to prevent failures, but if you can detect them at an intermediate process step, you can react faster and prevent defects from progressing to the next step of production. If you already tried these tools, simply review the historical failure data and make sure you have not missed any failure modes.
You also could try a tracking study. For the purpose of this study, measure everything you can think of at every step of the process, including things you don’t normally measure. Track the material and try to maintain the production sequence as the material moves through subsequent operations. This will allow you to look for trends within the lot instead of just looking at variation between lots.
If the failure rate changes from the beginning to the end of the run, start digging deeper. Some mixers have a tendency to sort by particle size, so if the mixer runs too long, all the large particles end up on the top. This may cause problems either at the beginning or the end of the batch, depending on how the mixer is unloaded.
When you think you have found the root cause, prove it. If possible, conduct a series of runs, alternating between the "before" and "after" conditions. If the failure appears and disappears as expected every time you change the process, you can be confident you found the root cause.
You can also estimate how confident to be. In the worst case, your fix has nothing to do with the failure, and each run has a 50/50 chance of behaving as expected purely by coincidence. When the fix is implemented, you expect the failure rate to go down, and when the fix is removed, you expect the failure rate to go back up. If you make this change four times (on, off, on, off) and the process behaves as expected every time, you can be confident you have the root cause.
The probability of being wrong is (0.5)^4 = 0.0625. If you do this six times (on, off, on, off, on, off) and the process behaves as expected for all six runs, the probability of being wrong is reduced to (0.5)^6 = 0.015625. If any of the runs deviate from your expectation, you probably do not have the root cause, so keep looking.
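The arithmetic is simple enough to check by hand, but a short Python sketch makes it easy to see how many confirmation runs a given risk level requires. The function names are hypothetical, and the calculation rests on the same assumption as above: each run has an independent 50/50 chance of matching your expectation by coincidence.

```python
import math

def chance_of_coincidence(n_runs, p_match=0.5):
    """Probability that every run matches expectations purely by chance."""
    return p_match ** n_runs

def runs_needed(max_risk, p_match=0.5):
    """Smallest number of consecutive matching runs that brings the
    chance of coincidence down to max_risk or below."""
    return math.ceil(math.log(max_risk) / math.log(p_match))

four_runs = chance_of_coincidence(4)
six_runs = chance_of_coincidence(6)
print(f"4 runs: {four_runs}")
print(f"6 runs: {six_runs}")
print(f"runs needed for 5% risk: {runs_needed(0.05)}")
print(f"runs needed for 1% risk: {runs_needed(0.01)}")
```

Four and six runs give 0.0625 and 0.015625, matching the figures above. Under the same assumption, five matching runs are enough to get below a 5% risk and seven to get below 1%.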
Be persistent and get other people involved. You will find answers soon enough. If all else fails, get a Six Sigma Black Belt or statistician to help you, and try factorial experiments.
Consultant and Master Black Belt