How To Analyze Reliability Data For Repairable Products
by Necip Doganaksoy, Gerald J. Hahn and William Q. Meeker
Leveraging powerful—yet simple—methods for reliability data analysis of repairable products or systems can help you stay on the right track. Manufacturers of products like locomotives, automobiles, computers and even washing machines use these analytical methods to:
- Evaluate the impact of removing a particular failure mode through product redesign.
- Assess alternative warranty strategies.
- Develop advertising claims about their product vs. the competition’s product.
- Establish an optimum maintenance schedule.
- Compare the performance of a current product to previous designs.
We will describe an approach for addressing such activities.
Repair Recurrence Data
Typically, data on repairable products are obtained through field, repair or warranty information. Repairable products—unlike nonrepairable ones—can lead to multiple event (such as repair) times on the same unit, or recurring events, resulting in recurrence data.
Recurrence data require special methods of analysis. Even though this article focuses on product repairs, the method described here also can be used in the analysis of recurrence data encountered in other applications such as:
- Customer purchase behavior over time.
- Repair costs.
- Disease recurrences.
The statistical distribution of the time between subsequent repairs may be quite different from that of the time to the first repair. A simple distribution, such as the Weibull or the lognormal, is frequently fitted to life (or time of first failure) data. But this is usually not appropriate for recurrence data. Fortunately, a simple method is available for analyzing such data. This method, moreover, does not require any assumptions about independence or the shape of the recurrence rate function over time.
Consider the example of a manufacturer of locomotives.1 This manufacturer received separate orders from a railroad for 15 and 18 locomotives, respectively.
The locomotives were built about one year apart. The locomotives from the first order experienced about 700 days of service in the field compared to the 400 days for those from the second order.
Each locomotive has six braking grids. To slow the locomotive, these brakes are used in place of the power input to the electric motor. The motor then begins to act as a generator, powered by the inertia of the locomotive. The electrical resistance of the braking grid provides a load to the motor. This electrical load translates into mechanical resistance for the motor and, in turn, to the wheels connected to it. This causes the locomotive to decelerate.
The braking grids for the separate orders came from different production lots manufactured about a year apart. For each of the 33 locomotives, records were maintained of the times (measured in days of operation) when the braking grids were repaired.
With the collected data, the reliability performance of the braking grids from the two production lots can be characterized and compared.
The repair times for each locomotive are plotted in an event plot in Figure 1. Each line in this plot shows the history of a locomotive. Each X on a line is the locomotive’s age in days in service when a braking grid was repaired. The length of each line tracks each locomotive’s length of service. For example, the 15th locomotive from order one has been in service for 657 days. Its braking grids were repaired after 317 and after 498 days in service.
In the first 400 days of service, the locomotives from order two experienced a higher rate of braking grid repairs than those from order one. Specifically, there were a total of 26 failures on the 18 units built for order two vs. 10 failures on the 15 units built for order one.
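The comparison above comes down to per-unit arithmetic. A minimal sketch in Python, using only the counts quoted in the text (the variable names are ours, chosen for illustration):

```python
# Repairs observed in the first 400 days of service, as quoted in the text.
repairs_first_400_days = {"order one": 10, "order two": 26}
units_built = {"order one": 15, "order two": 18}

# Average number of braking grid repairs per locomotive for each order.
repairs_per_unit = {
    order: repairs_first_400_days[order] / units_built[order]
    for order in units_built
}

for order, value in repairs_per_unit.items():
    print(f"{order}: {value:.2f} repairs per locomotive")
```

On these counts, order two averages roughly twice as many repairs per locomotive as order one over the same 400 days.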
The purpose of the analysis presented below is to further quantify such differences.
The MCF Plot
Many questions about the reliability of a repairable product (the locomotives in our example), based upon field repair recurrence data, can be answered by estimating its mean cumulative function (MCF). The MCF of a product at age t is defined as the average number of failures per unit up to time t.
In the example, the number of locomotives running at a particular time remains the same (or almost the same) over time for each of the two orders. Thus, a simple estimator of the MCF at age t is the sum of all repairs by age t divided by the total number of units in service. In this example, a total of five repairs occurred within the first 300 days for the 15 locomotives (or systems) for order one.
Accordingly, an estimate of the MCF at 300 days is:
5 repairs ÷ 15 locomotives = 0.33 repairs per locomotive.
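The simple estimator can be sketched in a few lines of Python. The function name `simple_mcf` and the three-unit fleet are hypothetical and are not the locomotive data. The sketch assumes, as in the example, that every unit has been observed for at least the age of interest.

```python
from typing import Dict, List


def simple_mcf(repair_ages: Dict[str, List[float]], age: float) -> float:
    """Simple MCF estimate at `age`: total repairs by that age / number of units.

    Assumes every unit has been observed for at least `age` days; it is not
    appropriate when exposure time varies across units.
    """
    total_repairs = sum(
        1 for ages in repair_ages.values() for a in ages if a <= age
    )
    return total_repairs / len(repair_ages)


# Hypothetical repair ages (days in service) for a three-unit fleet.
# These are NOT the locomotive data; they only illustrate the calculation.
fleet = {
    "unit A": [120.0, 310.0],
    "unit B": [],
    "unit C": [280.0],
}

# Two repairs occurred by day 300 across three units.
print(simple_mcf(fleet, 300.0))

# The worked example from the text: 5 repairs on 15 locomotives by 300 days.
print(round(5 / 15, 2))
```

Evaluating the function over a grid of ages gives the points for an MCF plot.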
The sample estimates for the MCF then are plotted against product age. This plot shows whether the recurrence rate (repair rate in the braking grid application) is increasing (MCF increases at an increasing rate), decreasing (MCF increases at a decreasing rate) or staying relatively constant (MCF is increasing approximately linearly) over time.
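One way to read the shape of an MCF plot numerically is to compare its average slope over successive age intervals. The sketch below does this for hypothetical MCF values chosen only to illustrate an increasing rate. The helper `interval_rates` is our own name, not part of any package.

```python
from typing import List, Tuple


def interval_rates(
    ages: List[float], mcf: List[float]
) -> List[Tuple[float, float, float]]:
    """Return (start_age, end_age, average repairs per unit per day) per interval."""
    return [
        (ages[i], ages[i + 1], (mcf[i + 1] - mcf[i]) / (ages[i + 1] - ages[i]))
        for i in range(len(ages) - 1)
    ]


# Hypothetical MCF estimates at four ages (NOT taken from the figures).
ages = [0.0, 100.0, 200.0, 300.0]
mcf = [0.0, 0.1, 0.3, 0.6]

rates = interval_rates(ages, mcf)
for start, end, rate in rates:
    print(f"{start:.0f}-{end:.0f} days: {rate:.4f} repairs per unit per day")

# Slopes that grow from one interval to the next indicate an increasing
# recurrence rate; shrinking slopes indicate a decreasing rate.
increasing = all(r2 > r1 for (_, _, r1), (_, _, r2) in zip(rates, rates[1:]))
print("increasing rate:", increasing)
```

With real data, the sampling variability of the estimates should be taken into account before reading much into small differences between slopes.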
MCF Plot for Order One
Figure 2 shows a plot (the center line) of the estimated MCF against days in service for the braking grids for order one. This plot also shows (pointwise) approximate 95% confidence intervals (the two outer lines) around the MCF estimates.
Figure 2 shows there were few failures before 250 days for the locomotives built in order one. After 250 days in service, the repair rate for these locomotives appears to have increased sharply and thereafter remained relatively constant over time.
Comparing The Two Orders
Figure 3 plots the estimated MCF against days in service of the braking grids for both locomotive orders. These plots show a clear differentiation in the estimated MCFs between the two orders. Braking grid repairs tend to occur earlier in order two than in order one.
Table 1 shows MCF estimates and approximate 95% confidence intervals after one year in service. The confidence limits do not overlap, suggesting the difference at this age is statistically significant. A formal analysis showed a statistically significant difference between the MCFs of the two orders for most ages.3
These results could be attributed to differences in the product, the field use environment or a combination of both. The usage environments were similar for both orders. Between the times the locomotives in the two orders were manufactured, the supplier changed the die that cut the braking grids. This suggests a root cause for the differences that needs to be addressed.
In the locomotive example, further analysis would have been possible had more detailed information been gathered. For instance, the identities of the individual braking grids, and the positions of the repaired braking grids in the locomotive, may have helped determine whether any positions were especially vulnerable. Such data also may have permitted analyses of the times to repair for the individual braking grids (as a nonrepairable product). More extensive details will be recorded and maintained in the future.
MCF Software Options
In estimating the MCF, the exposure time could vary appreciably from unit to unit within orders—unlike in our locomotive example. This may be because of staggered entry of units into the field or differences in use rates. As a result of such variations, the number of systems being observed varies during the period of observation.
In these cases, more complicated calculations are required to estimate the MCF.4,5 These more general methods are, in fact, the ones used in the software packages that permit analyses of recurrence data. Such offerings include:
- SAS QC PROC Reliability: http://support.sas.com/rnd/app/qc/qc.html.
- SPLIDA add-on to S-PLUS: www.public.iastate.edu/~splida.
- Weibull++: www.reliasoft.com/Weibull (case sensitive).
SAS, JMP and Minitab are general purpose packages. SPLIDA and Weibull++ specialize in reliability analyses and modeling.
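To make the more general calculation concrete, here is a minimal Python sketch of a nonparametric MCF estimate for units with different observation lengths. It follows the usual idea of dividing the repairs at each age by the number of units still under observation at that age. It is an illustration under our own assumptions (the function name `mcf_estimate`, the data layout and the three hypothetical units), not the implementation used by any of the packages above, and it omits confidence intervals.

```python
from typing import Dict, List, Tuple


def mcf_estimate(
    histories: Dict[str, Tuple[List[float], float]]
) -> List[Tuple[float, float]]:
    """Nonparametric MCF estimate when observation length differs by unit.

    `histories` maps a unit id to (repair ages, age at end of observation).
    At each distinct repair age, the estimate increases by
    (repairs at that age) / (units still under observation at that age).
    Assumes no repair is recorded after a unit's end of observation.
    """
    repair_ages = sorted({a for ages, _ in histories.values() for a in ages})
    points = []
    cumulative = 0.0
    for t in repair_ages:
        at_risk = sum(1 for _, end in histories.values() if end >= t)
        repairs = sum(ages.count(t) for ages, _ in histories.values())
        cumulative += repairs / at_risk
        points.append((t, cumulative))
    return points


# Hypothetical units with staggered exposure (NOT the locomotive data).
units = {
    "unit A": ([50.0, 150.0], 200.0),
    "unit B": ([100.0], 120.0),
    "unit C": ([], 80.0),
}

for age, value in mcf_estimate(units):
    print(f"age {age:.0f} days: estimated MCF = {value:.3f}")
```

When every unit is observed for the same length of time, the number under observation is constant and this reduces to the simple estimator: total repairs by age t divided by the number of units.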
References
1. Necip Doganaksoy and Wayne B. Nelson, “A Method to Compare Two Samples of Recurrence Data,” Lifetime Data Analysis, Vol. 4, No. 1, 1998.
2. Wayne B. Nelson, Recurrent Events Data Analysis for Product Repairs, Disease Recurrences and Other Applications (ASA-SIAM Series on Statistics and Applied Probability), American Statistical Assn., 2003.
3. Doganaksoy and Nelson, “A Method to Compare Two Samples of Recurrence Data,” see reference 1.
4. Nelson, Recurrent Events Data Analysis for Product Repairs, Disease Recurrences and Other Applications (ASA-SIAM Series on Statistics and Applied Probability), pp. 37-43, see reference 2.
5. William Q. Meeker and Luis A. Escobar, Statistical Methods for Reliability Data, John Wiley & Sons, 1998, pp. 397-398.
NECIP DOGANAKSOY is a statistician and Six Sigma Master Black Belt at the GE Global Research Center in Schenectady, NY. He has a doctorate in administrative and engineering systems from Union College in Schenectady. Doganaksoy is a Fellow of the American Statistical Assn. and a Senior Member of ASQ.
GERALD J. HAHN is a retired manager of statistics at the GE Global Research Center in Schenectady, NY. He has a doctorate in statistics and operations research from Rensselaer Polytechnic Institute in Troy, NY, where he is also an adjunct faculty member. Hahn is a Fellow of the American Statistical Assn. and ASQ.
WILLIAM Q. MEEKER is professor of statistics and distinguished professor of liberal arts and sciences at Iowa State University in Ames, IA. He has a doctorate in administrative and engineering systems from Union College in Schenectady, NY. Meeker is a Fellow of the American Statistical Assn. and a Senior Member of ASQ.