
**Using Degradation Data for Product Reliability Analysis**

#### **A case study shows how this type of data can provide more precise results in assessing reliability**

*by William Q. Meeker, Necip Doganaksoy and Gerald J. Hahn*

This is the third installment in a series of articles on reliability improvement and data analysis. The first article, "Reliability Improvement: Issues and Tools," ran in Quality Progress in May 1999. QP published the second installment, "Product Life Data Analysis: A Case Study," in its June 2000 issue.

High-reliability systems require individual components to have extremely high reliability for a long time. Often, the time available for product development is short, imposing severe constraints on reliability testing.

Traditionally, methods for the analysis of censored failure-time
data are used to extrapolate mission reliability from the longest
test times--even though there may be few observed failures.^{1}
This significantly limits the accuracy and precision of the conclusions,
motivating us to search for better methods.

Many failure mechanisms can be traced to an underlying degradation process. Degradation eventually leads to a reduction in strength or a change in physical state that causes failure. Degradation measurements, when available, often provide more information than failure-time data for assessing and improving product reliability.

This article provides a brief introduction on how one can, in
some situations, leverage degradation data for reliability prediction
and improvement, and it presents a simple method for analyzing
such data. See Statistical Methods for Reliability Data^{2}
and Applied Reliability^{3} for more
details, examples and references.

In some studies (on tire wear, for example), degradation over time is measured directly. In other cases, degradation cannot be observed directly, but measures of product performance degradation (such as power output) are available. Sometimes, degradation is measured continuously. In other applications, measurements become available only at discrete inspection times. In any case, the advantages of using degradation data are considerable (see the sidebar "Advantages of Using Degradation Data" on p. 62).

**Case study**

In some applications, one deals with hard failures, resulting in a complete loss of functionality--when the filament in a light bulb fails, for example. At other times, failures are soft, occurring when a critical performance measurement reaches a predefined level. The component continues to function, but unsatisfactorily. As Wayne Nelson suggests, although the definition of failure is often arbitrary, it should be meaningful.^{4}

Let's consider a study adapted from William Q. Meeker and Luis
A. Escobar,^{5} which involves a gallium
arsenide (GaAs) laser for telecommunications systems. As the device
ages, more current is required to obtain the required light output.
The device has a built-in feedback circuit to maintain constant
light output. A unit is defined to have failed at the time at
which a 10% current increase is first needed.

Fifteen lasers were run on life test at the accelerated temperature of 80° C ambient for 4,000 hours. By this time, three lasers had failed--at 3,374 hours, 3,521 hours and 3,781 hours--using the preceding failure definition.

The product needed to operate for at least 200,000 hours over 20 years at a temperature of 20° C. From experience, the engineers conservatively estimated that the 80° C test would provide an acceleration factor of approximately 40 in time to failure. To allow for the needed redundancy, an estimate of the probability of failure at 200,000/40 = 5,000 hours (equivalent to over 20 years in operation) was desired.

All analyses were conducted using the SLIDA collection of S-Plus
functions.^{6} Other software packages
are referenced later.

The lognormal distribution was felt, from experience, to be an appropriate time-to-failure model. Figure 1 is a lognormal probability plot of the data, showing the three failures. The 12 unfailed units at 4,000 hours are shown at the top of the plot. The straight line is the maximum likelihood (ML) estimate of F(t), the probability of failure by time t, using traditional methods (based on estimates of the distribution parameters µ and σ).

This fit takes into account the 12 unfailed units (explaining
why the ML line does not seem to fit the plotted points and why
maximum likelihood, rather than simple linear regression, was
used). Also shown are 95% confidence limits on the fitted line.
See our previous article "Product
Life Data Analysis: A Case Study"^{7}
for an introductory discussion of these methods.

The estimate of F(5,000) is 0.658 with an approximate 95% confidence interval of 0.126 to 0.962. This extremely wide interval reflects the fact that the analysis is based on a small number of failures.
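The censored ML fit just described can be reproduced with standard numerical tools. Below is a minimal sketch in Python rather than the S-Plus/SLIDA environment used in the study (`numpy` and `scipy` are assumed available): failed units contribute the lognormal density to the likelihood, and unfailed units contribute the survival probability at their run-out time.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

# Observed failure times (hours) and the 12 run-outs censored at 4,000 hours,
# all on the log scale (lognormal time <=> normal log-time)
failures = np.log([3374.0, 3521.0, 3781.0])
censored = np.log(np.full(12, 4000.0))

def neg_log_lik(params):
    """Negative log-likelihood for a lognormal fit with right censoring:
    failures contribute the density, unfailed units the survival function."""
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # optimize log(sigma) to keep sigma positive
    ll = norm.logpdf(failures, mu, sigma).sum()
    ll += norm.logsf(censored, mu, sigma).sum()
    return -ll

res = minimize(neg_log_lik, x0=[np.log(4000.0), 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])

# ML estimate of F(5,000), the probability of failure by 5,000 hours;
# this should come close to the 0.658 reported in the article
F_5000 = norm.cdf((np.log(5000.0) - mu_hat) / sigma_hat)
print(round(F_5000, 3))
```

The confidence interval quoted in the article would additionally require the observed information matrix or a likelihood-ratio profile, which this sketch omits.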

**Estimation of 5,000-hour failure probability from 4,000-hour degradation data**

The preceding analysis did not utilize the current measurements, except in the go/no-go sense of calling a failure when the current increase exceeded 10%. Figure 2 (p. 60), a plot of the degradation data at 4,000 hours, seems to provide additional useful information. For example, the degradation path of one of the units suggests it was close to failing by 4,000 hours. This added information is leveraged in the following analysis.

The basic approach is to generate pseudofailure times for unfailed units by extrapolating their degradation paths and including these in the analysis. Figure 3 shows simple linear regression lines fitted to the degradation paths of the 12 unfailed devices extrapolated to 5,000 hours.

By this time, the extrapolated degradation for three added units exceeded a 10% increase in current, resulting in pseudofailure times of 4,194 hours, 4,721 hours and 4,995 hours, in addition to the three failures prior to 4,000 hours. The other nine units remained censored.
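The pseudofailure construction amounts to a per-unit least squares fit followed by a threshold crossing. A minimal sketch in Python (the inspection times and measurements below are hypothetical, for illustration only; the article's raw degradation data are not reproduced here):

```python
import numpy as np

def pseudofailure_time(hours, pct_increase, threshold=10.0, horizon=5000.0):
    """Fit a straight line (least squares) to one unit's degradation path
    (percent current increase vs. hours) and extrapolate it to the time at
    which it first crosses the failure threshold. Returns the crossing
    time, or None if the unit stays censored through the horizon."""
    slope, intercept = np.polyfit(hours, pct_increase, deg=1)
    if slope <= 0:
        return None  # no upward trend; unit remains censored
    t_cross = (threshold - intercept) / slope
    return t_cross if t_cross <= horizon else None

# Hypothetical degradation path for one unfailed unit: inspection hours
# and the measured percent increase in operating current at each one
hours = np.array([250.0, 500.0, 1000.0, 2000.0, 3000.0, 4000.0])
pct = np.array([0.6, 1.2, 2.3, 4.6, 7.1, 9.4])

t_pseudo = pseudofailure_time(hours, pct)
```

Units whose extrapolated line crosses 10% before the 5,000-hour horizon receive a pseudofailure time; the rest are treated as censored at 5,000 hours in the subsequent lognormal fit.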

Figure 4 is a lognormal probability plot of the six failure times (three observed failures and three pseudofailures), with nine censored units, at 5,000 hours.

The ML estimate of the probability of failing by 5,000 hours (based on a lognormal fit to the data) is F(5,000) = 0.410 with an approximate 95% confidence interval of 0.197 to 0.657. Although this interval is still quite wide, it is much narrower than that from the failure-time analysis. (This interval, however, does not incorporate the added variability due to uncertainty in extrapolating from the degradation data.)

The censoring time of 5,000 hours to generate pseudofailures was chosen to minimize extrapolation. A further analysis, not shown here, allowing all devices to "fail" yielded a similar estimate but with a shorter confidence interval.

**Estimation of 5,000-hour failure probability from 2,000-hour degradation data**

To illustrate an important advantage of degradation
data analysis, let's analyze the data available after only 2,000
hours. This shorter test could allow an earlier release of a reliable
product and speedier corrective action on an unreliable one. However,
it requires more extrapolation and reliance on the assumed linear
degradation model. Since there were no failures at 2,000 hours,
standard failure-time analysis is not possible (although an upper
confidence bound on the failure probability at 2,000 hours can
be obtained using binomial distribution methods).
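As a sketch of that binomial bound: with zero failures among n units at the censoring time, the one-sided 100(1 - α)% upper confidence bound on the failure probability p solves (1 - p)^n = α. For the 15 lasers on test:

```python
# One-sided 95% upper confidence bound on the 2,000-hour failure
# probability when 0 of n units have failed: solve (1 - p)^n = alpha
n = 15        # units on test in the case study
alpha = 0.05
p_upper = 1.0 - alpha ** (1.0 / n)
print(round(p_upper, 3))   # roughly 0.18
```

Note that this bounds the failure probability only at the 2,000-hour censoring time; unlike the degradation analysis, it says nothing about reliability at 5,000 hours.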

Figure 5 (p. 64) shows the 2,000-hour degradation paths extrapolated to 5,000 hours. Seven paths now exceed a 10% current increase, resulting in pseudofailure times of 3,229, 3,514, 3,742, 4,047, 4,282, 4,781 and 4,969 hours. The eight remaining units continue to be censored at 5,000 hours.

Figure 6 (p. 64) is a lognormal probability plot of the pseudofailure times from the 2,000-hour degradation data. The ML estimate of the probability of failing by 5,000 hours is F(5,000) = 0.475 with an approximate 95% confidence interval of 0.248 to 0.712.

The results of this analysis did not differ much from those of the 4,000-hour analysis. The device did not meet reliability requirements (even if one, optimistically, used the lower confidence bound on the failure probability) and required redesign. However, this conclusion was reached 2,000 hours earlier than in the conventional failure-time analysis--an important practical advantage!

**Limitations of degradation data analysis**

Degradation data analyses need to be conducted cautiously,
recognizing the underlying assumptions. In our example, the degradation
paths were well-behaved, with little measurement error, allowing
pseudofailure times to be reasonably extrapolated. Not all degradation
processes are that simple. Other models and/or analysis methods
are needed if:

1. The sample paths are not linear (and cannot be transformed to become linear) or cannot be expected to be reasonably linear in extrapolation (what is "reasonably linear" depends on the degree of extrapolation).

2. There is substantial measurement error, causing the pseudofailure times to differ appreciably from the actual (unrealized) failure times.

3. Failures occur suddenly (instantaneous increases in degradation) with little correlation to degradation. Such behavior frequently indicates catastrophic failure due to a mechanism different from the measured degradation, implying that the degradation data provide little information about time to failure.

In some applications (such as disassembling a motor to measure wear), the degradation measurement itself impacts future degradation, or is destructive. This allows only a single degradation measurement at a strategically selected time on each unit.

In addition, pseudofailure times are not actual failure times. If the fitted lines do not provide a good extrapolation to the actual (unknown) failure times, the analysis could be badly biased. This danger is especially great if there is much extrapolation.

**Some technical comments**

Another way of expediting results is to conduct accelerated
life tests, based on an appropriate physical model. One can also
combine the two approaches--obtaining degradation data from accelerated
tests.

Simple degradation analyses can be implemented by standard statistical
methods (simple regression to estimate pseudofailure times and
maximum likelihood fitting of the resulting censored failure-time
data), using statistical packages such as SAS,^{8}
Minitab^{9} and S-Plus.^{10}
The SLIDA collection of S-Plus functions has built-in functions
to facilitate fitting separate regression lines for each unit.
Weibull++^{11} offers automated features
to conduct similar analyses.

When degradation is well-behaved and measurement error is small,
the simple method described here is often adequate. Meeker and
Escobar^{12} describe a more sophisticated
analysis method that accounts for measurement error (without explicitly
predicting failure times for unfailed units).

**Leveraging well-behaved degradation data**

Well-behaved degradation data can provide more precise
reliability estimates than times to failure alone and can permit
extrapolations without failures. This allows one to draw tentative
conclusions earlier--often an important practical advantage. This
article has focused on statistical methods for leveraging well-behaved
degradation data.

Our ability to perform such analyses makes the mainly nonstatistical task of identifying a well-behaved degradation measurement--and one that is a true precursor of failure that can be readily obtained--of paramount importance.

**REFERENCES**

1. Necip Doganaksoy, Gerald J. Hahn and William Q. Meeker, "Product Life Data Analysis: A Case Study," Quality Progress, June 2000.

2. William Q. Meeker and Luis A. Escobar, Statistical Methods for Reliability Data (New York: John Wiley & Sons, 1998).

3. Paul A. Tobias and David C. Trindade, Applied Reliability, second edition (New York: Van Nostrand Reinhold, 1995).

4. Wayne Nelson, Accelerated Testing: Statistical Models, Test Plans and Data Analyses (New York: John Wiley & Sons, 1990).

5. Meeker and Escobar, Statistical Methods for Reliability Data (see reference 2).

6. William Q. Meeker, SLIDA S-Plus Life Data Analysis Functions and Graphical User Interface, www.public.iastate.edu/~stat533/slida.html.

7. Doganaksoy, Hahn and Meeker, "Product Life Data Analysis: A Case Study" (see reference 1).

8. SAS/STAT User's Guide: Release 6.03 Edition (Cary, NC: SAS Institute, 1988).

9. Minitab User's Guide 2: Data Analysis and Quality Tools, Release 12 (State College, PA: Minitab, 1997).

10. S-Plus User's Manual, Version 2000 (Seattle: Statistical Sciences, 1999).

11. Life Data Analysis Reference-Weibull++ (Tucson, AZ: ReliaSoft Publishing, 1997).

12. Meeker and Escobar, Statistical Methods for Reliability Data (see reference 2).

BIBLIOGRAPHY

Hahn, Gerald J., Necip Doganaksoy and William Q. Meeker, "Reliability Improvement: Issues and Tools," Quality Progress, May 1999.

**WILLIAM Q. MEEKER** *is professor of statistics and distinguished
professor of liberal arts and sciences at Iowa State University,
Ames. He obtained a doctorate in administrative and engineering
systems from Union College in Schenectady, NY. He is an ASQ member.*

**NECIP DOGANAKSOY** *is a statistician at GE Corporate
Research and Development in Schenectady, NY. He obtained a doctorate
in administrative and engineering systems from Union College in
Schenectady, NY. He is an ASQ member.*

**GERALD J. HAHN** *is recently retired manager of applied
statistics at GE Corporate Research and Development in Schenectady,
NY. He holds a doctorate in statistics and operation research
from Rensselaer Polytechnic Institute in Troy, NY. He is an ASQ
Fellow.*

**Advantages of Using Degradation Data**

Some advantages of using relevant degradation data in reliability analyses, over or in addition to traditional failure-time data, are:

- More informative analyses--especially when there are few or no failures.
- Useful data often become available much earlier.
- Degradation, or some closely related surrogate, may allow direct modeling of the mechanism causing failure, provide more credible and precise reliability estimates and establish a firm basis for often needed extrapolations in time or stress.

In addition, degradation data may increase physical understanding, and, thereby, enable earlier rectification of reliability issues.

