ONE GOOD IDEA
Assessing comparability based on limited data
by Keith M. Bower and Abraham Germansderfer
Limited data availability complicates an assessment of whether two populations are comparable. Historically, comparability has been determined using a variety of techniques, including equivalency of means and variances, and (often incorrectly) Student’s two-sample t-test.1 But limited data greatly reduce the power of these methods, so an alternative method for demonstrating comparability is required.
Statistical equivalency tests,2 such as two one-sided t-tests (TOST), are widely accepted as a way to demonstrate comparability. The amount of data collected should ensure the test is adequately powered. When limited data are available, TOST may be unable to declare equivalency even when the two population means are equal.
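As a sketch of the usual TOST formulation, the two one-sided hypotheses and their test statistics can be written as follows. The symbols are assumptions introduced here for illustration and do not come from the article: \mu_A and \mu_B are the population means, \Delta > 0 is the equivalence margin (the "goalpost"), and s_p is the pooled standard deviation under an equal-variance assumption.

```latex
\begin{aligned}
H_{01}&:\ \mu_B - \mu_A \le -\Delta \quad\text{vs.}\quad H_{11}:\ \mu_B - \mu_A > -\Delta,\\
H_{02}&:\ \mu_B - \mu_A \ge \Delta \quad\text{vs.}\quad H_{12}:\ \mu_B - \mu_A < \Delta,\\
t_1 &= \frac{(\bar{x}_B - \bar{x}_A) + \Delta}{s_p\sqrt{1/n_A + 1/n_B}},\qquad
t_2 = \frac{\Delta - (\bar{x}_B - \bar{x}_A)}{s_p\sqrt{1/n_A + 1/n_B}},\\
s_p^2 &= \frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}.
\end{aligned}
```

Equivalence is declared at level \alpha only when both t_1 and t_2 exceed the critical value t_{1-\alpha,\ n_A+n_B-2}. With small samples the denominator is large, so both statistics can fall short of the critical value even when the sample means coincide.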
As an alternative approach, a statistical tolerance interval (TI) can be used to set the comparability criteria.3 TI calculations are typically available in statistical software packages and discussed in most introductory statistics textbooks. A TI covers a proportion (p) of a probability distribution (such as a normal distribution) with a certain confidence level (1 - α). For example, a 95/99% TI covers 99% of a population with 95% confidence. Data from the new process would need to fall inside the TI calculated from the old process to exhibit comparability.
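A minimal sketch of the TI criterion, assuming a two-sided interval for normally distributed data, might look like the following. The tolerance factor uses Howe's approximation with a Wilson-Hilferty approximation for the chi-square quantile, so it will differ slightly from the exact factors that statistical software reports. The function names and the data values are hypothetical and are not taken from the article.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def chi2_quantile_wh(q, df):
    """Approximate chi-square quantile (Wilson-Hilferty); not exact."""
    z = NormalDist().inv_cdf(q)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3


def tolerance_factor(n, coverage, confidence):
    """Approximate two-sided normal tolerance factor k (Howe's method)."""
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    chi2 = chi2_quantile_wh(1.0 - confidence, n - 1)
    return sqrt((n - 1) * (1.0 + 1.0 / n) * z * z / chi2)


def tolerance_interval(data, coverage, confidence):
    """Return (lower, upper) limits: sample mean -/+ k * sample SD."""
    k = tolerance_factor(len(data), coverage, confidence)
    m, s = mean(data), stdev(data)
    return m - k * s, m + k * s


def is_comparable(new_values, interval):
    """TI criterion: every new-process value must fall inside the interval."""
    lower, upper = interval
    return all(lower <= x <= upper for x in new_values)


# Hypothetical illustrative values, not data from the article.
old_process = [10.1, 9.8, 10.3, 10.0, 9.7, 10.2, 9.9, 10.4, 10.0, 9.6]
new_process = [10.2, 9.9, 10.5]

# A 95/99% TI: 95% confidence of covering 99% of the old-process population.
ti = tolerance_interval(old_process, coverage=0.99, confidence=0.95)
print(ti, is_comparable(new_process, ti))
```

For a real assessment, an exact tolerance factor from a validated statistical package would normally replace the approximation used here.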
Note that the TI approach has several disadvantages compared with TOST, including:
- The TI approach is not a hypothesis-based test, meaning a p-value is not generated.
- Comparability becomes harder to demonstrate as more new-process data are collected, because the probability that at least one value falls outside the interval by chance alone increases.
When insufficient data exist to power a statistical equivalency test such as TOST, the TI method may be an appropriate alternative. A useful technique for assessing the adequacy of each approach is a statistical performance assessment (SPA).
Consider a scenario in which 10 values are sampled from population A (the old process) and three values from population B (the new process). To calculate the SPA, assume the following:
- A and B are normally distributed with equal variances.
- The TOST goalpost is 2.5 times the standard deviation of A.
- A 90/99% TI will be calculated using data from A.
Under these assumptions, it is possible to calculate the probability of meeting each comparability criterion: the statistical power for the TOST approach, and the probability that all three values from population B fall inside the TI.
The results are shown in Figure 1 and Online Table 1.
If there is a one standard deviation difference between the means of A and B, there is about a 70% chance of incorrectly concluding the means are equal using TOST. But the chance jumps to more than 99% using the TI approach. Figure 1 shows the TI approach tends to conclude comparability more frequently than the TOST approach, regardless of the actual difference between the two means.
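An SPA of this kind can be approximated by simulation. The sketch below is a rough Monte Carlo version under several assumptions that the article does not state: TOST is run at a one-sided significance level of 0.05 with a pooled variance, the goalpost is 2.5 times the true (population) standard deviation of A, and the 90/99% TI is two-sided with an approximate tolerance factor. The function names are hypothetical, and the estimates should not be expected to reproduce Figure 1 exactly.

```python
import random
from math import sqrt
from statistics import NormalDist

N_A, N_B = 10, 3   # sample sizes from the scenario
GOALPOST = 2.5     # TOST margin in units of A's population SD (assumption)
T_CRIT = 1.796     # one-sided 5% t critical value, 11 df (assumes alpha = 0.05)


def tolerance_factor(n, coverage, confidence):
    """Approximate two-sided tolerance factor (Howe, Wilson-Hilferty)."""
    norm = NormalDist()
    z = norm.inv_cdf((1.0 + coverage) / 2.0)
    df = n - 1
    c = 2.0 / (9.0 * df)
    chi2 = df * (1.0 - c + norm.inv_cdf(1.0 - confidence) * sqrt(c)) ** 3
    return sqrt(df * (1.0 + 1.0 / n) * z * z / chi2)


def mean_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def simulate(shift, n_sims=5000, seed=1):
    """Estimate P(pass TOST) and P(pass TI) when mean(B) - mean(A) = shift SDs."""
    rng = random.Random(seed)
    k = tolerance_factor(N_A, coverage=0.99, confidence=0.90)
    se_scale = sqrt(1.0 / N_A + 1.0 / N_B)
    tost_pass = ti_pass = 0
    for _ in range(n_sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(N_A)]
        b = [rng.gauss(shift, 1.0) for _ in range(N_B)]
        mean_a, var_a = mean_var(a)
        mean_b, var_b = mean_var(b)
        # TI criterion: all B values inside the interval built from A.
        half_width = k * sqrt(var_a)
        if all(abs(x - mean_a) <= half_width for x in b):
            ti_pass += 1
        # TOST criterion: both one-sided t statistics exceed the critical value.
        pooled = ((N_A - 1) * var_a + (N_B - 1) * var_b) / (N_A + N_B - 2)
        se = sqrt(pooled) * se_scale
        diff = mean_b - mean_a
        if (diff + GOALPOST) / se > T_CRIT and (GOALPOST - diff) / se > T_CRIT:
            tost_pass += 1
    return tost_pass / n_sims, ti_pass / n_sims


for shift in (0.0, 1.0, 2.0, 3.0):
    p_tost, p_ti = simulate(shift)
    print(f"shift={shift:.1f} SD  P(TOST pass)={p_tost:.3f}  P(TI pass)={p_ti:.3f}")
```

Changing the sample sizes, goalpost or TI levels in such a sketch lets stakeholders see how the two acceptance probabilities respond before any data are collected.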
Using an SPA, all stakeholders can be made aware of the benefits and drawbacks associated with statistical approaches. A reasonable comparability strategy then may be decided on before collecting and analyzing data.
1. Giselle B. Limentani, Moira C. Ringo, Feng Ye, Mandy L. Bergquist and Ellen O. McSorley, "Beyond the T-test: Statistical Equivalence Testing," Analytical Chemistry, Vol. 77, No. 11, 2005, pp. 221-226.
2. I. Elaine Allen and Christopher A. Seaman, "Superiority, Equivalence and Non-Inferiority," Quality Progress, February 2007, pp. 52-54.
3. Reed Harris, "Comparability Assessment Strategies and Techniques for Post-Approval CMC Changes," Fabian Lectures, 2008.
Keith M. Bower is a principal quality engineer at Amgen Inc. in Seattle. He earned a master’s degree in quality management and productivity from the University of Iowa in Iowa City. Bower is a senior member of ASQ and an ASQ-certified quality engineer, process analyst, technician and improvement associate.
Abraham Germansderfer is an associate director at Gilead Sciences in San Dimas, CA. He earned a master’s degree in biotechnology from Worcester Polytechnic Institute in Massachusetts.