## 2020

STATISTICS ROUNDTABLE

# Cumulative Meta-Analysis

by I. Elaine Allen and Christopher A. Seaman

Meta-analysis is a set of statistical procedures designed to integrate and synthesize experimental results across independent studies into an overall summary statistic. Unlike traditional research methods, meta-analysis uses the summary statistics from individual studies as the data points.

Mostly used in education, psychology and medicine,
meta-analysis can also be applied to quality control.^{1, 2}
Published studies are most often used in meta-analyses,
but the methodology can also be applied to internal studies. And
though meta-analyses are typically used to compare two treatments
(or a new treatment with a control), they can also be used to
examine two processes or a standard and improved product.

A key assumption of meta-analysis is that each study provides an independent estimate of the underlying relationship within an unknown—and probably unknowable—population. Accumulating results across studies, meta-analysis offers new insights about the population and studies. It allows researchers to:

- Gain more statistical power as similar results from different studies are combined.
- Provide a more accurate representation of the population relationship than is provided by the individual study estimators.
- Cumulatively combine studies chronologically to identify when a characteristic or statistically significant change first occurs.
- Understand the heterogeneity of the process or outcome being studied.

We will focus on the use of cumulative meta-analysis and an application in product improvement. Cumulative meta-analysis can also be directly applied in manufacturing to identify the time a process significantly changed or to compare old and new processes or products by synthesizing information from multiple experiments.

In cumulative meta-analysis the experiments are accumulated from the earliest to the latest, where each successive experiment includes a synthesis of all previous experiments. This chronological combining of the experiments will show if there is a consistency in the results of consecutive experiments and indicate the point at which no further experiments are necessary because the results continually favor one process, product or treatment.

### Four Steps

A meta-analysis is a multistage process involving protocol,
study identification, data extraction and
synthesis:^{3}

**Protocol:** Not every study is relevant to the question at hand, so the first step is to specify the criteria for identifying suitable studies or experiments for the meta-analysis. A prospectively defined protocol for the meta-analysis specifying criteria for inclusion and data being extracted is essential. These criteria should be operationally defined and rigidly decisive in triaging experiments to be included or excluded from analysis. The criteria should specify the types of test and control conditions as well as which reported outcomes each study must have. This step is somewhat simpler in quality control applications because tests of a new product or process are all being conducted under similar test conditions.

**Study identification:** The next step is to apply the criteria as a filter to find the studies needed. In clinical meta-analyses, this involves exhaustive searching of the literature for any and all published studies that meet the protocol-defined criteria in the first stage. In a quality control application, however, all previous studies would already be housed and indexed internally.

**Data extraction:** Each study or experiment that reaches this point should have relevant data to be extracted. So the next step is to calculate a result (usually called the effect size, point estimate or summary statistic) with an accompanying estimate of the variation the researchers would expect with studies of this type (the standard deviation, confidence interval or range).

**Synthesis:** First, determine whether it is appropriate to calculate a synthesized average result across studies. If so, then calculate and present such a result. The type and calculation of this summary statistic depends on the type of data available (discrete vs. continuous variables) and whether it is a comparative synthesis.
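For the data extraction step, one common choice of effect size for a two-group comparison on a continuous outcome is the difference in means, with a variance built from each group's standard deviation and sample size. The Python sketch below illustrates that calculation under stated assumptions: the function name `mean_difference` and all the numbers are hypothetical, and the article does not say which effect size the company actually used.

```python
import math

def mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Difference in means (a minus b) with its variance and a normal CI.

    Assumes two independent groups; z=1.96 gives an approximate 95% interval.
    """
    diff = mean_a - mean_b
    # Variance of a difference of two independent sample means
    var = sd_a ** 2 / n_a + sd_b ** 2 / n_b
    se = math.sqrt(var)
    return diff, var, (diff - z * se, diff + z * se)

# Hypothetical summary statistics from one experiment
diff, var, ci = mean_difference(5.0, 1.0, 50, 5.4, 1.2, 50)
print(f"effect size {diff:+.3f}, variance {var:.4f}, "
      f"CI [{ci[0]:+.3f}, {ci[1]:+.3f}]")
```

Each experiment then contributes one (effect size, variance) pair to the synthesis step.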

Not every study in a meta-analysis is equally important. Studies that give more information by using a larger sample size or a smaller degree of variability should be given more credibility because their results are likely to be closer to the truth you are trying to estimate. The results of meta-analyses are often presented in a forest plot giving the point estimate and confidence interval for each study as well as a summary point estimate and summary confidence bound.

Meta-analysis is not a simple pooling of the data from multiple studies as if they were one large study. Simple aggregation results are usually incorrect. Instead, a meta-analysis looks at the results, sample size and variability within each study and then calculates a weighted average summary effect size.

### Two Types of Analysis

The two most common types of meta-analysis models are fixed effects models (FEMs)^{4} and random effects models (REMs).^{5} The Mantel-Haenszel FEM computes the effect estimate as a weighted average of the individual study estimates, each weighted by the inverse of the study variance. The equation for computing the effect estimate for the FEM is

$$\mu = \frac{\sum_i y_i w_i}{\sum_i w_i}, \qquad w_i = \frac{1}{\sigma_i^2},$$

where $y_i$ is the effect estimate from study $i$ and $\sigma_i^2$ is its variance.

By assuming the experiments are homogeneous and the estimate follows a normal distribution, a confidence interval can be computed in the usual manner with the standard error of the weighted average using the equation $SE = 1 / \sqrt{\sum_i w_i}$. The REM assumes each study has its own mean, $\mu_i$, and variance, $\sigma_i^2$, but the $\mu_i$ are drawn from a superpopulation of effects with its own mean, $\mu$, and variance, $\tau^2$, which describes the between-study heterogeneity.
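As a rough illustration of the fixed effects calculation described above, the following Python sketch computes the inverse-variance weighted average, its standard error and a normal-theory confidence interval. The function name `fixed_effect` and the three effect sizes and variances are invented for illustration and are not the company's data; `z=1.96` assumes an approximate 95% interval.

```python
import math

def fixed_effect(effects, variances, z=1.96):
    """Inverse-variance fixed effects estimate, standard error and CI."""
    weights = [1.0 / v for v in variances]           # w_i = 1 / sigma_i^2
    total = sum(weights)
    mu = sum(w * y for w, y in zip(weights, effects)) / total
    se = 1.0 / math.sqrt(total)                      # SE = 1 / sqrt(sum of w_i)
    return mu, se, (mu - z * se, mu + z * se)

# Hypothetical effect sizes (y_i) and within-study variances (sigma_i^2)
effects = [-0.30, -0.10, -0.25]
variances = [0.04, 0.01, 0.02]

mu, se, ci = fixed_effect(effects, variances)
print(f"pooled estimate {mu:+.4f}, SE {se:.4f}, "
      f"CI [{ci[0]:+.4f}, {ci[1]:+.4f}]")
```

The second study has the smallest variance, so it receives the largest weight and pulls the pooled estimate toward its own result.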

As in the FEM, $\mu$ in the REM is estimated by a weighted average of the study effects, but the weights, $1/(\tau^2 + \sigma_i^2)$, are the inverses of the sums of the within-study variances, $\sigma_i^2$, and the between-study variance, $\tau^2$. When $\tau^2 = 0$, so that the treatment effects, $\mu_i$, are all the same, the REM reduces to the FEM. The DerSimonian and Laird estimate of $\tau^2$ is used in the weighting formula. The equation for computing the effect estimate for the REM is

$$\mu = \frac{\sum_i y_i w_i}{\sum_i w_i}, \qquad w_i = \frac{1}{\tau^2 + \sigma_i^2}.$$
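The random effects calculation can be sketched in Python as follows. The article does not spell out how the between-study variance is estimated, so this sketch assumes the usual moment form of the DerSimonian and Laird estimator, $\tau^2 = \max\{0, (Q - (k - 1)) / C\}$, where $Q$ is the weighted sum of squared deviations from the fixed effects estimate, $k$ is the number of studies and $C = \sum w_i - \sum w_i^2 / \sum w_i$. The function name `dersimonian_laird` and the data are hypothetical.

```python
import math

def dersimonian_laird(effects, variances):
    """Random effects estimate, standard error and tau^2 (moment estimator)."""
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed effects weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Q: weighted squared deviations from the fixed effects estimate
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    # Truncate at zero; with a single study tau^2 cannot be estimated
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (tau2 + v) for v in variances]     # random effects weights
    sw_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    se = 1.0 / math.sqrt(sw_re)
    return mu, se, tau2

# Hypothetical heterogeneous experiments with equal within-study variance
mu, se, tau2 = dersimonian_laird([-0.5, 0.1, -0.3], [0.01, 0.01, 0.01])
print(f"estimate {mu:+.4f}, SE {se:.4f}, tau^2 {tau2:.4f}")

# Identical effects: tau^2 is truncated to zero and the REM matches the FEM
mu0, se0, tau2_0 = dersimonian_laird([-0.2, -0.2, -0.2], [0.01, 0.01, 0.01])
```

In the heterogeneous case the standard error is noticeably larger than the fixed effects value of $1/\sqrt{300}$, which reflects the extra between-study variance.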

Because the same between-study variance, $\tau^2$, is added to every study's variance, the REM weights are more nearly equal than the FEM weights, so experiments with a larger sample size have less effect on the REM estimate than on the FEM estimate. Confidence intervals are generally wider in the REM, and because the inclusion of $\tau^2$ accounts for the nonrandom variability between studies, the REM gives more conservative estimates of variance.
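A small numerical sketch makes the shift in weighting concrete. It compares the relative weight each study receives under the two models for two hypothetical studies, one precise and one less so. The variances and the value of the between-study variance (0.05) are assumed purely for illustration.

```python
def relative_weights(variances, tau2=0.0):
    """Share of the total weight given to each study.

    tau2=0 gives fixed effects weights; tau2>0 gives random effects weights.
    """
    raw = [1.0 / (tau2 + v) for v in variances]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical: study 1 is large (small variance), study 2 is small
variances = [0.01, 0.04]

fem = relative_weights(variances)              # fixed effects
rem = relative_weights(variances, tau2=0.05)   # random effects, assumed tau^2

print("FEM shares:", [round(s, 3) for s in fem])
print("REM shares:", [round(s, 3) for s in rem])
```

With these assumed values the large study's share of the weight drops when the between-study variance is included, and the small study's share rises.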

The standard error used to compute the confidence interval is

$$SE = \left( \sum_i \left( D + w_i^{-1} \right)^{-1} \right)^{-1/2},$$

where $w_i = 1/\sigma_i^2$ is the fixed effects weight of experiment $i$ and $D$ denotes the estimated variance of the experiment effect sizes (the between-study variance, $\tau^2$), so that each term $D + w_i^{-1}$ equals $\tau^2 + \sigma_i^2$. The REM usually provides a more conservative estimate and is particularly useful in checking the robustness of a significant result obtained using FEMs. FEMs assume any variability between results of experiments is completely random error, while REMs assume there may be experiment-specific errors. That’s why we recommend and use REMs for all meta-analyses.

### Example

These data come from a consumer products company that introduced an improved version of an existing product. The primary objective of the project was to identify the degree of superiority of the new product by meta-analyzing the data from all experiments run by the company. During the course of the project, the company realized that if it had used meta-analysis as an ongoing step in the product’s development, it could have shown the superiority of its new formulation sooner and launched the product a year earlier.

The company wanted to compare the new formulation (New) with its existing standard formulation (Standard) to ensure New’s superiority prior to market launch. Prior to introducing New, the company conducted more than 200 blind, controlled internal experiments comparing Standard to New to validate its claims. After performing these experiments, the company concluded New was significantly better than Standard.

Using cumulative meta-analysis methods, the company could have stopped after only 20 experiments and launched the product earlier because the results of the cumulative meta-analysis revealed the significant difference between Standard and New was consistent for the 180-plus experiments conducted after that point. Each experiment involved the comparison between Standard and New, where the difference between them would be negative if the New formulation was superior to the Standard formulation.
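The cumulative procedure can be sketched as a loop that re-pools the experiments after each new one arrives, in chronological order. The Python sketch below uses a random effects model with the commonly stated DerSimonian and Laird moment estimator, which is an assumption about the details. The function names and the five experiments are hypothetical and are not the company's data.

```python
import math

def pool_random_effects(effects, variances, z=1.96):
    """Random effects pooled estimate with an approximate 95% CI."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    # Between-study variance, truncated at zero (zero for a single study)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (tau2 + v) for v in variances]
    sw_re = sum(w_re)
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    se = 1.0 / math.sqrt(sw_re)
    return mu, mu - z * se, mu + z * se

def cumulative_meta_analysis(effects, variances):
    """Pool the first n experiments for n = 1, 2, ..., in the order given."""
    return [pool_random_effects(effects[:n], variances[:n])
            for n in range(1, len(effects) + 1)]

# Hypothetical differences, listed earliest to latest
effects = [-0.10, 0.05, -0.20, -0.15, -0.12]
variances = [0.04, 0.04, 0.03, 0.02, 0.02]

results = cumulative_meta_analysis(effects, variances)
for n, (mu, lower, upper) in enumerate(results, start=1):
    print(f"after {n} experiments: {mu:+.3f} [{lower:+.3f}, {upper:+.3f}]")
```

Each row of the output corresponds to one line of a cumulative forest plot. The interval typically narrows as experiments accumulate.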

This is illustrated in Tables 1 and 2, which show only the
first 25 experiments. Table 1 shows the meta-analysis of each
study separately, and Table 2 shows the effect of accumulating
the results over studies and time. If only Table 1 is viewed, it
is clear from the line labeled “Random” that New is
superior but not whether it is consistently superior over
consecutive experiments. Only in Table 2 does this become clear
because the I-bars lie completely in the negative after 20
experiments, indicating a significant difference.^{6}
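One way to read a stopping point off a cumulative analysis is to look for the first experiment after which the upper confidence bound stays below zero for all later experiments. The sketch below is a hypothetical helper for that check, not the procedure used in the software cited in the article, and the bounds are invented.

```python
def first_consistent_index(upper_bounds):
    """Return the 1-based experiment count from which every cumulative upper
    confidence bound is below zero, or None if the final bound is not."""
    first = None
    for n, upper in enumerate(upper_bounds, start=1):
        if upper < 0:
            if first is None:
                first = n          # start of a run of negative upper bounds
        else:
            first = None           # run broken; significance was not stable
    return first

# Hypothetical cumulative upper bounds, earliest to latest
upper_bounds = [0.29, 0.12, 0.03, -0.01, 0.02, -0.02, -0.04, -0.05]
print(first_consistent_index(upper_bounds))  # 6
```

In this made-up series the bound first dips below zero at the fourth experiment but crosses back at the fifth, so the consistent run begins at the sixth. A check like this only describes experiments already run. It says nothing about how future experiments would turn out.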

The primary goal of the project was to identify a significant change, for which noncumulative meta-analysis would have sufficed. However, using cumulative meta-analysis changed the way the company planned and analyzed experiments on new and standard products from that point forward.

### REFERENCES AND NOTES

1. Matthias Egger, George D. Smith and Douglas G. Altman, editors, Systematic Reviews in Health Care: Meta-Analysis in Context, BMJ Books, 2001.
2. Ralf Schulze, Heinz Holling and Dankmar Bohning, editors, Meta-Analysis: New Developments and Applications in Medical and Social Sciences, Hogrefe & Huber, 2003.
3. The Cochrane Collaboration, Cochrane Reviewers’ Handbook 4.2.2, 2004, www.cochrane.org/resources/handbook/.
4. N. Mantel and W. Haenszel, “Statistical Aspects of the Analysis of Data From Retrospective Studies of Disease,” Journal of the National Cancer Institute, Vol. 22, No. 4, 1959, pp. 719-748.
5. Rebecca DerSimonian and Nan Laird, “Meta-Analysis in Clinical Trials,” Controlled Clinical Trials, Vol. 7, No. 3, 1986, pp. 177-188.
6. All calculations were conducted in Comprehensive Meta-Analysis Version 2.0, 2005, http://meta-analysis.com/index.html.

**I. ELAINE ALLEN** is professor of statistics
and entrepreneurship at Babson College in Wellesley, MA. She
earned a doctorate in statistics from Cornell University in
Ithaca, NY and is a member of ASQ.

**CHRISTOPHER A. SEAMAN** is a statistical
researcher at Human Services Research Institute in Cambridge,
MA.
