The Real Deal
An explanation of real-world evidence in medicine and how to use it effectively
by Julia E. Seaman and I. Elaine Allen
The U.S. Food and Drug Administration (FDA) defines real-world evidence (RWE) as clinical evidence about the use of a treatment or other medical product, derived from analyzing real-world data (RWD).1 These data are generated from many different study designs and analyses. The use of RWD has grown considerably with the spread of digital databases such as electronic health records (EHR) and the adoption of technology such as Apple Inc.’s HealthKit.
In the past, RWD was used solely to monitor post-market safety and adverse events, but recently it has been included to support submissions for approval ahead of commercialization. Beyond regulatory uses, RWD and RWE also are used to support coverage decisions by healthcare insurance companies, as well as for guidelines and decision support tools.
One important application of RWE is replicating clinical trial evidence; others include safety monitoring in large, heterogeneous populations and examining the actual use of an approved therapy or device.
RWD often is generated from the integration of multiple sources of data that can be linked to specific patients or to disease and diagnostic groups:
- Randomized clinical trials (RCT).
- Observational studies.
- Claims data.
- Pharmacy data.
- Patient registries.
- Lab tests.
- Patient surveys.
- Social media.
- Consumer data.
By using machine learning (ML) and artificial intelligence (AI) algorithms, patterns and insights can be gained from these data and, in a small number of cases, key conclusions about the use of healthcare products can be drawn. A large data set is necessary for implementing ML algorithms and AI. There is a tendency, however, to overvalue the sample size of RWD on the assumption that a large sample will minimize any biases or errors. Is that really the case?
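A quick simulation illustrates why a larger sample does not cure a biased one. The numbers here are purely hypothetical: suppose a database over-represents a healthier subgroup of patients, so the estimated response rate converges to the wrong value no matter how large the sample grows.

```python
import random

random.seed(42)

# Hypothetical illustration: 30% of the true population responds to a
# therapy, but the database over-samples a healthier subgroup (e.g.,
# insured, urban patients) that responds at 50%.
TRUE_RATE = 0.30
BIASED_RATE = 0.50   # response rate in the over-sampled subgroup

def biased_sample_mean(n, bias_fraction=0.6):
    """Draw n records where bias_fraction come from the healthier subgroup."""
    responses = 0
    for _ in range(n):
        rate = BIASED_RATE if random.random() < bias_fraction else TRUE_RATE
        responses += 1 if random.random() < rate else 0
    return responses / n

for n in (1_000, 100_000):
    est = biased_sample_mean(n)
    print(f"n={n:>7}: estimated response rate = {est:.3f} (true rate = {TRUE_RATE})")
```

With a 60/40 mix, the estimate settles near 0.6 × 0.50 + 0.4 × 0.30 = 0.42 rather than the true 0.30: more records shrink the random error but leave the systematic error untouched.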
How does RWE compare to RCTs?
Several recent papers have compared RCTs with RWE and RWD to identify studies whose results RWD could replicate. In one study, researchers examined EHRs to identify ways to replicate the protocols of RCTs using the rapidly accumulating RWD in hospitals.2 The researchers identified differences in purpose (efficacy vs. effectiveness) and follow-up (designed vs. actual practice) as paramount, but also flagged differences in patient groups (homogeneous vs. heterogeneous) and patient monitoring (per protocol vs. changeable) as important barriers to comparison between the two data-gathering methods. There also were large differences in the ability to gather cost-effective and rapidly accumulating data.
Establishing a middle ground, pragmatic RCTs were examined by another group of researchers.3 A pragmatic RCT is one designed to mimic usual clinical practice and can be critical for evaluating the effectiveness of an investigational product. For these trials to conform to FDA guidelines, the authors describe a nine-domain scoring method (the Pragmatic Explanatory Continuum Indicator Summary-2, or PRECIS-2) to evaluate how pragmatic a trial truly is. The domains focus on recruiting patients and resources for the trial, and on clearly identifying outcomes and analyses. In evaluating 89 trials, the researchers found only 36% used this tool to design and evaluate their pragmatic trials.
Taking a different approach, another group of researchers analyzed 220 U.S.-based clinical trials to identify what percentage could be replicated using observational data from insurance claims or health economics and outcomes research (HEOR) data.4 Only 33 (15%) could feasibly be replicated using RWD. Most of the RCTs had outcomes that could not be ascertained from observational data alone. Despite these comparisons, the use of RWE as an adjunct to RCTs certainly will increase as gathering and merging data sets becomes simpler. See Table 1.
Quality management barriers to using RWE
RWE often is based on analyses of data already collected and, unlike RCTs, may have only variable and rudimentary quality checks on the data. Many of the sources for RWE studies are data collected to support clinical care and reimbursement. Others are patient monitors or self-reported information. These data are not gathered to support any specific research study, are not subject to formal data entry procedures and therefore can be misreported or misclassified. RWD includes other pitfalls, such as:
- Lack of consistency of self-reported data from patients, including the reporting of concomitant medications, comorbidities, drug compliance and accuracy of adherence.
- Activity monitors that may be taken off or miscalibrated.
- Lack of self-reported safety issues or drug reactions.
- Poor or incorrect translation of free-text information into data from the EHR.
- Incomplete or missing reporting of timing of clinic visits and tests.
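Some of these pitfalls can be caught retrospectively with simple validation rules applied to the extracted data. The sketch below is illustrative only, with hypothetical column names and values rather than any real schema; it flags impossible visit dates and self-reported adherence outside the plausible range.

```python
import pandas as pd

# Hypothetical RWD extract; column names and values are illustrative
# assumptions, not drawn from any real database schema.
records = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "visit_date": ["2023-01-05", None, "2023-02-30", "2023-03-12"],
    "self_reported_adherence": [0.95, 1.40, 0.80, None],  # fraction of doses taken
})

# Parse visit dates; impossible dates (e.g., Feb. 30) become NaT, like missing ones.
records["visit_date"] = pd.to_datetime(records["visit_date"], errors="coerce")

# Flag records that an RCT's data entry checks would have caught at the source.
flags = pd.DataFrame({
    "missing_or_invalid_visit_date": records["visit_date"].isna(),
    "adherence_missing_or_out_of_range": ~records["self_reported_adherence"].between(0, 1),
})
print(flags.sum())  # count of flagged records per check
```

Checks like these do not repair the data, but they quantify how much of an RWD extract falls short of trial-grade quality before any analysis begins.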
Instead of using RWD itself, synthetic data sets can be created from real EHRs and other merged data. Synthea, a software package developed by Mitre Corp.,5 creates synthetic EHRs that can be used to sample sets of subjects meeting all the inclusion and exclusion criteria for a clinical trial’s control group. One advantage is that there are no privacy concerns with the data. This provides a data set that can be validated and, if data exist in the original RWD, longitudinal data for synthetic patients can be generated to simulate the control group’s outcomes for a clinical trial.
Choosing synthetic patients works like this: Rather than collecting data from patients recruited for a trial and assigned to the control or standard-of-care arm, a synthetic control arm models those comparators by applying the trial’s inclusion and exclusion criteria to RWD previously collected from multiple sources. These sources include health data generated during routine care, such as EHRs, administrative claims data, patient-generated data from fitness trackers or home medical equipment, disease registries and historical clinical trial data.
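The selection step described above amounts to filtering a patient pool by the trial's eligibility criteria. The following minimal sketch assumes a pool of synthetic records already flattened to a table; the field names, diagnosis codes and cutoffs are hypothetical, not taken from any real protocol or the Synthea schema.

```python
import pandas as pd

# Hypothetical pool of synthetic patient records (e.g., synthetic EHR
# output flattened to one row per patient). All values are illustrative.
pool = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "age": [34, 67, 52, 45, 71, 58],
    "diagnosis": ["T2D", "T2D", "T2D", "HTN", "T2D", "T2D"],
    "egfr": [95, 40, 88, 90, 62, 75],  # kidney function, mL/min/1.73 m^2
})

# Apply trial-style inclusion and exclusion criteria to build the control arm.
control_arm = pool[
    pool["age"].between(40, 75)        # inclusion: age range
    & (pool["diagnosis"] == "T2D")     # inclusion: target condition
    & (pool["egfr"] >= 60)             # exclusion: impaired kidney function
]
print(control_arm["patient_id"].tolist())  # → [3, 5, 6]
```

In practice the filter criteria come straight from the trial protocol, and the resulting cohort would still need validation against the enrolled treatment arm's baseline characteristics before serving as a comparator.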
According to the pharmaceutical industry, by reducing (or eliminating) the need to recruit control participants, a synthetic control arm will increase efficiency, reduce delays, lower trial costs and speed lifesaving therapies to market. Historic control arms have been used for FDA approval of interventions in rare-disease trials, but synthetic patients have not yet been used in submissions to the FDA for approval.
In 2015, IBM’s Watson for Oncology partnered with the University of Texas MD Anderson Cancer Center to learn and create diagnostic and treatment algorithms by “ingesting” health records from real cancer patients, as well as the oncology literature.6 When the models were generalized to include patients from both MD Anderson and Memorial Sloan Kettering Cancer Center in New York, Watson scored high on its ability to create actionable data using natural language processing (96%), but it was unable to integrate time-dependent information or devise new strategies for individual patients, achieving only a 63% concordance rate with physician experts.
Differences in treatment protocols between the two oncology groups created inconsistencies in patient strategies when models built from MD Anderson’s RWE were applied to oncology patients at Sloan Kettering. This suggests that, in the absence of global treatment guidelines, different hospital settings will need either different RWE or different models for the foreseeable future. The hope that data are agnostic to treatment settings may never be realized.
As statisticians, we find it easy to conclude that more data are always better. The recent use of large EHRs, claims data and Medicare databases, however, may introduce biases given the populations the data cover. The leap to RWE should be taken cautiously, and steps to validate conclusions are important before deploying AI algorithms that may be location-specific, patient-specific or physician-specific.
There are clear benefits, however, to analyzing and including RWE when possible, especially where the idealized setting of an RCT may not reflect actual practice and where RWE can highlight unexpected outcomes.
References and Note
- U.S. Food and Drug Administration (FDA), “Submitting Documents Using Real-World Data and Real-World Evidence to FDA for Drugs and Biologics Guidance for Industry,” guidance document, May 2019.
- Hun-Sung Kim, Suehyun Lee and Ju Han Kim, “Real-World Evidence Versus Randomized Controlled Trial: Clinical Research Based on Electronic Medical Records,” Journal of Korean Medical Science, Vol. 33, No. 34, 2018.
- Rafael Dal-Ré, Perrine Janiaud and John P.A. Ioannidis, “Real-World Evidence: How Pragmatic Are Randomized Controlled Trials Labeled as Pragmatic?” BMC Medicine, Vol. 16, No. 49, 2018.
- Victoria L. Bartlett, Sanket J. Dhruva, Nilay D. Shah, Patrick Ryan and Joseph S. Ross, “Feasibility of Using Real-World Data to Replicate Clinical Trial Evidence,” JAMA Network Open, Oct. 9, 2019.
- More information about SyntheticMass can be found at https://synthea.mitre.org.
- Eliza Strickland, “How IBM Watson Overpromised and Underdelivered on AI Health Care,” IEEE Spectrum, April 2, 2019.
Walonoski, Jason, Mark Kramer, Joseph Nichols, Andre Quina, Chris Moesel, Dylan Hall, Carlton Duffett, Kudakwashe Dube, Thomas Gallagher and Scott McLachlan, “Synthea: An Approach, Method, and Software Mechanism for Generating Synthetic Patients and the Synthetic Electronic Health Record,” Journal of the American Medical Informatics Association, Vol. 24, No. 3, Aug. 30, 2017, pp. 230-238.
Julia E. Seaman is research director of the Quahog Research Group and a statistical consultant for the Babson Survey Research Group at Babson College in Wellesley, MA. She earned her doctorate in pharmaceutical chemistry and pharmacogenomics from the University of California, San Francisco.
I. Elaine Allen is professor of biostatistics at the University of California, San Francisco, and emeritus professor of statistics at Babson College. She also is director of the Babson Survey Research Group. She earned a doctorate in statistics from Cornell University in Ithaca, NY. Allen is a member of ASQ.