Belumosudil was FDA-approved in the United States (US) for the treatment of relapsed/refractory chronic graft-versus-host disease (cGVHD) based on a randomized phase II trial comparing two belumosudil doses. The efficacy and safety of belumosudil versus the best available therapy (BAT) have not been studied. Applying rigorous statistical methodology to real-world data, this study estimated the efficacy of belumosudil versus BAT in cGVHD patients whose disease failed to respond to 2-5 prior lines of therapy (LOTs).
Clin Pharmacol Ther
July 2025
Developing drugs for rare diseases presents unique challenges from a statistical perspective. These challenges may include slowly progressive diseases with unmet medical needs, poorly understood natural history, small population size, diversified phenotypes and genotypes within a disorder, and lack of appropriate surrogate endpoints to measure clinical benefits. The Real-World Evidence (RWE) Scientific Working Group of the American Statistical Association Biopharmaceutical Section has assembled a research team to assess the landscape including challenges and possible strategies to address these challenges and the role of real-world data (RWD) and RWE in rare disease drug development.
Clin Pharmacol Ther
April 2025
Real-world data (RWD) and real-world evidence (RWE) have been increasingly used in medical product development and regulatory decision-making, especially for rare diseases. After outlining the challenges and possible strategies to address the challenges in rare disease drug development (see the accompanying paper), the Real-World Evidence (RWE) Scientific Working Group of the American Statistical Association Biopharmaceutical Section reviews the roles of RWD and RWE in clinical trials for drugs treating rare diseases. This paper summarizes relevant guidance documents and frameworks by selected regulatory agencies and the current practice on the use of RWD and RWE in natural history studies and the design, conduct, and analysis of rare disease clinical trials.
J Am Med Inform Assoc
August 2024
Objective: To present a general framework providing high-level guidance to developers of computable algorithms for identifying patients with specific clinical conditions (phenotypes) through a variety of approaches, including but not limited to machine learning and natural language processing methods to incorporate rich electronic health record data.
Materials And Methods: Drawing on extensive prior phenotyping experiences and insights derived from 3 algorithm development projects conducted specifically for this purpose, our team with expertise in clinical medicine, statistics, informatics, pharmacoepidemiology, and healthcare data science methods conceptualized stages of development and corresponding sets of principles, strategies, and practical guidelines for improving the algorithm development process.
Results: We propose 5 stages of algorithm development and corresponding principles, strategies, and guidelines: (1) assessing fitness-for-purpose, (2) creating gold standard data, (3) feature engineering, (4) model development, and (5) model evaluation.
Increasing emphasis on the use of real-world evidence (RWE) to support clinical policy and regulatory decision-making has led to a proliferation of guidance, advice, and frameworks from regulatory agencies, academia, professional societies, and industry. A broad spectrum of studies use real-world data (RWD) to produce RWE, ranging from randomized trials with outcomes assessed using RWD to fully observational studies. Yet, many proposals for generating RWE lack sufficient detail, and many analyses of RWD suffer from implausible assumptions, other methodological flaws, or inappropriate interpretations.
Background: Real-world data, such as administrative claims and electronic health records, are increasingly used for safety monitoring and to help guide regulatory decision-making. In these settings, it is important to document analytic decisions transparently and objectively to assess and ensure that analyses meet their intended goals.
Methods: The Causal Roadmap is an established framework that can guide and document analytic decisions through each step of the analytic pipeline, which will help investigators generate high-quality real-world evidence.
BMC Med Res Methodol
August 2023
Background: The Targeted Learning roadmap provides a systematic guide for generating and evaluating real-world evidence (RWE). From a regulatory perspective, RWE arises from diverse sources such as randomized controlled trials that make use of real-world data, observational studies, and other study designs. This paper illustrates a principled approach to assessing the validity and interpretability of RWE.
Background: In rural China, exclusive breastfeeding (EBF) prevalence is low and hospitals often fail to attain baby-friendly feeding objectives, such as ≥ 75% of newborns exclusively breastfed from birth to discharge. Empirical evidence for the impact of increased hospital compliance with recommended feeding guidelines on continued EBF in rural China is lacking. We sought to measure and model the association of newborns' in-hospital feeding experiences with EBF practice in infancy to inform policies for EBF promotion.
Common tasks encountered in epidemiology, including disease incidence estimation and causal inference, rely on predictive modelling. Constructing a predictive model can be thought of as learning a prediction function (a function that takes as input covariate data and outputs a predicted value). Many strategies for learning prediction functions from data (learners) are available, from parametric regressions to machine learning algorithms.
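To make the idea of a learner concrete, the following is a minimal sketch in which a learner is a function that takes training data and returns a prediction function. The two learners and the cross-validated comparison are illustrative assumptions; the names `mean_learner`, `linear_learner`, and `cv_mse` are hypothetical and are not taken from the article.

```python
def mean_learner(xs, ys):
    """Learner that ignores the covariate and always predicts the training mean."""
    mu = sum(ys) / len(ys)
    return lambda x: mu


def linear_learner(xs, ys):
    """Learner that fits a one-covariate least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = 0.0 if sxx == 0 else sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cv_mse(learner, xs, ys, n_folds=5):
    """Cross-validated mean squared error of a learner (interleaved folds)."""
    n = len(xs)
    total_squared_error = 0.0
    for fold in range(n_folds):
        test_idx = set(range(fold, n, n_folds))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        predict = learner(train_x, train_y)  # the learned prediction function
        total_squared_error += sum((ys[i] - predict(xs[i])) ** 2 for i in test_idx)
    return total_squared_error / n


if __name__ == "__main__":
    xs = [float(i) for i in range(20)]
    ys = [2.0 * x + 1.0 for x in xs]  # made-up, noise-free illustration data
    print("mean learner CV MSE:", cv_mse(mean_learner, xs, ys))
    print("linear learner CV MSE:", cv_mse(linear_learner, xs, ys))
```

Comparing learners on held-out folds is one common way to choose among them; whether the article uses this exact procedure is not stated in the excerpt.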
Psychol Addict Behav
November 2023
Objective: To examine the relative importance of client change language subtypes as predictors of alcohol use following motivational interviewing (MI).
Method: Participants were 164 heavy drinkers (57.3% female, = 28.
We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015-2019 in 2 integrated health-care institutions in the Northwest United States. We used one site's manually reviewed gold-standard outcomes data for model development and the other's for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site.
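As a quick check of the reported proportions, the sketch below computes positive predictive value and sensitivity from counts. It assumes the manually adjudicated outcome is the gold standard, so the share of code-identified potential events that were confirmed can be read as the PPV of the diagnosis codes alone; the function names are hypothetical.

```python
def ppv(true_positives, false_positives):
    """Positive predictive value: confirmed events among all flagged events."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no flagged events")
    return true_positives / flagged


def sensitivity(true_positives, false_negatives):
    """Sensitivity: flagged events among all true events."""
    true_events = true_positives + false_negatives
    if true_events == 0:
        raise ValueError("no true events")
    return true_positives / true_events


if __name__ == "__main__":
    # Counts from the abstract: confirmed events out of potential events.
    development = ppv(154, 239 - 154)
    validation = ppv(180, 277 - 180)
    print(f"development site: {development:.1%}")
    print(f"validation site: {validation:.1%}")
```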
Inverse probability weighting (IPW) and targeted maximum likelihood estimation (TMLE) are methodologies that can adjust for confounding and selection bias and are often used for causal inference. Both estimators rely on the positivity assumption that within strata of confounders there is a positive probability of receiving treatment at all levels under consideration. Practical applications of IPW require finite inverse probability (IP) weights.
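The link between positivity and finite weights can be illustrated with a minimal sketch. It assumes a binary treatment and propensity scores that are already estimated; the optional `bound` argument (weight truncation) is one common practical remedy and is an assumption here, not necessarily the approach the article studies. All names are hypothetical.

```python
def ip_weights(treated, propensity, bound=None):
    """Inverse probability of treatment weights for a binary treatment.

    treated: list of 0/1 treatment indicators.
    propensity: list of estimated P(treated = 1 | confounders).
    bound: optional upper limit applied to each weight (truncation).
    """
    weights = []
    for a, p in zip(treated, propensity):
        prob_received = p if a == 1 else 1.0 - p
        if prob_received <= 0.0:
            # A zero probability would give an infinite weight.
            raise ValueError("positivity violated: zero probability of received treatment")
        weight = 1.0 / prob_received
        if bound is not None:
            weight = min(weight, bound)
        weights.append(weight)
    return weights


def weighted_arm_mean(outcomes, treated, weights, arm):
    """Weighted mean outcome in one arm, normalized by the sum of weights."""
    numerator = sum(w * y for y, a, w in zip(outcomes, treated, weights) if a == arm)
    denominator = sum(w for a, w in zip(treated, weights) if a == arm)
    return numerator / denominator


if __name__ == "__main__":
    treated = [1, 1, 0, 0]          # made-up illustration data
    propensity = [0.8, 0.01, 0.5, 0.2]
    outcomes = [1.0, 0.0, 1.0, 0.0]
    print(ip_weights(treated, propensity))            # one very large weight
    print(ip_weights(treated, propensity, bound=10))  # truncated at 10
    w = ip_weights(treated, propensity, bound=10)
    print(weighted_arm_mean(outcomes, treated, w, arm=1))
```

A propensity score near 0 or 1 produces a very large weight for the rarely observed treatment, which is why near-violations of positivity make IPW estimates unstable even when the weights remain finite.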
Background: Anaphylaxis is a life-threatening allergic reaction that is difficult to identify accurately with administrative data. We conducted a population-based validation study to assess the accuracy of ICD-10 diagnosis codes for anaphylaxis in outpatient, emergency department, and inpatient settings.
Methods: In an integrated healthcare system in Washington State, we obtained medical records from healthcare encounters with anaphylaxis diagnosis codes (potential events) from October 2015 to December 2018.
We use simulated data to examine the consequences of depletion of susceptibles for hazard ratio (HR) estimators based on a propensity score (PS). First, we show that the depletion of susceptibles attenuates marginal HRs toward the null by amounts that increase with the incidence of the outcome, the variance of susceptibility, and the impact of susceptibility on the outcome. If susceptibility is binary then the Bross bias multiplier, originally intended to quantify bias in a risk ratio from a binary confounder, also quantifies the ratio of the instantaneous marginal HR to the conditional HR as susceptibles are depleted differentially.
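For reference, the Bross bias multiplier for a binary factor can be sketched as below. The formula is the standard one for a binary confounder; reading its inputs as the prevalence of susceptibility among those still at risk in each treatment group follows the abstract's description and is otherwise an assumption. The function and argument names are hypothetical.

```python
def bross_multiplier(prev_treated, prev_untreated, effect_ratio):
    """Bross bias multiplier for a binary factor.

    prev_treated: prevalence of the factor in the treated group.
    prev_untreated: prevalence of the factor in the untreated group.
    effect_ratio: ratio effect of the factor on the outcome.
    """
    numerator = prev_treated * (effect_ratio - 1.0) + 1.0
    denominator = prev_untreated * (effect_ratio - 1.0) + 1.0
    return numerator / denominator


if __name__ == "__main__":
    # Equal prevalence in both groups: no bias from the factor.
    print(bross_multiplier(0.5, 0.5, 3.0))
    # Made-up values: susceptibles depleted faster among the treated.
    print(bross_multiplier(0.2, 0.4, 3.0))
```

When the prevalence is equal in both groups the multiplier is 1; it moves away from 1 as the prevalences diverge, which matches the abstract's point about differential depletion.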
Background: A substantial fraction of sexually transmitted infections (STIs) occur in patients who have previously been treated for an STI. We assessed whether routine electronic health record (EHR) data can predict which patients presenting with an incident STI are at greatest risk for additional STIs in the next 1 to 2 years.
Methods: We used structured EHR data on patients 15 years or older who acquired an incident STI diagnosis in 2008 to 2015 in eastern Massachusetts.
Human immunodeficiency virus (HIV) pre-exposure prophylaxis (PrEP) protects high-risk patients from becoming infected with HIV. Clinicians need help to identify candidates for PrEP based on information routinely collected in electronic health records (EHRs). The greatest statistical challenge in developing a risk prediction model is that HIV acquisition is extremely rare.
Background: Evidence on the risk of febrile seizures after inactivated influenza vaccine (IIV) and 13-valent pneumococcal conjugate vaccine (PCV13) is mixed. In the FDA-sponsored Sentinel Initiative, we examined risk of febrile seizures after IIV and PCV13 in children 6-23 months of age during the 2013-14 and 2014-15 influenza seasons.
Methods: Using claims data and a self-controlled risk interval design, we compared the febrile seizure rate in a risk interval (0-1 days) versus control interval (14-20 days).
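Because the two intervals differ in length (2 days versus 7 days), event counts must be scaled by interval length before they are compared. The sketch below computes a crude rate ratio under that adjustment; it is an illustration only, since a published self-controlled analysis would typically use a conditional regression model, and the function names and example counts are hypothetical.

```python
def interval_length(first_day, last_day):
    """Number of days in an inclusive interval, e.g. days 14-20 is 7 days."""
    return last_day - first_day + 1


def scri_rate_ratio(events_risk, events_control,
                    risk_window=(0, 1), control_window=(14, 20)):
    """Crude rate ratio comparing the risk interval with the control interval."""
    if events_control == 0:
        raise ValueError("no events in the control interval")
    risk_rate = events_risk / interval_length(*risk_window)
    control_rate = events_control / interval_length(*control_window)
    return risk_rate / control_rate


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(scri_rate_ratio(events_risk=4, events_control=7))
```

Each child contributes both intervals, so characteristics that do not change over the follow-up window are held constant in the comparison.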
The role of acute mood states as mediating factors in cognitive impairment in patients with mania or depression is not sufficiently clear. Similarly, the extent to which cognitive impairment is trait or state-specific remains an open question. Therefore, the aim of this study was to investigate the effect of a mood-induction on attention in patients with an affective disorder.
Background: HIV pre-exposure prophylaxis (PrEP) is effective but underused, in part because clinicians do not have the tools to identify PrEP candidates. We developed and validated an automated prediction algorithm that uses electronic health record (EHR) data to identify individuals at increased risk for HIV acquisition.
Methods: We used machine learning algorithms to predict incident HIV infections with 180 potential predictors of HIV risk drawn from EHR data from 2007-15 at Atrius Health, an ambulatory group practice in Massachusetts, USA.