OnSIDES database: Extracting adverse drug events from drug labels using natural language processing models.

Med

Department of Biomedical Informatics, Columbia University Irving Medical Center, Columbia University, New York, NY 10032, USA; Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA 90069, USA; Cedars-Sinai Cancer, Cedars-Sinai Medical Center, Los Angeles, CA 90069, USA

Published: July 2025


Article Abstract

Background: Adverse drug events (ADEs) are the fourth leading cause of death in the US and add billions of dollars annually to healthcare costs. However, few machine-readable databases of ADEs exist, limiting our capacity to study drug safety on a broader, systematic scale. Recent advances in natural language processing methods, such as BERT models, present an opportunity to accurately extract relevant information from unstructured biomedical text.

Methods: We fine-tune a PubMedBERT model to extract ADE terms from text in FDA Structured Product Labels for prescription drugs. Here, we present OnSIDES (on-label side effects resource), a compiled, machine-friendly database of drug-ADE pairs generated with this method. We further utilize this method to extract pediatric-specific ADEs, serious ADEs from labels' "Boxed Warnings" section, and ADEs from drug labels of other major nations (the UK, the European Union, and Japan) to build a complementary OnSIDES-INTL database. To present OnSIDES' potential applications, we leverage the database to predict novel drug targets and indications, analyze enrichment of ADEs across drug classes, and predict novel ADEs from chemical compound structures.
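
To make the idea of drug-ADE pairs extracted from label text concrete, the following is a minimal Python sketch and not the authors' pipeline. It assumes, purely for illustration, that candidate terms are located by exact string matching against a small vocabulary and that each mention is then scored in context. The function names, the vocabulary, the drug name `drugx`, the context window, and the threshold are all hypothetical, and `toy_score` is a trivial stand-in for the fine-tuned PubMedBERT classifier.

```python
from __future__ import annotations

import re
from typing import Callable, Iterator


def find_candidates(text: str, vocabulary: list[str], window: int = 60) -> Iterator[tuple[str, str]]:
    """Yield (term, context) for every vocabulary term mentioned in the text."""
    for term in vocabulary:
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            yield term, text[start:end]


def extract_pairs(
    drug: str,
    text: str,
    vocabulary: list[str],
    score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> set[tuple[str, str]]:
    """Keep (drug, term) pairs whose mention scores at or above the threshold."""
    return {
        (drug, term)
        for term, context in find_candidates(text, vocabulary)
        if score(term, context) >= threshold
    }


def toy_score(term: str, context: str) -> float:
    """Hypothetical placeholder for a trained classifier: rejects 'no <term>' mentions."""
    negated = re.search(r"\bno\s+" + re.escape(term), context, flags=re.IGNORECASE)
    return 0.0 if negated else 1.0


label_text = "The most common adverse reactions were nausea and headache. No rash was reported."
pairs = extract_pairs("drugx", label_text, ["nausea", "headache", "rash"], toy_score)
```

In this toy example, `pairs` contains the nausea and headache mentions but not the negated rash mention. A real system would replace `toy_score` with a model's predicted probability.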

Findings: We achieve an F1 score of 0.90, AUROC of 0.92, and AUPR of 0.95 at extracting ADEs from the labels' "Adverse Reactions" section. OnSIDES contains over 3.6 million drug-ADE pairs for 3,233 unique drug ingredient combinations extracted from 47,211 labels.
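
For readers unfamiliar with the F1 metric, the sketch below shows how precision, recall, and F1 can be computed from a set of predicted drug-ADE pairs and a set of reference pairs. It is an illustration only: the abstract does not state the unit of evaluation (for example, per mention or per pair), and the function name and toy data are hypothetical.

```python
from __future__ import annotations


def precision_recall_f1(predicted: set, actual: set) -> tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted items against reference items."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: two of three predictions are correct, and two of three reference pairs are found.
predicted_pairs = {("drugx", "nausea"), ("drugx", "headache"), ("drugx", "rash")}
reference_pairs = {("drugx", "nausea"), ("drugx", "headache"), ("drugx", "dizziness")}
precision, recall, f1 = precision_recall_f1(predicted_pairs, reference_pairs)
```

Here precision, recall, and F1 all equal 2/3, since two true positives are shared between three predictions and three reference pairs.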

Conclusions: OnSIDES can be used as a comprehensive resource to study and enhance drug safety.

Funding: R35GM131905 to N.P.T.; T32GM145440 to H.Y.C.; and T15LM007079 to U.G., M.Z., and K.L.B.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12256195
DOI: http://dx.doi.org/10.1016/j.medj.2025.100642