AI for Extracting Pre-Analytical Variability Data from Biomedical Literature: Feasibility and Validation.

Stud Health Technol Inform

Hannover Unified Biobank (HUB), Medizinische Hochschule Hannover, Carl Neuberg Str.1, 30625 Hannover.

Published: September 2025



Article Abstract

Introduction: The quality and reproducibility of research results from biological samples are significantly influenced by pre-analytical variability, which arises from differing conditions during sample collection, storage and processing. Although numerous studies have investigated these effects, standardized and structured reporting remains limited, hindering systematic evaluation. This study explores the potential of Large Language Models (LLMs) for the structured extraction of pre-analytical variability data from scientific literature.

Methods: Using a standardized parameter catalog, various LLMs were evaluated with specially designed prompts.

Results: Models such as GPT-4.5, o1, DeepSeek R1, and o3-mini-high demonstrated promising performance in contextual understanding and structured output generation, particularly for CSV files. However, consistent semantic mapping of complex experimental conditions (e.g., storage time versus temperature) proved challenging.
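To illustrate the structured-output step, the following minimal sketch checks a CSV returned by a model against a parameter catalog. The catalog entries, the column names (`parameter`, `value`, `unit`), and the sample CSV are hypothetical placeholders chosen for illustration; they are not the catalog or output format used in the study.

```python
import csv
import io

# Hypothetical catalog: parameter name -> allowed units.
# This is an assumption for illustration, not the study's catalog.
CATALOG = {
    "storage_temperature": {"°C"},
    "storage_time": {"h", "d"},
}


def validate_rows(csv_text):
    """Split CSV rows into valid records and (line number, message) errors."""
    valid, errors = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data rows start on line 2 because line 1 holds the header.
    for line_no, row in enumerate(reader, start=2):
        param = (row.get("parameter") or "").strip()
        value = (row.get("value") or "").strip()
        unit = (row.get("unit") or "").strip()
        if param not in CATALOG:
            errors.append((line_no, f"unknown parameter: {param!r}"))
            continue
        if unit not in CATALOG[param]:
            errors.append((line_no, f"unit {unit!r} not allowed for {param}"))
            continue
        try:
            number = float(value)
        except ValueError:
            errors.append((line_no, f"non-numeric value: {value!r}"))
            continue
        valid.append({"parameter": param, "value": number, "unit": unit})
    return valid, errors


# Hypothetical model output: the second data row confuses time with temperature.
SAMPLE = """parameter,value,unit
storage_temperature,-80,°C
storage_time,-80,°C
freeze_thaw,3,cycles
"""

valid, errors = validate_rows(SAMPLE)
for line_no, message in errors:
    print(f"line {line_no}: {message}")
```

A unit check of this kind can flag the time-versus-temperature confusion mentioned above, but it cannot judge whether a value was read correctly from the paper, so expert review is still needed.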

Conclusion: Targeted token reduction significantly improved extraction quality. Overall, the study shows that LLMs can serve as effective tools for supporting structured data extraction in biomedical contexts, although current limitations in reproducibility and contextual fidelity highlight the continued need for expert oversight.
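The abstract does not describe how token reduction was performed. One plausible approach, sketched here purely as an assumption, is to drop article sections that are unlikely to report pre-analytical conditions before the text is sent to a model. The heading lists and the helper name `reduce_tokens` are hypothetical.

```python
# Hypothetical section headings; both sets are assumptions for illustration.
KEEP_HEADINGS = {"abstract", "introduction", "methods", "results", "discussion"}
DROP_HEADINGS = {"references", "acknowledgements", "acknowledgments", "funding"}


def reduce_tokens(text):
    """Remove sections whose heading line appears in DROP_HEADINGS."""
    kept = []
    skipping = False
    for line in text.splitlines():
        heading = line.strip().lower().rstrip(":")
        # A line counts as a heading only if it exactly matches a known title.
        if heading in KEEP_HEADINGS:
            skipping = False
        elif heading in DROP_HEADINGS:
            skipping = True
        if not skipping:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical article text used only to show the effect.
ARTICLE = """Methods
Samples were stored at -80 °C for 24 h.
References
[1] Some citation.
"""

reduced = reduce_tokens(ARTICLE)
print(reduced)
```

Matching headings exactly keeps the sketch predictable; real articles would need more tolerant heading detection.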


Source
http://dx.doi.org/10.3233/SHTI251379


Similar Publications

The Quantiferon Gold Plus (QFT) test, a widely used interferon-γ release assay (IGRA), diagnoses latent tuberculosis infection (LTBI) with a positivity threshold of ≥0.35 IU/mL. Results near this cut-off can be challenging to interpret due to variability from immunological, pre-analytical, and technical factors, prompting recommendations for a borderline range to refine diagnosis and reduce overtreatment.


Central nervous system (CNS) involvement in acute lymphoblastic leukemia (ALL) is associated with a poor prognosis, making its accurate detection vital for treatment planning. This systematic review critically examines the role of conventional cytomorphology (CC) and multiparameter flow cytometry (FC) in analyzing cerebrospinal fluid in acute lymphoblastic leukemia cases. While CC remains the gold standard, its sensitivity is limited, particularly in cases with low cell counts.


ProvideQ: A Web-Based Knowledge Platform for Assessing Preanalytical Influences on Biomolecules in Biospecimens.

Stud Health Technol Inform

September 2025

Hannover Unified Biobank (HUB), Medizinische Hochschule Hannover, Carl Neuberg Str.1, 30625 Hannover.

Introduction: Preanalytical factors significantly impact the stability of biomolecules in biospecimens, affecting the reliability of biomedical research and diagnostics. This paper presents the development of ProvideQ (Database for pre-analytical variability and biospecimen quality), a web-based platform designed to centralize access to research findings on these influences.

Methods: Building on an initial prototype, we implemented a validated criteria catalog for data quality, an efficient search system handling incomplete inputs, and SPREC 4.



Minimizing high sensitivity troponin T delta variation at low concentration using BD Barricor blood collection tube.

Clin Biochem

August 2025

Department of Laboratory Medicine and Pathology, College of Health Science, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada.

Objective: Reproducible low troponin concentrations from high-sensitivity troponin (hs-cTn) assays are paramount to accurate risk determination in the accelerated diagnostic pathway. Total variation consists of pre-analytical, analytical and biological components. While analytical and biological variations cannot be readily modified, minimizing pre-analytical variation is desirable and potentially attainable.
