Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

Background: Diagnosing myocarditis relies on multimodal data, including cardiovascular magnetic resonance (CMR), clinical symptoms, and blood values. The correct interpretation and integration of CMR findings require radiological expertise and knowledge. We aimed to investigate the performance of Generative Pre-trained Transformer 4 (GPT-4), a large language model, for report-based medical decision-making in the context of cardiac MRI for suspected myocarditis.

Methods: This retrospective study includes CMR reports from 396 patients with suspected myocarditis and eight centers, respectively. CMR reports and patient data including blood values, age, and further clinical information were provided to GPT-4 and radiologists with 1 (resident 1), 2 (resident 2), and 4 years (resident 3) of experience in CMR and knowledge of the 2018 Lake Louise Criteria. The final impression of the report regarding the radiological assessment of whether myocarditis is present or not was not provided. The performance of Generative pre-trained transformer 4 (GPT-4) and the human readers were compared to a consensus reading (two board-certified radiologists with 8 and 10 years of experience in CMR). Sensitivity, specificity, and accuracy were calculated.

Results: GPT-4 yielded an accuracy of 83%, sensitivity of 90%, and specificity of 78%, which was comparable to the physician with 1 year of experience (R1: 86%, 90%, 84%, p = 0.14) and lower than that of more experienced physicians (R2: 89%, 86%, 91%, p = 0.007 and R3: 91%, 85%, 96%, p < 0.001). GPT-4 and human readers showed a higher diagnostic performance when results from T1- and T2-mapping sequences were part of the reports, for residents 1 and 3 with statistical significance (p = 0.004 and p = 0.02, respectively).

Conclusion: GPT-4 yielded good accuracy for diagnosing myocarditis based on CMR reports in a large dataset from multiple centers and therefore holds the potential to serve as a diagnostic decision-supporting tool in this capacity, particularly for less experienced physicians. Further studies are required to explore the full potential and elucidate educational aspects of the integration of large language models in medical decision-making.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11414660PMC
http://dx.doi.org/10.1016/j.jocmr.2024.101068DOI Listing

Publication Analysis

Top Keywords

generative pre-trained
12
pre-trained transformer
12
cardiovascular magnetic
8
magnetic resonance
8
suspected myocarditis
8
data including
8
blood values
8
performance generative
8
transformer gpt-4
8
cmr reports
8

Similar Publications

Leveraging GPT-4o for Automated Extraction and Categorization of CAD-RADS Features From Free-Text Coronary CT Angiography Reports: Diagnostic Study.

JMIR Med Inform

September 2025

Departments of Radiology, The Third Affiliated Hospital, Sun Yat-Sen University, 600 Tianhe Road, Guangzhou, Guangdong, 510630, China, 86 18922109279, 86 20852523108.

Background: Despite the Coronary Artery Reporting and Data System (CAD-RADS) providing a standardized approach, radiologists continue to favor free-text reports. This preference creates significant challenges for data extraction and analysis in longitudinal studies, potentially limiting large-scale research and quality assessment initiatives.

Objective: To evaluate the ability of the generative pre-trained transformer (GPT)-4o model to convert real-world coronary computed tomography angiography (CCTA) free-text reports into structured data and automatically identify CAD-RADS categories and P categories.

View Article and Find Full Text PDF

Motivation: Recent pandemics have revealed significant gaps in our understanding of viral pathogenesis, exposing an urgent need for methods to identify and prioritize key host proteins (host factors) as potential targets for antiviral treatments. De novo generation of experimental datasets is limited by their heterogeneity, and for looming future pandemics, may not be feasible due to limitations of experimental approaches.

Results: Here we present TransFactor, a computational framework for predicting and prioritizing candidate host factors using only protein sequence data.

View Article and Find Full Text PDF

Background: The use of artificial intelligence platforms by medical residents as an educational resource is increasing. Within orthopaedic surgery, older Chat Generative Pre-trained Transformer (ChatGPT) models performed worse than resident physicians on practice examinations and rarely answered questions with images correctly. The newer ChatGPT-4o was designed to improve these deficiencies but has not been evaluated.

View Article and Find Full Text PDF

Diffuse large B-cell lymphoma is the most common type of non-Hodgkin lymphoma (NHL) in humans, accounting for about 30-40% of NHL cases worldwide. Canine diffuse large B-cell lymphoma (cDLBCL) is the most common lymphoma subtype in dogs and demonstrates an aggressive biologic behaviour. For tissue biopsies, current confirmatory diagnostic approaches for enlarged lymph nodes rely on expert histopathological assessment, which is time-consuming and requires specialist expertise.

View Article and Find Full Text PDF

Background: Undifferentiated pleomorphic sarcoma (UPS) is a prevalent soft tissue sarcoma subtype associated with poor prognosis. Current prognostic tools lack the ability to incorporate personalized data for predicting survival. Machine learning (ML) offers a potential solution to enhance survival prediction accuracy.

View Article and Find Full Text PDF