Leveraging GPT-4o for Automated Extraction and Categorization of CAD-RADS Features From Free-Text Coronary CT Angiography Reports: Diagnostic Study.

JMIR Med Inform

Departments of Radiology, The Third Affiliated Hospital, Sun Yat-Sen University, 600 Tianhe Road, Guangzhou, Guangdong, 510630, China, 86 18922109279, 86 20852523108.

Published: September 2025


Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

Background: Despite the Coronary Artery Reporting and Data System (CAD-RADS) providing a standardized approach, radiologists continue to favor free-text reports. This preference creates significant challenges for data extraction and analysis in longitudinal studies, potentially limiting large-scale research and quality assessment initiatives.

Objective: To evaluate the ability of the generative pre-trained transformer (GPT)-4o model to convert real-world coronary computed tomography angiography (CCTA) free-text reports into structured data and automatically identify CAD-RADS categories and P categories.

Methods: This retrospective study analyzed CCTA reports from January 2024 and July 2024. A subset of 25 reports was used for prompt engineering to instruct the large language models (LLMs) in extracting CAD-RADS categories, P categories, and the presence of myocardial bridges and noncalcified plaques. Reports were processed using the GPT-4o API (application programming interface) and custom Python scripts. The ground truth was established by radiologists based on the CAD-RADS 2.0 guidelines. Model performance was assessed using accuracy, sensitivity, specificity, and F1-score. Intrarater reliability was assessed using Cohen κ coefficient.

Results: Among 999 patients (median age 66 y, range 58-74; 650 males), CAD-RADS categorization showed accuracy of 0.98-1.00 (95% CI 0.9730-1.0000), sensitivity of 0.95-1.00 (95% CI 0.9191-1.0000), specificity of 0.98-1.00 (95% CI 0.9669-1.0000), and F1-score of 0.96-1.00 (95% CI 0.9253-1.0000). P categories demonstrated accuracy of 0.97-1.00 (95% CI 0.9569-0.9990), sensitivity from 0.90 to 1.00 (95% CI 0.8085-1.0000), specificity from 0.97 to 1.00 (95% CI 0.9533-1.0000), and F1-score from 0.91 to 0.99 (95% CI 0.8377-0.9967). Myocardial bridge detection achieved an accuracy of 0.98 (95% CI 0.9680-0.9870), and noncalcified coronary plaques detection showed an accuracy of 0.98 (95% CI 0.9680-0.9870). Cohen κ values for all classifications exceeded 0.98.

Conclusions: The GPT-4o model efficiently and accurately converts CCTA free-text reports into structured data, excelling in CAD-RADS classification, plaque burden assessment, and detection of myocardial bridges and calcified plaques.

Download full-text PDF

Source
http://dx.doi.org/10.2196/70967DOI Listing
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12422720PMC

Publication Analysis

Top Keywords

free-text reports
12
95%
10
gpt-4o model
8
ccta free-text
8
reports structured
8
structured data
8
cad-rads categories
8
myocardial bridges
8
098-100 95%
8
100 95%
8

Similar Publications

Large language models (LLMs) have been successfully used for data extraction from free-text radiology reports. Most current studies were conducted with LLMs accessed via an application programming interface (API). We evaluated the feasibility of using open-source LLMs, deployed on limited local hardware resources for data extraction from free-text mammography reports, using a common data element (CDE)-based structure.

View Article and Find Full Text PDF

Adults with chronic low back pain hold negative beliefs towards running: a mixed methods study.

J Sci Med Sport

August 2025

Eastern Health Clinical School, Monash University, Australia; Eastern Health Emergency Medicine Program, Australia. Electronic address:

Objectives: To explore differences in beliefs towards running in adults with and without chronic low back pain.

Design: This convergent mixed methods cross-sectional study compared adults with chronic low back pain (n = 39) to pain-free adults with a history of chronic low back pain (n = 28) and a low back pain naive control group (n = 71).

Methods: Beliefs towards running (activity specific beliefs questionnaire; range: 1-4 points), pain intensity (101-point visual analogue scale), disability (Oswestry Disability Index), and habitual physical activity (International Physical Activity Questionnaire) were analysed.

View Article and Find Full Text PDF

Introduction: Multimorbidity contributes significantly to poor population health outcomes while straining healthcare systems. Although some multimorbid patients experience an accelerated health decline (a decline in well-being or functional status that cannot be attributed to the natural ageing-related health deterioration), others can remain stable for years. Identifying risk factors for accelerated health decline in persons with multimorbidity could help prevent complications and reduce unnecessary interventions.

View Article and Find Full Text PDF

Leveraging GPT-4o for Automated Extraction and Categorization of CAD-RADS Features From Free-Text Coronary CT Angiography Reports: Diagnostic Study.

JMIR Med Inform

September 2025

Departments of Radiology, The Third Affiliated Hospital, Sun Yat-Sen University, 600 Tianhe Road, Guangzhou, Guangdong, 510630, China, 86 18922109279, 86 20852523108.

Background: Despite the Coronary Artery Reporting and Data System (CAD-RADS) providing a standardized approach, radiologists continue to favor free-text reports. This preference creates significant challenges for data extraction and analysis in longitudinal studies, potentially limiting large-scale research and quality assessment initiatives.

Objective: To evaluate the ability of the generative pre-trained transformer (GPT)-4o model to convert real-world coronary computed tomography angiography (CCTA) free-text reports into structured data and automatically identify CAD-RADS categories and P categories.

View Article and Find Full Text PDF

Background: Electronic health records (EHRs) are a cornerstone of modern health care delivery, but their current configuration often fragments information across systems, impeding timely and effective clinical decision-making. In gynecological oncology, where care involves complex, multidisciplinary coordination, these limitations can significantly impact the quality and efficiency of patient management. Few studies have examined how EHR systems support clinical decision-making from the perspective of end users.

View Article and Find Full Text PDF