Automated MRI protocoling in neuroradiology in the era of large language models.

Lara Noelle Reiner, Moudather Chelbi, Leonard Fetscher, Juliane C Stöckel, Christoph Csapó-Schmidt, Shakhnaz Guseynova, Fares Al Mohamad, Keno Kyrill Bressem, Jawed Nawabi, Eberhard Siebert, Mike P Wattjes, Michael Scheel, Aymen Meddeb

Radiol Med

Department of Neuroradiology, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353, Berlin, Germany.

Published: July 2025

Purpose: This study investigates the automation of MRI protocoling, a routine task in radiology, using large language models (LLMs). It compares an open-source model (Llama 3.1 405B) and a proprietary model (GPT-4o), each with and without retrieval-augmented generation (RAG), a method for incorporating domain-specific knowledge.

Materials and Methods: This retrospective study included MRI studies conducted between January and December 2023, along with institution-specific protocol assignment guidelines. Clinical questions were extracted, and a neuroradiologist established the gold standard protocol. LLMs were tasked with assigning MRI protocols and deciding on contrast medium administration, with and without RAG. The results were compared to protocols selected by four radiologists. Token-based symmetric accuracy, the Wilcoxon signed-rank test, and the McNemar test were used for evaluation.
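The evaluation named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define "token-based symmetric accuracy", so `symmetric_token_accuracy` is a hypothetical stand-in that averages the two directional token overlaps, and the function names and sequence labels are assumptions chosen for illustration. `mcnemar_exact` is the standard two-sided exact (binomial) form of the McNemar test on the discordant pair counts.

```python
from math import comb


def symmetric_token_accuracy(predicted: str, reference: str) -> float:
    """Hypothetical metric (assumption, not the paper's definition):
    mean of the share of predicted tokens found in the reference and
    the share of reference tokens found in the prediction."""
    pred = set(predicted.lower().split())
    ref = set(reference.lower().split())
    if not pred and not ref:
        return 1.0  # two empty protocols agree trivially
    if not pred or not ref:
        return 0.0  # one side empty, nothing can overlap
    shared = len(pred & ref)
    return 0.5 * (shared / len(pred) + shared / len(ref))


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: cases where only the first method was correct
    c: cases where only the second method was correct
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    # Binomial(n, 0.5) probability of a count at most min(b, c), doubled.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Illustrative sequence lists, not taken from the study data.
    print(symmetric_token_accuracy("T1 T2 FLAIR DWI", "T1 T2 FLAIR SWI"))
    print(mcnemar_exact(0, 5))
```

A sketch like this suits a paired design, in which the same cases are scored with and without RAG: only the cases where the two conditions disagree carry information for the McNemar test.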

Results: Data from 100 neuroradiology reports (mean age = 54.2 years ± 18.41, women 50%) were included. RAG integration significantly improved accuracy in sequence and contrast media prediction for Llama 3.1 (sequences: 38% vs. 70%, P < .001; contrast media: 77% vs. 94%, P < .001) and GPT-4o (sequences: 43% vs. 81%, P < .001; contrast media: 79% vs. 92%, P = .006). GPT-4o outperformed Llama 3.1 in MRI sequence prediction (81% vs. 70%, P < .001), with accuracy comparable to that of the radiologists (81% ± 0.21, P = .43). Both models matched the radiologists in predicting contrast media administration (Llama 3.1 RAG: 94% vs. 91% ± 0.2, P = .37; GPT-4o RAG: 92% vs. 91% ± 0.24, P = .48).

Conclusion: Large language models show great potential as decision-support tools for MRI protocoling, with performance similar to radiologists. RAG enhances the ability of LLMs to provide accurate, institution-specific protocol recommendations.

DOI: http://dx.doi.org/10.1007/s11547-025-02040-9

Similar Publications

Evaluating large language model-generated brain MRI protocols: performance of GPT4o, o3-mini, DeepSeek-R1 and Qwen2.5-72B.

Eur Radiol

September 2025

Institute of Diagnostic and Interventional Neuroradiology, TUM University Hospital, School of Medicine and Health, Technical University of Munich, Munich, Germany.

Su Hwan Kim, Severin Schramm, Lena Schmitzer, Kerem Serguen, Sebastian Ziegelmayer

Objectives: To evaluate the potential of LLMs to generate sequence-level brain MRI protocols.

Materials and Methods: This retrospective study employed a dataset of 150 brain MRI cases derived from local imaging request forms. Reference protocols were established by two neuroradiologists.

An Institutional Large Language Model for Musculoskeletal MRI Improves Protocol Adherence and Accuracy.

J Bone Joint Surg Am

July 2025

Department of Diagnostic Imaging, National University Hospital, Singapore.

James Thomas Patrick Decourcy Hallinan, Naomi Wenxin Leow, Yi Xian Low, Aric Lee, Wilson Ong

Background: Privacy-preserving large language models (PP-LLMs) hold potential for assisting clinicians with documentation. We evaluated a PP-LLM to improve the clinical information on radiology request forms for musculoskeletal magnetic resonance imaging (MRI) and to automate protocoling, which ensures that the most appropriate imaging is performed.

Methods: The present retrospective study included musculoskeletal MRI radiology request forms that had been randomly collected from June to December 2023.

Machine Learning and Deep Learning Models for Automated Protocoling of Emergency Brain MRI Using Text from Clinical Referrals.

Radiol Artif Intell

May 2025

Department of Radiology, Turku University Hospital & University of Turku, Kiinamyllynkatu 4-8, 20521 Turku, Finland.

Heidi J Huhtanen, Mikko J Nyman, Antti Karlsson, Jussi Hirvonen

Purpose To develop and evaluate machine learning and deep learning-based models for automated protocoling of emergency brain MRI scans based on clinical referral text. Materials and Methods In this single-institution, retrospective study of 1953 emergency brain MRI referrals from January 2016 to January 2019, two neuroradiologists labeled the imaging protocol and use of contrast agent as the reference standard. Three machine learning algorithms (naive Bayes, support vector machine, and XGBoost) and two pretrained deep learning models (Finnish bidirectional encoder representations from transformers [BERT] and generative pretrained transformer [GPT]-3.

Abdominal and Pelvic MRI Protocol Prediction Using Natural Language Processing.

J Imaging Inform Med

January 2025

Department of Radiology, Mayo Clinic, Rochester, MN, USA.

Joshua D Warner, Robert P Hartman, Daniel J Blezek, John V Thomas

Exam protocoling is a significant non-interpretive task burden for radiologists. The purpose of this work was to develop a natural language processing (NLP) artificial intelligence (AI) solution for automated protocoling of standard abdomen and pelvic magnetic resonance imaging (MRI) exams from basic associated order information and patient metadata. This Institutional Review Board exempt retrospective study used de-identified metadata from consecutive adult abdominal and pelvic MRI scans performed at our institution spanning 2.
