An Institutional Large Language Model for Musculoskeletal MRI Improves Protocol Adherence and Accuracy.

James Thomas Patrick Decourcy Hallinan, Naomi Wenxin Leow, Yi Xian Low, Aric Lee, Wilson Ong, Matthew Ding Zhou Chan, Ganakirthana Kalpenya Devi, Stephanie Shengjie He, Daniel De-Liang Loh, Desmond Shi Wei Lim, Xi Zhen Low, Ee Chin Teo, Shaheryar Mohammad Furqan, Wilson Wei Yang Tham, Jiong Hao Tan, Naresh Kumar, Andrew Makmur, Ting Yonghan

J Bone Joint Surg Am

Department of Diagnostic Imaging, National University Hospital, Singapore.

Published: July 2025

Background: Privacy-preserving large language models (PP-LLMs) hold potential for assisting clinicians with documentation. We evaluated a PP-LLM to improve the clinical information on radiology request forms for musculoskeletal magnetic resonance imaging (MRI) and to automate protocoling, which ensures that the most appropriate imaging is performed.

Methods: The present retrospective study included musculoskeletal MRI radiology request forms that had been randomly collected from June to December 2023. Studies without electronic medical record (EMR) entries were excluded. An institutional PP-LLM (Claude Sonnet 3.5) augmented the original radiology request forms by mining EMRs and, in combination with rule-based processing of the LLM outputs, suggested appropriate protocols using institutional guidelines. Clinical information on the original and PP-LLM radiology request forms was compared with use of the RI-RADS (Reason for exam Imaging Reporting and Data System) grading by 2 musculoskeletal (MSK) radiologists independently (MSK1, with 13 years of experience, and MSK2, with 11 years of experience). These radiologists established a consensus reference standard for protocoling, against which the PP-LLM and 2 second-year board-certified radiologists (RAD1 and RAD2) were compared. Inter-rater reliability was assessed with use of the Gwet AC1, and the percentage agreement with the reference standard was calculated.
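
For readers unfamiliar with the Gwet AC1 statistic named in the Methods, the following is a minimal sketch of the standard two-rater formula, AC1 = (p_a - p_e) / (1 - p_e), where p_a is the observed agreement and p_e is the chance agreement computed from the average category prevalence across both raters. The function name, the category labels, and the example ratings are hypothetical and are not taken from the study; the abstract does not describe the software or exact variant the authors used.

```python
from collections import Counter


def gwet_ac1(ratings_a, ratings_b, categories=None):
    """Gwet's AC1 agreement coefficient for two raters and nominal categories."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of items.")
    if categories is None:
        categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("At least two categories are required.")

    n = len(ratings_a)
    # Observed agreement: fraction of items given the same grade by both raters.
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Average prevalence of each category across the two raters.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    pi = [(count_a[c] + count_b[c]) / (2 * n) for c in categories]

    # Chance agreement under Gwet's model.
    p_e = sum(p * (1 - p) for p in pi) / (q - 1)
    return (p_a - p_e) / (1 - p_e)


# Hypothetical grades for 10 request forms (illustration only, not study data).
rater_1 = ["adequate"] * 8 + ["limited"] * 2
rater_2 = ["adequate"] * 7 + ["limited"] * 3
print(round(gwet_ac1(rater_1, rater_2), 2))  # 0.84
```

In this toy example the raters agree on 9 of 10 forms (p_a = 0.9) and the chance agreement is 0.375, which gives AC1 = 0.84. Unlike the Cohen kappa, AC1 stays stable when one category dominates, which is a common reason for choosing it when most forms fall into a single grade.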

Results: Overall, 500 musculoskeletal MRI radiology request forms were analyzed for 407 patients (202 women and 205 men with a mean age [and standard deviation] of 50.3 ± 19.5 years) across a range of anatomical regions, including the spine/pelvis (143 MRI scans; 28.6%), upper extremity (169 scans; 33.8%) and lower extremity (188 scans; 37.6%). Two hundred and twenty-two (44.4%) of the 500 MRI scans required contrast. The clinical information provided in the PP-LLM-augmented radiology request forms was rated as superior to that in the original requests. Only 0.4% to 0.6% of PP-LLM radiology request forms were rated as limited/deficient, compared with 12.4% to 22.6% of the original requests (p < 0.001). Almost-perfect inter-rater reliability was observed for LLM-enhanced requests (AC1 = 0.99; 95% confidence interval [CI], 0.99 to 1.0), compared with substantial agreement for the original forms (AC1 = 0.62; 95% CI, 0.56 to 0.67). For protocoling, MSK1 and MSK2 showed almost-perfect agreement on the region/coverage (AC1 = 0.96; 95% CI, 0.95 to 0.98) and contrast requirement (AC1 = 0.98; 95% CI, 0.97 to 0.99). Compared with the consensus reference standard, protocoling accuracy for the PP-LLM was 95.8% (95% CI, 94.0% to 97.6%), which was significantly higher than that for both RAD1 (88.6%; 95% CI, 85.8% to 91.4%) and RAD2 (88.2%; 95% CI, 85.4% to 91.0%) (p < 0.001 for both).
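
As a worked check of the protocoling accuracy figures, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. The abstract does not state which interval method was used, so the Wald method is an assumption, and the counts of correct protocols (479, 443, and 441 of 500) are back-calculated from the reported percentages rather than reported directly. Under those assumptions, the results match the published intervals to one decimal place.

```python
import math


def wald_ci(successes, n, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts inferred from the reported percentages of 500 requests (assumption).
inferred_correct = {"PP-LLM": 479, "RAD1": 443, "RAD2": 441}

for reader, correct in inferred_correct.items():
    p, lower, upper = wald_ci(correct, 500)
    print(f"{reader}: {100 * p:.1f}% ({100 * lower:.1f}% to {100 * upper:.1f}%)")
# PP-LLM: 95.8% (94.0% to 97.6%)
# RAD1: 88.6% (85.8% to 91.4%)
# RAD2: 88.2% (85.4% to 91.0%)
```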

Conclusions: Musculoskeletal MRI request form augmentation with an institutional LLM provided superior clinical information and improved protocoling accuracy compared with clinician requests and non-MSK-trained radiologists. Institutional adoption of such LLMs could enhance the appropriateness of MRI utilization and patient care.

Level of Evidence: Diagnostic Level III. See Instructions for Authors for a complete description of levels of evidence.

DOI: http://dx.doi.org/10.2106/JBJS.24.01429

Similar Publications

Value of thoracic ultrasound including focused cardiac ultrasound in daily practice of outpatient chest clinic.

Multidiscip Respir Med

September 2025

Department of Chest Diseases, Faculty of Medicine, Al-Azhar University, Cairo, Egypt.

Moaz Atef, Houssam Eldin Hassanin, Ahmed M Ewis, Ahmed A Hassan, Ashraf Moursi

Background: Chest examination alone may be insufficient to identify cardiorespiratory diseases, especially in their early stages or silent forms, and it is impractical to request a chest radiograph (CXR) and a cardiac consultation for every patient in the outpatient clinic. Incorporating chest ultrasound (US) and focused cardiac ultrasound (FoCUS) into the bedside practice of the outpatient chest clinic may therefore influence the clinical diagnosis and management plan.

Objective: To determine how bedside thoracic US including FoCUS can alter the clinical diagnosis in patients who are clinically diagnosed with acute bronchitis in the outpatient chest clinic.

Subjects And Methods: This study was conducted at the chest outpatient clinic, Al-Azhar University, between January 2024 and March 2025.

Health risk assessment for severe COVID-19 in Taiwan: a multi-centre electronic health record study.

J Glob Health

September 2025

International Ph.D. Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan.

Yu-Hui Chang, Whitney Burton, Phung-Anh Nguyen, Do Duy Khang, Chang-I Chen

Background: As the global battle against COVID-19 continues, understanding the factors contributing to severe outcomes remains critical for public health strategies. We aim to identify the determinants significantly influencing severe COVID-19 infection and mortality among the general population in Taiwan.

Methods: We conducted a retrospective cohort study using data extracted from the Taipei Medical University Clinical Research Database from 1 January 2022 to 31 December 2022.

Risk-minimizing tube current and tube voltage modulation for CT: A simulation study.

Med Phys

August 2025

Division of X-Ray Imaging and CT, German Cancer Research Center (DKFZ), Heidelberg, Germany.

Edith Baader, Marc Kachelrieß

Background: The optimal tube voltage in clinical CT depends on the patient's attenuation and the imaging task. Although the patient's attenuation changes with view angle and longitudinal position of the X-ray tube, the tube voltage remains constant throughout the scan in current clinical practice. In general, the optimum tube voltage increases with patient diameter.

Evaluating large language model-generated brain MRI protocols: performance of GPT4o, o3-mini, DeepSeek-R1 and Qwen2.5-72B.

Eur Radiol

September 2025

Institute of Diagnostic and Interventional Neuroradiology, TUM University Hospital, School of Medicine and Health, Technical University of Munich, Munich, Germany.

Su Hwan Kim, Severin Schramm, Lena Schmitzer, Kerem Serguen, Sebastian Ziegelmayer

Objectives: To evaluate the potential of LLMs to generate sequence-level brain MRI protocols.

Materials And Methods: This retrospective study employed a dataset of 150 brain MRI cases derived from local imaging request forms. Reference protocols were established by two neuroradiologists.

How Do Patient Outcomes in Mechanical Thrombectomy for Large-Core Stroke Vary Based on Neuroimaging Modalities Used for Patient Selection? A Multicenter Multinational Study.

Transl Stroke Res

September 2025

Neurosurgical Service, Beth Israel Deaconess Medical Center, Harvard Medical School, 110 Francis Street, Boston, MA, 02115, USA.

Omar Alwakaa, Rahim Abo Kasem, Felipe Ramirez-Velandia, Aryan Wadhwa, Kimberly Han

The role of different imaging modalities, namely non-contrast CT (NCCT), CT perfusion (CTP), and diffusion-weighted imaging (DWI), in selecting patients with large-core stroke for endovascular thrombectomy (EVT) is a subject of ongoing debate. This study aims to determine whether patients with large-core acute ischemic stroke (AIS) undergoing EVT who were triaged with CTP or DWI in addition to NCCT had different clinical outcomes compared with those triaged with NCCT alone. We queried the Stroke Thrombectomy and Aneurysm Registry (STAR) for patients enrolled between 2014 and 2023 who presented with anterior-circulation AIS and a large ischemic core (ASPECTS < 6) and who underwent EVT at 41 stroke centers in the USA, Europe, Asia, and South America.
