Article Abstract

Large Language Models (LLMs) hold potential as clinical decision support tools, particularly when integrated with domain-specific knowledge. In radiology, there is limited research on LLMs for assessing imaging appropriateness. This study evaluates a contextualized GPT-4-based LLM's performance in assessing the appropriateness of musculoskeletal MRI scan requests, comparing standard models against differently optimized versions. The LLMs' performance was also compared against that of human clinicians with varying experience (two radiology residents, two subspecialist attendings, and an orthopaedic surgeon). Using a retrieval-augmented generation framework, the LLM was provided with a domain-specific knowledge base drawn from 33 American College of Radiology Appropriateness Criteria guidelines. A test dataset of 70 fictional case scenarios was created, including cases with insufficient clinical information. Quantitative analysis using the McNemar mid-P test revealed that the optimized LLM achieved 92.86% accuracy, significantly outperforming the baseline model (61.43%, P < .001) and the standard GPT-4 model (51.29%, P < .001). The optimized model also excelled in identifying cases with insufficient clinical information. In comparison to human clinicians, the optimized LLM performed better than all but one radiologist. This study demonstrates that with contextualization and optimization, GPT-4-based LLMs can improve performance in assessing imaging appropriateness and show promise as clinical decision support tools in radiology.
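The abstract names the McNemar mid-P test but does not show how it is computed. The following is a minimal sketch of one common formulation (twice the one-sided mid-p from a Binomial(n, 0.5) distribution over the discordant pairs), not the authors' analysis code. The function name `mcnemar_mid_p` and the example counts are assumptions made for illustration; the abstract does not report the discordant-pair counts.

```python
from math import comb


def mcnemar_mid_p(b: int, c: int) -> float:
    """Two-sided McNemar mid-p value for paired binary outcomes.

    b: cases model A got right and model B got wrong (discordant pairs)
    c: cases model A got wrong and model B got right (discordant pairs)
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        # No discordant pairs: no evidence of a difference.
        return 1.0
    k = min(b, c)

    def pmf(x: int) -> float:
        # Binomial(n, 0.5) probability mass at x.
        return comb(n, x) * 0.5 ** n

    # One-sided mid-p: P(X < k) + 0.5 * P(X = k), then doubled.
    one_sided = sum(pmf(x) for x in range(k)) + 0.5 * pmf(k)
    return min(1.0, 2.0 * one_sided)


# Hypothetical counts, NOT taken from the study.
example_p = mcnemar_mid_p(1, 9)
print(example_p)  # 0.01171875
```

Only the cases on which the two models disagree carry information here, which is why the test suits comparisons of two models scored on the same 70 scenarios.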

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11871060
DOI: http://dx.doi.org/10.1038/s41598-025-88925-1

Publication Analysis

Top Keywords

performance assessing: 8
musculoskeletal mri: 8
mri scan: 8
appropriateness criteria: 8
domain-specific knowledge: 8
appropriateness: 5
chatgpt performance: 4
assessing musculoskeletal: 4
scan appropriateness: 4
appropriateness based: 4

Similar Publications

An aptasensor-based fluorescent signal amplification strategy for highly sensitive detection of mycotoxins.

Anal Methods

September 2025

Key Laboratory of Biorheological Science and Technology of Ministry of Education, College of Bioengineering, Chongqing University, Chongqing 400044, P. R. China.

Aflatoxin B1 (AFB1) is one of the most toxic mycotoxins that pose great health threats to humans. Herein, an aptasensor-based fluorescent signal amplification strategy is developed for the detection of AFB1. Initially, the AFB1 aptamers labelled with carboxyfluorescein (FAM) are adsorbed onto graphene oxide (GO), triggering energy transfer.

The effect of non-functionalized polystyrene nanoparticles (PS-NPs) with diameters of 29, 44, and 72 nm on plasmid DNA integrity and the expression of genes involved in the architecture of chromatin was investigated in human peripheral blood mononuclear cells (PBMCs). The cells were incubated with PS-NPs at concentrations ranging from 0.001 to 100 µg/mL for 24 hours.

Background: Intensive language-action therapy treats language deficits and depressive symptoms in chronic poststroke aphasia, yet the underlying neural mechanisms remain underexplored. Long-range temporal correlations (LRTCs) in blood oxygenation level-dependent signals indicate persistence in brain activity patterns and may relate to learning and levels of depression. This observational study investigates blood oxygenation level-dependent LRTC changes alongside therapy-induced language and mood improvements in perisylvian and domain-general brain areas.

Objectives: The risk of major venous thromboembolism (VTE) among patients with COVID-19 is high but varies with disease severity. This study aimed to estimate the incidence of lower extremity deep venous thrombosis (DVT) in critically ill hospitalized patients with COVID-19, validate the Wells score for DVT diagnosis, and determine patients' prognosis.

Methods: This was an observational follow-up study in the context of the diagnosis and prognosis of DVT.

Background: Poststroke cognitive impairment (PSCI) affects 30% to 50% of stroke survivors, severely impacting functional outcomes and quality of life. This study uses functional near-infrared spectroscopy (fNIRS) to assess task-evoked brain activation and its potential for stratifying the severity in patients with PSCI.

Method: A cross-sectional study was conducted at Nanchong Central Hospital between June 2023 and April 2024.