A PHP Error was encountered

Severity: Warning

Message: file_get_contents(https://...@gmail.com&api_key=61f08fa0b96a73de8c900d749fcb997acc09&a=1): Failed to open stream: HTTP request failed! HTTP/1.1 429 Too Many Requests

Filename: helpers/my_audit_helper.php

Line Number: 197

Backtrace:

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 197
Function: file_get_contents

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 271
Function: simplexml_load_file_from_url

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 3165
Function: getPubMedXML

File: /var/www/html/application/controllers/Detail.php
Line: 597
Function: pubMedSearch_Global

File: /var/www/html/application/controllers/Detail.php
Line: 511
Function: pubMedGetRelatedKeyword

File: /var/www/html/index.php
Line: 317
Function: require_once

Evaluating large language model-generated brain MRI protocols: performance of GPT4o, o3-mini, DeepSeek-R1 and Qwen2.5-72B. | LitMetric

Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

Objectives: To evaluate the potential of LLMs to generate sequence-level brain MRI protocols.

Materials And Methods: This retrospective study employed a dataset of 150 brain MRI cases derived from local imaging request forms. Reference protocols were established by two neuroradiologists. GPT-4o, o3-mini, DeepSeek-R1 and Qwen2.5-72B were employed to generate brain MRI protocols based on the case descriptions. Protocol generation was conducted (1) with additional in-context learning involving local standard protocols (enhanced) and (2) without additional information (base). Additionally, two radiology residents independently defined MRI protocols. The sum of redundant and missing sequences (accuracy index) was defined as performance metric. Accuracy indices were compared between groups using paired t-tests.

Results: The two neuroradiologists achieved substantial inter-rater agreement (Cohen's κ = 0.74). o3-mini demonstrated superior performance (base: 2.65 ± 1.61; enhanced: 1.94 ± 1.25), followed by GPT-4o (base: 3.11 ± 1.83; enhanced: 2.23 ± 1.48), DeepSeek-R1 (base: 3.42 ± 1.84; enhanced: 2.37 ± 1.42) and Qwen2.5-72B (base: 5.95 ± 2.78; enhanced: 2.75 ± 1.54). o3-mini consistently outperformed the other models with a significant margin. All four models showed highly significant performance improvements under the enhanced condition (adj. p < 0.001 for all models). The highest-performing LLM (o3-mini [enhanced]) yielded an accuracy index comparable to residents (o3-mini [enhanced]: 1.94 ± 1.25, resident 1: 1.77 ± 1.29, resident 2: 1.77 ± 1.28).

Conclusion: Our findings demonstrate the promising potential of LLMs in automating brain MRI protocoling, especially when augmented through in-context learning. o3-mini exhibited superior performance, followed by GPT-4o.

Key Points: QuestionBrain MRI protocoling is a time-consuming, non-interpretative task, exacerbating radiologist workload. Findingso3-mini demonstrated superior brain MRI protocoling performance. All models showed notable improvements when augmented with local standard protocols. Clinical relevanceMRI protocoling is a time-intensive, non-interpretative task that adds to radiologist workload; large language models offer potential for (semi-)automation of this process.

Download full-text PDF

Source
http://dx.doi.org/10.1007/s00330-025-11989-0DOI Listing

Publication Analysis

Top Keywords

brain mri
16
mri protocols
12
o3-mini deepseek-r1
8
deepseek-r1 qwen25-72b
8
enhanced
6
mri
5
protocols
5
base
5
evaluating large
4
large language
4

Similar Publications