ChatGPT: Evaluating answers on contrast media related questions and finetuning by providing the model with the ESUR guideline on contrast agents.

Article Abstract

Objective: This study aimed to assess the feasibility of GPT-4 for answering questions related to contrast media with and without the context of the European Society of Urogenital Radiology (ESUR) guideline on contrast agents. The overarching goal was to determine whether contextual enrichment by providing guideline information improves answers of GPT-4 for clinical decision-making in radiology.

Methods: A set of 64 questions, based on the ESUR guideline on contrast agents and mirroring its pertinent sections, was developed and posed to GPT-4 both directly and after providing the guideline using a plugin. Experienced radiologists graded the responses for quality of information and for accuracy in pinpointing information from the guideline, and radiology residents graded them for utility, all using Likert scales.

Results: GPT-4's performance improved significantly with the guideline. Without the guideline, the average quality rating was 3.98, which increased to 4.33 with the guideline (p = 0036). In terms of accuracy, 82.3% of answers matched the information from the guideline. Utility scores also improved significantly with the guideline, from an average of 4.1 without it to 4.4 with it (p = 0.008), with a Fleiss' kappa of 0.44.

Conclusion: GPT-4, when contextually enriched with a guideline, demonstrates enhanced capability in providing guideline-backed recommendations. This approach holds promise for real-time clinical decision support, making guidelines more actionable. However, further refinement is needed to maximize the potential of large language models (LLMs), and their inherent limitations need to be addressed.

Source
http://dx.doi.org/10.1067/j.cpradiol.2024.04.005
