A PHP Error was encountered

Severity: Warning

Message: file_get_contents(https://...@gmail.com&api_key=61f08fa0b96a73de8c900d749fcb997acc09&a=1): Failed to open stream: HTTP request failed! HTTP/1.1 429 Too Many Requests

Filename: helpers/my_audit_helper.php

Line Number: 197

Backtrace:

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 197
Function: file_get_contents

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 271
Function: simplexml_load_file_from_url

File: /var/www/html/application/helpers/my_audit_helper.php
Line: 3165
Function: getPubMedXML

File: /var/www/html/application/controllers/Detail.php
Line: 597
Function: pubMedSearch_Global

File: /var/www/html/application/controllers/Detail.php
Line: 511
Function: pubMedGetRelatedKeyword

File: /var/www/html/index.php
Line: 317
Function: require_once

Assessing the performance of ChatGPT in medical ethical decision-making: a comparative study with USMLE-based scenarios. | LitMetric

Category Ranking

98%

Total Visits

921

Avg Visit Duration

2 minutes

Citations

20

Article Abstract

Introduction: The integration of artificial intelligence (AI) into healthcare introduces innovative possibilities but raises ethical, legal and professional concerns. Assessing the performance of AI in core components of the United States Medical Licensing Examination (USMLE), such as communication skills, ethics, empathy and professionalism, is crucial. This study evaluates how well ChatGPT versions 3.5 and 4.0 handle complex medical scenarios using USMLE-Rx, AMBOSS and UWorld question banks, aiming to understand its ability to navigate patient interactions according to medical ethics and standards.

Methods: We compiled 273 questions from AMBOSS, USMLE-Rx and UWorld, focusing on communication, social sciences, healthcare policy and ethics. GPT-3.5 and GPT-4 were tasked with answering and justifying their choices in new chat sessions to minimise model interference. Responses were compared against question bank rationales and average student performance to evaluate AI effectiveness in medical ethical decision-making.

Results: GPT-3.5 answered 38.9% correctly in AMBOSS, 54.1% in USMLE-Rx and 57.4% in UWorld, with rationale accuracy rates of 83.3%, 90.0% and 87.0%, respectively. GPT-4 answered 75.9% correctly in AMBOSS, 64.9% in USMLE-Rx and 79.6% in UWorld, with rationale accuracy rates of 85.4%, 88.9%, and 98.8%, respectively. Both versions generally scored below average student performance, except GPT-4 in UWorld.

Conclusion: ChatGPT, particularly version 4.0, shows potential in navigating ethical and interpersonal medical scenarios. However, human reasoning currently surpasses AI in average performance. Continued development and training of AI systems can enhance proficiency in these critical healthcare aspects.

Download full-text PDF

Source
http://dx.doi.org/10.1136/jme-2024-110240DOI Listing

Publication Analysis

Top Keywords

assessing performance
8
medical ethical
8
medical scenarios
8
average student
8
student performance
8
correctly amboss
8
uworld rationale
8
rationale accuracy
8
accuracy rates
8
medical
6

Similar Publications