Article Abstract

Background: Febrile illness in returned travellers presents a diagnostic challenge in non-endemic settings. Chat generative pretrained transformer (ChatGPT) has the potential to assist in medical tasks, yet its diagnostic performance in clinical settings has rarely been evaluated. We conducted a validation assessment of ChatGPT-4o's performance in the workup of fever in returning travellers.

Methods: We retrieved the medical records of returning travellers hospitalized with fever during 2009-2024. Their clinical scenarios at the time of presentation to the emergency department were entered as prompts to ChatGPT-4o, using a detailed uniform format. The model was further prompted with four consistent questions concerning the differential diagnosis and recommended workup. To avoid training, we kept the model blinded to the final diagnosis. Our primary outcome was ChatGPT-4o's success rate in predicting the final diagnosis when requested to specify the top three differential diagnoses. Secondary outcomes were success rates when prompted to specify the single most likely diagnosis, and all necessary diagnostics. We also assessed ChatGPT-4o as a predictive tool for malaria and qualitatively evaluated its failures.

Results: ChatGPT-4o predicted the final diagnosis in 68% [95% confidence interval (CI) 59-77%], 78% (95% CI 69-85%) and 83% (95% CI 74-89%) of the 114 cases, when prompted to specify the most likely diagnosis, top three diagnoses and all possible diagnoses, respectively. ChatGPT-4o showed a sensitivity of 100% (95% CI 93-100%) and a specificity of 94% (95% CI 85-98%) for predicting malaria. The model failed to provide the final diagnosis in 18% (20/114) of cases, primarily by failing to predict globally endemic infections (16/21, 76%).
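The proportions and confidence intervals reported above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' method: the abstract does not say which interval method was used, so the Wilson score interval is an assumption here, and the count of 89 correct cases is inferred from rounding 78% of 114 rather than taken from the paper. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Assumed count: 78% of 114 cases is roughly 89 correct top-three predictions.
low, high = wilson_interval(89, 114)
print(f"{89 / 114:.0%} (95% CI {low:.0%}-{high:.0%})")
```

Other interval methods (for example, an exact binomial interval) give slightly different bounds, so small differences from the published figures would not be surprising.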

Conclusions: ChatGPT-4o demonstrated high diagnostic accuracy when prompted with real-life scenarios of febrile returning travellers presenting to the emergency department, especially for malaria. Model training is expected to yield an improved performance and facilitate diagnostic decision-making in the field.

Source
http://dx.doi.org/10.1093/jtm/taaf005

Similar Publications

Importance: Patients with advanced cancer frequently receive broad-spectrum antibiotics, but changing use patterns across the end-of-life trajectory remain poorly understood.

Objective: To describe the patterns of broad-spectrum antibiotic use across defined end-of-life intervals in patients with advanced cancer.

Design, Setting, And Participants: This nationwide, population-based, retrospective cohort study used data from the South Korean National Health Insurance Service database to examine broad-spectrum antibiotic use among patients with advanced cancer who died between July 1, 2002, and December 31, 2021.

Importance: Patients with kidney failure (KF) receiving long-term dialysis have increased incidence of atrial fibrillation (AF). Patients with KF and AF have increased risk of stroke, death, and bleeding compared with age-matched cohorts. In KF, the use of oral anticoagulants (OACs) increases hemorrhage risk, offsetting potential benefits and making left atrial appendage occlusion (LAAO) a potentially promising solution for risk reduction in AF.

Objectives: Patients diagnosed with amyotrophic lateral sclerosis (ALS) typically describe symptoms of fatigue. Despite this frequency, the underlying mechanisms of fatigue are poorly understood, and are likely multifactorial. To help clarify mechanisms, the present systematic review was undertaken to determine the risk factors related to fatigue in ALS.

Purpose: The purpose of this study was to determine through a Delphi process a list of outcomes measures for clinicians to use when assessing individuals with Lumbar Spinal Stenosis (LSS).

Methods: A three-phase Delphi process was conducted by the International Society for the Study of the Lumbar Spine (ISSLS) Lumbar Spinal Stenosis Taskforce, including two online surveys, two virtual meetings, and three in-person consensus meetings at the ISSLS annual conferences (2023-2025). Participants evaluated and ranked outcome measures for LSS, with final endorsement requiring > 66% agreement.
