Citations

20

Article Abstract

Background: Artificial intelligence (AI) technology can enable more efficient decision-making in healthcare settings. There is a growing interest in improving the speed and accuracy of AI systems in providing responses for given tasks in healthcare settings.

Objective: This study aimed to assess the reliability of ChatGPT in determining emergency department (ED) triage accuracy using the Korean Triage and Acuity Scale (KTAS).

Methods: Two hundred and two virtual patient cases were built. The gold standard triage classification for each case was established by an experienced ED physician. Three other human raters (ED paramedics) rated the virtual cases independently. The virtual cases were also rated by two versions of the Chat Generative Pre-trained Transformer (ChatGPT; versions 3.5 and 4.0). Inter-rater reliability was examined using Fleiss' kappa and the intraclass correlation coefficient (ICC).
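
As a rough illustration of the first agreement statistic named in the Methods, the sketch below computes Fleiss' kappa for a fixed number of raters assigning each case to one of the five KTAS levels. The function name `fleiss_kappa` and the toy ratings are hypothetical and are not the study's code or data.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for equal-sized groups of raters.

    ratings    -- list of cases; each case is a list of category labels,
                  one label per rater (same number of raters per case)
    categories -- list of all possible category labels
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")

    # counts[i][j] = number of raters who put case i in category j
    counts = []
    for case in ratings:
        tally = Counter(case)
        counts.append([tally[c] for c in categories])

    # Observed agreement: mean over cases of the share of agreeing rater pairs
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of ratings in each category
    total = n_cases * n_raters
    shares = [sum(row[j] for row in counts) / total for j in range(len(categories))]
    p_chance = sum(s * s for s in shares)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical toy data: 4 cases, 5 raters each, KTAS levels 1 to 5
toy_ratings = [
    [1, 1, 2, 1, 1],
    [3, 3, 3, 4, 3],
    [5, 4, 4, 5, 5],
    [2, 2, 2, 2, 3],
]
print(round(fleiss_kappa(toy_ratings, [1, 2, 3, 4, 5]), 3))
```

Note that Fleiss' kappa treats the levels as unordered categories, so a disagreement between levels 1 and 2 counts the same as one between levels 1 and 5.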

Results: The kappa values for agreement between the four human raters and ChatGPT were .523 (version 4.0) and .320 (version 3.5). Across the five KTAS levels, performance was poor for patients at levels 1 and 5, as well as for case scenarios with additional text descriptions. Accuracy differed between the two GPT versions: the ICC between version 3.5 and the gold standard was .520, and that between version 4.0 and the gold standard was .802.
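
The abstract does not state which form of the ICC was used. As a minimal sketch, assuming the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1), the coefficient between one rater and the gold standard could be computed as follows. The function name `icc2_1` and the toy scores are hypothetical and are not the study's data.

```python
def icc2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores -- list of cases; each case is a list of numeric ratings,
              one per rater (same raters, same order, for every case)
    """
    n = len(scores)        # number of cases
    k = len(scores[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 cases and the same >= 2 raters per case")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Raises ZeroDivisionError if every score is identical (no variance)
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical toy data: each row is [gold standard level, model level]
toy_scores = [[1, 2], [2, 2], [3, 3], [4, 3], [5, 4], [3, 3]]
print(round(icc2_1(toy_scores), 3))
```

Unlike Fleiss' kappa, this statistic treats the KTAS levels as numeric scores, so near misses are penalised less than large disagreements.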

Conclusions: A substantial level of inter-rater reliability was revealed when GPTs were used as KTAS raters. The current study showed the potential of using GPT in emergency healthcare settings. Considering the shortage of experienced manpower, this AI method may help improve triaging accuracy.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10798071
DOI: http://dx.doi.org/10.1177/20552076241227132

Publication Analysis

Top Keywords (keyword: count)

gold standard: 12
reliability chatgpt: 8
emergency department: 8
korean triage: 8
triage acuity: 8
acuity scale: 8
healthcare settings: 8
human raters: 8
virtual cases: 8
inter-rater reliability: 8

Similar Publications

Background: Patent foramen ovale (PFO) has been identified as a potential risk factor for cryptogenic stroke (CS). Although transesophageal echocardiography (TEE) is considered the gold standard for PFO detection, false-negative results remain a clinical concern, particularly in CS patients with high suspicion of PFO-related etiology.

Aims: To evaluate the clinical utility of transcatheter PFO exploration (TPFOE) in CS patients with negative TEE findings but high suspicion of PFO-related etiology.

This rapid systematic review aimed to evaluate the diagnostic accuracy (concurrent validity, predictive ability, reliability) of indirect calorimetry (IC) for measuring resting energy expenditure (REE) in adults with overweight or obesity. PubMed and Web of Science were searched for studies that measured REE by IC in adults with overweight or obesity and reported at least one primary outcome: concurrent validity, predictive ability, or reliability. Twenty-two studies evaluating 10 IC devices were included.

Pseudomonas aeruginosa (PA) is a major cause of antimicrobial resistance-related morbidity and mortality. The recent emergence of highly fatal infections caused by carbapenem-resistant PA has called for novel antimicrobial therapies and strategies. In this study, we highlight the therapeutic potential of ε-poly-L-lysine (εPL), an antimicrobial polymer, for treating extensively drug-resistant and pan-drug-resistant PA.

[Cough frequency monitoring: current technologies and clinical research applications].

Zhonghua Jie He He Hu Xi Za Zhi

September 2025

Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Guangzhou Medical University, National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory He

Cough is a common symptom of many respiratory diseases, and parameters such as frequency, intensity, type and duration play important roles in disease screening, diagnosis and prognosis. Among these, cough frequency is the most widely applied metric. In current clinical practice, cough severity is primarily assessed based on patients' subjective symptom descriptions in combination with semi-structured questionnaires.

Isolated Congenital Middle Ear Malformations: Comparison of preoperative 0.1 mm Ultra-High-Resolution CT and Conventional High-Resolution CT.

AJNR Am J Neuroradiol

September 2025

From the Department of Otorhinolaryngology Head and Neck Surgery (J.G., Y.L., S.G.) and Department of Radiology (N.X., R.T., H.D., Z.Y., Z.W., P.Z.), Beijing Friendship Hospital, Capital Medical University, Beijing, China.

Background And Purpose: Isolated congenital middle ear malformation contributes significantly to congenital hearing loss and growth problems. This study aims to compare 0.1 mm isotropic ultra-high-resolution computed tomography and conventional high-resolution computed tomography for assessing isolated congenital middle ear malformation, using surgical exploration as the gold standard.
