Introduction: The artificial intelligence language model Chat Generative Pretrained Transformer (ChatGPT) has shown potential as a reliable and accessible educational resource in orthopaedic surgery. Yet, the accuracy of the references behind the provided information remains elusive, which poses a concern for maintaining the integrity of medical content. This study aims to examine the accuracy of the references provided by ChatGPT-4 concerning the Airway, Breathing, Circulation, Disability, Exposure (ABCDE) approach in trauma surgery.
Methods: Two independent reviewers critically assessed 30 ChatGPT-4-generated references supporting the well-established ABCDE approach to trauma protocol, grading them as 0 (nonexistent), 1 (inaccurate), or 2 (accurate). All discrepancies between the ChatGPT-4 and PubMed references were carefully reviewed and bolded. Cohen's Kappa coefficient was used to examine the agreement between reviewers on the accuracy scores of the ChatGPT-4-generated references. Descriptive statistics were used to summarize the mean reference accuracy scores. One-way analysis of variance was used to compare the mean scores across the 5 categories.
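As a rough illustration of the inter-rater agreement statistic named in the Methods, the following is a minimal sketch of unweighted Cohen's Kappa on the 0/1/2 grading scale. The function name `cohens_kappa` and the two short rating lists are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not state whether a weighted variant was used.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's Kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions per grade.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical grade for everything; agreement is total.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical grades (0 = nonexistent, 1 = inaccurate, 2 = accurate).
reviewer_1 = [2, 2, 1, 0, 1, 2]
reviewer_2 = [2, 1, 1, 0, 1, 2]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.3f}")
```

With these toy lists the reviewers agree on 5 of 6 items, and Kappa discounts the portion of that agreement expected by chance from each reviewer's grade distribution.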
Results: ChatGPT-4 had an average reference accuracy score of 66.7%. Of the 30 references, only 43.3% were accurate and deemed "true" while 56.7% were categorized as "false" (43.3% inaccurate and 13.3% nonexistent). The accuracy was consistent across the 5 trauma protocol categories, with no significant statistical difference (p = 0.437).
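The percentages in the Results can be checked with simple arithmetic. The sketch below assumes the counts 13 accurate, 13 inaccurate, and 4 nonexistent out of 30; these counts are inferred from the reported percentages and are not stated in the abstract.

```python
# Counts per grade, inferred from the reported percentages of 30 references
# (an assumption, not figures given in the abstract).
counts = {2: 13, 1: 13, 0: 4}  # 2 = accurate, 1 = inaccurate, 0 = nonexistent
total = sum(counts.values())


def percent(n, total):
    """Return n as a percentage of total."""
    return 100.0 * n / total


accurate_pct = percent(counts[2], total)
inaccurate_pct = percent(counts[1], total)
nonexistent_pct = percent(counts[0], total)
false_pct = percent(counts[1] + counts[0], total)

# Pooled mean grade on the 0-2 scale, also expressed as a share of the maximum.
mean_score = sum(score * n for score, n in counts.items()) / total
mean_score_pct = 100.0 * mean_score / 2

print(f"accurate:    {accurate_pct:.1f}%")
print(f"inaccurate:  {inaccurate_pct:.1f}%")
print(f"nonexistent: {nonexistent_pct:.1f}%")
print(f"false total: {false_pct:.1f}%")
print(f"mean score:  {mean_score:.2f} / 2 ({mean_score_pct:.1f}%)")
```

Under these assumed counts the category percentages match the reported 43.3%, 43.3%, 13.3%, and 56.7%. The pooled mean comes to 1.3 out of 2 (65%), close to but not identical to the reported 66.7%; the abstract does not say how the average was computed (for example, per reviewer or per category), so this sketch should not be read as reproducing the authors' calculation.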
Discussion: With 57% of references being inaccurate or nonexistent, ChatGPT-4 has fallen short in providing reliable and reproducible references, a concerning finding for the safety of using ChatGPT-4 for professional medical decision making without thorough verification. Only if used cautiously, with cross-referencing, can this language model act as an adjunct learning tool that can enhance comprehensiveness as well as knowledge rehearsal and manipulation.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11368215 | PMC |
| http://dx.doi.org/10.2106/JBJS.OA.24.00099 | DOI Listing |
J Alzheimers Dis
September 2025
Paula Costa-Urrutia Medical Affairs, Terumo BCT, Edificio Think MVD, Montevideo, Uruguay.
Background: Therapeutic plasma exchange (TPE) with albumin replacement has emerged as a potential treatment for Alzheimer's disease (AD). The AMBAR trial showed that TPE could slow cognitive and functional decline, along with changes in core and inflammatory biomarkers in cerebrospinal fluid. Objective: To evaluate the safety and effectiveness of TPE in a real-world setting in Argentina.
Cereb Cortex
August 2025
Department of Psychology, University of Milano-Bicocca, Milan, Italy.
Semantic composition allows us to construct complex meanings (e.g., "dog house", "house dog") from simpler constituents ("dog", "house").
Int J Surg
September 2025
The Third Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, Zhejiang, China.
Int J Surg
September 2025
Digestive Endoscopy Center, Shanghai Tenth People's Hospital, Tongji University School of Medicine, Shanghai, China.
Background: Patients with T1 colorectal cancer (CRC) often show poor adherence to guideline-recommended treatment strategies after endoscopic resection. To address this challenge and improve clinical decision-making, this study aims to compare the accuracy of surgical management recommendations between large language models (LLMs) and clinicians.
Methods: This retrospective study enrolled 202 patients with T1 CRC who underwent endoscopic resection at three hospitals.
J Chem Inf Model
September 2025
Songshan Lake Materials Laboratory, Dongguan 523808, PR China.
Large language models (LLMs) have demonstrated transformative potential for materials discovery in condensed matter systems, but their full utility requires both broader application scenarios and integration with ab initio crystal structure prediction (CSP), density functional theory (DFT) methods, and domain knowledge to benefit future inverse material design. Here, we develop an integrated computational framework combining language model-guided materials screening with genetic algorithm (GA) and graph neural network (GNN)-based CSP methods to predict new photovoltaic materials. This LLM + CSP + DFT approach successfully identifies a previously overlooked oxide material with unexpected photovoltaic potential.