Importance: Large language model (LLM) artificial intelligence (AI) systems have shown promise in diagnostic reasoning, but their utility in management reasoning with no clear right answers is unknown.
Objective: To determine whether LLM assistance improves physician performance on open-ended management reasoning tasks compared to conventional resources.
Design: Prospective, randomized controlled trial conducted from 30 November 2023 to 21 April 2024.
Setting: Multi-institutional study from Stanford University, Beth Israel Deaconess Medical Center, and the University of Virginia involving physicians from across the United States.
Participants: 92 practicing attending physicians and residents with training in internal medicine, family medicine, or emergency medicine.
Intervention: Five expert-developed clinical case vignettes were presented with multiple open-ended management questions and scoring rubrics created through a Delphi process. Physicians were randomized to use either GPT-4 via ChatGPT Plus in addition to conventional resources (e.g., UpToDate, Google), or conventional resources alone.
Main Outcomes And Measures: The primary outcome was difference in total score between groups on expert-developed scoring rubrics. Secondary outcomes included domain-specific scores and time spent per case.
Results: Physicians using the LLM scored higher than those using conventional resources (mean difference 6.5%, 95% CI 2.7-10.2, p<0.001). Significant improvements were seen in the management decisions (6.1%, 95% CI 2.5-9.7, p=0.001), diagnostic decisions (12.1%, 95% CI 3.1-21.0, p=0.009), and case-specific (6.2%, 95% CI 2.4-9.9, p=0.002) domains. GPT-4 users spent more time per case (mean difference 119.3 seconds, 95% CI 17.4-221.2, p=0.02). There was no significant difference between GPT-4-augmented physicians and GPT-4 alone (-0.9%, 95% CI -9.0 to 7.2, p=0.8).
Conclusions And Relevance: LLM assistance improved physician management reasoning compared to conventional resources, with particular gains in contextual and patient-specific decision-making. These findings indicate that LLMs can augment management decision-making in complex cases.
Trial Registration: ClinicalTrials.gov Identifier: NCT06208423; https://classic.clinicaltrials.gov/ct2/show/NCT06208423.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11326321 | PMC |
http://dx.doi.org/10.1101/2024.08.05.24311485 | DOI Listing |
Med Trop Sante Int
July 2025
Unité des maladies infectieuses et tropicales et CIC Inserm 1424, Centre hospitalier de Cayenne, Cayenne, Guyane.
Tahiti, or the "myth of Paradise"; Bora Bora, "the Pearl of the Pacific". Who has never wanted to board a plane and land on the heavenly beaches of Polynesia, a French territory at the antipodes of mainland France, lost in the middle of the Pacific? Yet few imagine that 60% of Polynesians live below the metropolitan low-income threshold, or that life expectancy is lower than on the mainland because of the high prevalence of cardiovascular disease, with three quarters of the population overweight. In addition to non-transmissible metabolic diseases, various pathologies common to temperate countries present specificities in Polynesia, which sometimes lead to different management and medical reasoning.
MedEdPublish (2016)
May 2025
Newcastle University Faculty of Medical Sciences, Newcastle upon Tyne, England, UK.
Background: Whilst debriefing literature offers valuable tools for healthcare education, there remains a gap in resources specifically designed for debriefing communication skills. Effective communication is fundamental to patient care, particularly during sensitive interactions. This article provides a specialised toolkit for educators to enhance communication skills debriefing, developed through synthesis of existing literature and the authors' extensive experience teaching communication skills through simulation.
Acta Anaesthesiol Scand
October 2025
Copenhagen Trial Unit, Centre for Clinical Intervention Research, The Capital Region, Copenhagen University Hospital - Rigshospitalet, Copenhagen, Denmark.
Introduction: Electronic health records can be used to create high-quality databases if data are structured and well-registered, which is the case for most perioperative data in the Capital and Zealand Regions of Denmark. We present the purpose and development of the AI and Automation in Anaesthesia (TRIPLE-A) database, a platform designed for epidemiology, prediction, quality control, and automated research data collection.
Methods: Data collection from the electronic medical record (Epic Systems Corporation, WI, USA) was approved by the Capital Region, Denmark, and ethical approval was waived.
IEEE Trans Comput Biol Bioinform
September 2025
Artificial intelligence (AI)-based anticancer drug recommendation systems have emerged as powerful tools for precision dosing. Although existing methods have advanced in predictive accuracy, they face three significant obstacles: the "black-box" problem, which results in unexplainable reasoning; the computational difficulty of graph-based structures; and the combinatorial explosion during multistep reasoning. To tackle these issues, we introduce a novel Macro-Micro agent Drug sensitivity inference (MarMirDrug).
J Forensic Leg Med
September 2025
Pathology and Lab Medicine, AIIMS, Bibinagar, Telangana, India.
The R-I-M-E framework, an acronym for Reporter, Interpreter, Manager, and Educator, is a developmental model designed to evaluate and enhance clinical competence among medical trainees. Introduced by Dr. Louis Pangaro in the 1990s, it has gained widespread acceptance in medical education due to its clarity, structured approach, and applicability across specialties.