Article Abstract

Objectives: To compare the quality and time efficiency of physician-written summaries with customised large language model (LLM)-generated medical summaries integrated into the electronic health record (EHR) in a non-English clinical environment.

Design: Cross-sectional non-inferiority validation study.

Setting: Tertiary academic hospital.

Participants: 52 physicians from 8 specialties at a large Dutch academic hospital participated, either in writing summaries (n=42) or evaluating them (n=10).

Interventions: Physician writers wrote summaries of 50 patient records. LLM-generated summaries were created for the same records using an EHR-integrated LLM. An independent, blinded panel of physician evaluators compared physician-written summaries to LLM-generated summaries.

Primary And Secondary Outcome Measures: Primary outcome measures were completeness, correctness and conciseness (on a 5-point Likert scale). Secondary outcomes were preference and trust, and time to generate either the physician-written or LLM-generated summary.

Results: The completeness and correctness of LLM-generated summaries did not differ significantly from physician-written summaries. However, LLM summaries were less concise (3.0 vs 3.5, p=0.001). Overall evaluation scores were similar (3.4 vs 3.3, p=0.373), with 57% of evaluators preferring LLM-generated summaries. Trust in both summary types was comparable, and interobserver reliability was excellent (intraclass correlation coefficient 0.975). Physicians took an average of 7 min per summary, while the LLM generated a summary in 15.7 s.

Conclusions: LLM-generated summaries are comparable to physician-written summaries in completeness and correctness, although slightly less concise. With a clear time-saving benefit, LLMs could help reduce clinicians' administrative burden without compromising summary quality.
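
As a rough worked check of the time figures reported in the Results, the sketch below converts the physician average to seconds and computes the ratio between the two. It uses only the rounded averages quoted in the abstract (7 min and 15.7 s) and the 50 patient records; the function and variable names are illustrative, and the total assumes one summary per record with no time spent reviewing the LLM output, which the abstract does not address.

```python
# Rounded averages quoted in the abstract; not the underlying study data.
PHYSICIAN_MINUTES_PER_SUMMARY = 7.0
LLM_SECONDS_PER_SUMMARY = 15.7
N_RECORDS = 50  # patient records summarised in the study


def speedup(physician_minutes: float, llm_seconds: float) -> float:
    """Ratio of physician writing time to LLM generation time."""
    return (physician_minutes * 60.0) / llm_seconds


def minutes_saved(physician_minutes: float, llm_seconds: float, n: int) -> float:
    """Writing time saved over n summaries, ignoring any review time."""
    return n * (physician_minutes - llm_seconds / 60.0)


ratio = speedup(PHYSICIAN_MINUTES_PER_SUMMARY, LLM_SECONDS_PER_SUMMARY)
saved = minutes_saved(
    PHYSICIAN_MINUTES_PER_SUMMARY, LLM_SECONDS_PER_SUMMARY, N_RECORDS
)

print(f"Speed-up: about {ratio:.1f}x")
print(f"Writing time saved over {N_RECORDS} summaries: about {saved:.0f} min")
```

On these figures the LLM is roughly 27 times faster per summary. Any time a physician spends checking the generated text would reduce the net saving.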

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12414186
DOI: http://dx.doi.org/10.1136/bmjopen-2025-099301

Similar Publications

This study explores the use of open-source large language models (LLMs) to automate generation of German discharge summaries from structured clinical data. The structured data used to produce AI-generated summaries were manually extracted from electronic health records (EHRs) by a trained medical professional. By leveraging structured documentation collected for research and quality management, the goal is to assist physicians with editable draft summaries.

View Article and Find Full Text PDF

Background: The rapid development of artificial intelligence (AI) has shown great potential in medical document generation. This study aims to evaluate the performance of Claude 3.5-Sonnet, an advanced AI model, in generating discharge summaries for patients with renal insufficiency, compared to human physicians.

View Article and Find Full Text PDF

Developing and Evaluating Large Language Model-Generated Emergency Medicine Handoff Notes.

JAMA Netw Open

December 2024

Department of Emergency Medicine, NewYork-Presbyterian/Weill Cornell Medicine, New York.

Importance: An emergency medicine (EM) handoff note generated by a large language model (LLM) has the potential to reduce physician documentation burden without compromising the safety of EM-to-inpatient (IP) handoffs.

Objective: To develop LLM-generated EM-to-IP handoff notes and evaluate their accuracy and safety compared with physician-written notes.

Design, Setting, And Participants: This cohort study used EM patient medical records with acute hospital admissions that occurred in 2023 at NewYork-Presbyterian/Weill Cornell Medical Center.

View Article and Find Full Text PDF

Objective: To evaluate the clinical applications and limitations of chat generative pretrained transformer (ChatGPT) in otolaryngology.

Study Design: Cross-sectional survey.

Setting: Tertiary academic center.

View Article and Find Full Text PDF