Evaluating large language models in echocardiography reporting: opportunities and challenges.

Citations

20

Article Abstract

Aims: The increasing need for diagnostic echocardiography tests presents challenges in preserving the quality and promptness of reports. While Large Language Models (LLMs) have proven effective in summarizing clinical texts, their application in echo remains underexplored.

Methods and Results: Adult echocardiography studies, conducted at the Mayo Clinic from 1 January 2017 to 31 December 2017, were categorized into two groups: development (all Mayo locations except Arizona) and Arizona validation sets. We adapted open-source LLMs (Llama-2, MedAlpaca, Zephyr, and Flan-T5) using In-Context Learning and Quantized Low-Rank Adaptation fine-tuning (FT) for echo report summarization from 'Findings' to 'Impressions.' Against cardiologist-generated Impressions, the models' performance was assessed both quantitatively with automatic metrics and qualitatively by cardiologists. The development dataset included 97 506 reports from 71 717 unique patients, predominantly male (55.4%), with an average age of 64.3 ± 15.8 years. EchoGPT, a fine-tuned Llama-2 model, outperformed other models with win rates ranging from 87% to 99% across various automatic metrics, and produced reports comparable to cardiologists in qualitative review (significantly preferred in conciseness, P < 0.001, with no significant preference in completeness, correctness, and clinical utility). Correlations between automatic and human metrics were fair to modest, with the best being RadGraph F1 scores vs. clinical utility (correlation coefficient = 0.42), and automatic metrics showed insensitivity (0-5% drop) to changes in measurement numbers.

Conclusion: EchoGPT can generate draft reports for human review and approval, helping to streamline the workflow. However, scalable evaluation approaches dedicated to echo reports remain necessary.
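
The win rates quoted in the abstract can be read as a pairwise comparison of per-report metric scores. The sketch below is only an illustration under stated assumptions: the abstract does not define how win rate was computed, so the strict "higher score wins" rule, the handling of ties as non-wins, the function name `win_rate`, and the example scores are all hypothetical and not taken from the study.

```python
from typing import Sequence


def win_rate(scores_a: Sequence[float], scores_b: Sequence[float]) -> float:
    """Fraction of paired reports where model A scores strictly higher than model B.

    Assumption: ties count as non-wins. The study may use a different rule.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be paired per report")
    if not scores_a:
        raise ValueError("need at least one paired score")
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    return wins / len(scores_a)


# Hypothetical per-report metric scores, invented purely for illustration.
model_a = [0.62, 0.71, 0.55, 0.80]
model_b = [0.58, 0.65, 0.60, 0.74]

print(win_rate(model_a, model_b))  # 0.75 (A wins on 3 of 4 reports)
```

A rate computed this way depends on the chosen metric, which is consistent with the abstract reporting a range (87% to 99%) rather than a single figure.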

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12088711
DOI: http://dx.doi.org/10.1093/ehjdh/ztae086

Publication Analysis

Top Keywords

automatic metrics: 12
large language: 8
language models: 8
clinical utility: 8
reports: 5
evaluating large: 4
models echocardiography: 4
echocardiography reporting: 4
reporting opportunities: 4
opportunities challenges: 4
