Background: The health care sector faces a projected shortfall of 10 million workers by 2030. Artificial intelligence (AI) automation in areas such as patient education and initial therapy screening presents a strategic response to mitigate this shortage and reallocate medical staff to higher-priority tasks. However, current methods for evaluating early-stage health care AI chatbots are highly limited because of safety concerns and the time and effort that evaluation requires.
Objective: This study introduces a novel 3-bot method for efficiently testing and validating early-stage AI health care provider chatbots. To extensively test AI provider chatbots without involving real patients or researchers, various AI patient bots and an evaluator bot were developed.
Methods: Provider bots interacted with AI patient bots embodying frustrated, anxious, or depressed personas. An evaluator bot reviewed interaction transcripts based on specific criteria. Human experts then reviewed each interaction transcript, and the evaluator bot's results were compared to human evaluation results to ensure accuracy.
Results: The patient-education bot's evaluations by the AI evaluator and the human evaluator were nearly identical, with minimal variance, limiting the opportunity for further analysis. The screening bot's evaluations also yielded similar results between the AI evaluator and human evaluator. Statistical analysis confirmed the reliability and accuracy of the AI evaluations.
Conclusions: This evaluation method provides a safe, adaptable, and effective means to test and refine early versions of health care provider chatbots without risking patient safety or investing excessive researcher time and effort. Our patient-education evaluator bots could have benefited from a larger set of evaluation criteria: the AI and human evaluators produced extremely similar results, which may reflect the small number of criteria used. The amount of prompting we could give each bot was limited by the practical consideration that response time increases with prompt length. In the future, techniques such as retrieval-augmented generation will allow the system to draw on more information and become more specific and accurate in evaluating the chatbots. This evaluation method will allow for rapid testing and validation of health care chatbots to automate basic medical tasks, freeing providers to address more complex tasks.
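The 3-bot workflow described in the Methods can be illustrated with a minimal sketch. It is not the study's implementation: the bots here are hard-coded stub functions standing in for language-model calls, the criterion names (`acknowledges_emotion`, `stays_on_topic`) are hypothetical, and simple percent agreement is an assumed stand-in for the statistical analysis, which the abstract does not specify.

```python
from typing import Callable, Dict, List

# A bot is modeled as a function from the transcript so far to its next message.
# In a real system each bot would wrap a language-model call (assumption).
Bot = Callable[[List[str]], str]


def run_dialogue(provider: Bot, patient: Bot, turns: int) -> List[str]:
    """Alternate patient and provider messages for a fixed number of turns."""
    transcript: List[str] = []
    for _ in range(turns):
        transcript.append("PATIENT: " + patient(transcript))
        transcript.append("PROVIDER: " + provider(transcript))
    return transcript


def percent_agreement(ai: Dict[str, bool], human: Dict[str, bool]) -> float:
    """Fraction of shared criteria on which the AI and human ratings match."""
    shared = sorted(set(ai) & set(human))
    if not shared:
        raise ValueError("no criteria in common")
    return sum(ai[c] == human[c] for c in shared) / len(shared)


# Hypothetical stubs: an anxious patient persona, a provider bot, an evaluator bot.
def anxious_patient(transcript: List[str]) -> str:
    return "I'm worried about my results."


def provider_stub(transcript: List[str]) -> str:
    return "I understand. Let me explain what they mean."


def evaluator_stub(transcript: List[str]) -> Dict[str, bool]:
    text = "\n".join(transcript)
    return {
        "acknowledges_emotion": "understand" in text,  # hypothetical criterion
        "stays_on_topic": True,                        # hypothetical criterion
    }


transcript = run_dialogue(provider_stub, anxious_patient, turns=2)
ai_scores = evaluator_stub(transcript)
# Human ratings would come from expert review of the same transcript.
human_scores = {"acknowledges_emotion": True, "stays_on_topic": True}
agreement = percent_agreement(ai_scores, human_scores)
```

Keeping each bot behind the same function signature means a persona (frustrated, anxious, or depressed) can be swapped without changing the dialogue loop or the evaluator.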
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11884306
DOI: http://dx.doi.org/10.2196/63058
J Appl Clin Med Phys
September 2025
Clinical Imaging Physics Group, Duke University Health System, Durham, North Carolina, USA.
Introduction: Medical physicists play a critical role in ensuring image quality and patient safety, but their routine evaluations are limited in scope and frequency compared to the breadth of clinical imaging practices. An electronic radiologist feedback system can augment medical physics oversight for quality improvement. This work presents a novel quality feedback system integrated into the Epic electronic medical record (EMR) at a university hospital system, designed to facilitate feedback from radiologists to medical physicists and technologist leaders.
J Intensive Care
September 2025
German Center for Vertigo and Balance Disorders, Ludwig-Maximilians-Universitat (LMU), University Hospital Grosshadern, Munich, Germany.
Background: Survivors of critical illness frequently face physical, cognitive and psychological impairments after intensive care. Sensorimotor impairments potentially have a negative impact on participation. However, comprehensive understanding of sensorimotor recovery and participation in survivors of critical illness is limited.
Genome Biol
September 2025
Department of Clinical Pharmacy, Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA, 90089, USA.
Background: Recent advances in high-throughput sequencing technologies have enabled the collection and sharing of a massive amount of omics data, along with its associated metadata: descriptive information that contextualizes the data, including phenotypic traits and experimental design. Enhancing metadata availability is critical to ensure data reusability and reproducibility and to facilitate novel biomedical discoveries through effective data reuse. Yet, incomplete metadata accompanying public omics data may hinder reproducibility and reusability and limit secondary analyses.
BMC Vet Res
September 2025
Department of Poultry Production, Faculty of Agriculture, Fayoum University, Fayoum, 63514, Egypt.
This study investigated the impact of dietary zeolite supplementation on growth, cecal microbiota and digesta viscosity, digestive enzymes, carcass traits, blood constituents, and antioxidant parameters of broilers. A completely randomized design was used with 240 one-day-old broiler chicks randomly assigned to three dietary treatments (0%, 1.5%, and 3% zeolite as a feed additive) with four replicates of 20 chicks each.