Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Artificial intelligence (AI) applications to medical care are currently under investigation. We aimed to evaluate and compare the quality and accuracy of physician and chatbot responses to common clinical questions in gynecologic oncology. In this cross-sectional pilot study, ten questions about the knowledge and management of gynecologic cancers were selected. Each question was answered by a recruited gynecologic oncologist, the ChatGPT (Generative Pre-trained Transformer) AI platform, and the Bard by Google AI platform. Five recruited gynecologic oncologists who were blinded to the study design were allowed 15 min to respond to each of two questions. Chatbot responses were generated by inserting the question into a fresh session in September 2023. Qualifiers and language identifying the response source were removed. Three gynecologic oncology providers who were blinded to the response source independently reviewed and rated response quality using a 5-point Likert scale, evaluated each response for accuracy, and selected the best response for each question. Overall, physician responses were judged to be best in 76.7 % of evaluations versus ChatGPT (10.0 %) and Bard (13.3 %; p < 0.001). The average quality of responses was 4.2/5.0 for physicians, 3.0/5.0 for ChatGPT, and 2.8/5.0 for Bard (t-test p < 0.001 for both comparisons; ANOVA p < 0.001). Physicians provided a higher proportion of accurate responses (86.7 %) compared to ChatGPT (60 %) and Bard (43 %; p < 0.001 for both). Physicians provided higher quality responses to gynecologic oncology clinical questions compared to chatbots. Patients should be cautioned against relying on non-validated AI platforms for medical advice; larger studies on the use of AI for medical advice are needed.
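The abstract's quality comparison rests on a one-way ANOVA across the three response sources' Likert ratings. A minimal sketch of that test is below, using hypothetical rating arrays chosen only to match the reported group means (4.2, 3.0, 2.8) — these are NOT the study's data, and `scipy` is assumed to be available.

```python
from scipy.stats import f_oneway

# Hypothetical 5-point Likert ratings (NOT the study's actual data),
# arranged so each source's mean matches the abstract's reported value:
# 10 questions rated by 3 blinded reviewers = 30 evaluations per source.
physician = [5, 4, 4, 5, 4, 4, 5, 4, 4, 3] * 3  # mean 4.2
chatgpt   = [3, 3, 2, 4, 3, 3, 2, 3, 4, 3] * 3  # mean 3.0
bard      = [3, 2, 3, 2, 3, 3, 2, 3, 3, 4] * 3  # mean 2.8

# One-way ANOVA across the three sources, as reported in the abstract.
f_stat, p_value = f_oneway(physician, chatgpt, bard)
print(f"F = {f_stat:.2f}, p = {p_value:.4g}")
```

With group differences this large relative to within-group spread, the test yields p well below 0.001, consistent with the abstract's reported significance level.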


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11367046 (PMC)
http://dx.doi.org/10.1016/j.gore.2024.101477 (DOI Listing)

Publication Analysis

Top Keywords

chatbot responses: 12
gynecologic oncology: 12
pilot study: 8
quality accuracy: 8
accuracy physician: 8
physician chatbot: 8
clinical questions: 8
questions gynecologic: 8
recruited gynecologic: 8
response source: 8

Similar Publications

Background: The use of artificial intelligence platforms by medical residents as an educational resource is increasing. Within orthopaedic surgery, older Chat Generative Pre-trained Transformer (ChatGPT) models performed worse than resident physicians on practice examinations and rarely answered questions with images correctly. The newer ChatGPT-4o was designed to improve these deficiencies but has not been evaluated.


Background: Herein, we report on the initial development, progress, and future plans for an autonomous artificial intelligence (AI) system designed to manage major depressive disorder (MDD). The system is a web-based, patient-facing conversational AI that collects medical history, provides presumed diagnosis, recommends treatment, and coordinates care for patients with MDD.

Methods: The system includes seven components, five of which are complete and two are in development.


Hypoxia has been extensively studied as a stressor that pushes human bodily systems toward responses and adaptations. Nevertheless, little evidence exists on constituent trains of motor unit action potentials, despite recent advancements that allow decomposition of surface electromyographic signals. This study aimed to investigate motor unit properties using noninvasive approaches during maximal isometric exercise in normobaric hypoxia.


[The Dawn of Generative AI in Medicine: Empathy Through Emulation].

Dtsch Med Wochenschr

September 2025

Klinik für Kardiologie, Angiologie und Pneumologie, Institut für Cardiomyopathien Heidelberg, Universitätsklinikum Heidelberg, Heidelberg, Deutschland.

Rapid advancements in Artificial Intelligence (AI) have significantly impacted multiple sectors of our society, including healthcare. While conventional AI has been instrumental in solving mainly image recognition tasks, thereby aiding in well-defined situations such as supporting diagnostic imaging, the emergence of generative AI is affecting one of the main professional competences: doctor-patient interaction. A convergence of natural language processing (NLP) and generative AI is exemplified by intelligent chatbots like ChatGPT.


Background: Recent studies suggest that large language models (LLMs) such as ChatGPT are useful tools for medical students or residents when preparing for examinations. These studies, especially those conducted with multiple-choice questions, emphasize that the level of knowledge and response consistency of the LLMs are generally acceptable; however, further optimization is needed in areas such as case discussion, interpretation, and language proficiency. Therefore, this study aimed to evaluate the performance of six distinct LLMs for Turkish and English neurosurgery multiple-choice questions and assess their accuracy and consistency in a specialized medical context.
