Article Abstract

Aim: This study aimed to assess the validity and reliability of AI chatbots, including Bing, ChatGPT 3.5, Google Gemini, and Claude AI, in addressing frequently asked questions (FAQs) related to dental trauma.

Methodology: A set of 30 FAQs was initially formulated by collecting responses from four AI chatbots. A panel comprising expert endodontists and maxillofacial surgeons then refined these to a final selection of 20 questions. Each question was entered into each chatbot three times, generating a total of 240 responses. These responses were evaluated using the Global Quality Score (GQS) on a 5-point Likert scale (5: strongly agree; 4: agree; 3: neutral; 2: disagree; 1: strongly disagree). Any disagreements in scoring were resolved through evidence-based discussions. The validity of the responses was determined by categorizing them as valid or invalid based on two thresholds: a low threshold (scores of ≥ 4 for all three responses) and a high threshold (scores of 5 for all three responses). A chi-squared test was used to compare the validity of the responses between the chatbots. Cronbach's alpha was calculated to assess the reliability by evaluating the consistency of repeated responses from each chatbot.
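The scoring pipeline described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the layout of one row per question and one column per repetition, and all score values are hypothetical and chosen only for illustration. It assumes Cronbach's alpha treats the three repetitions as "items" and the questions as cases, and it computes only the Pearson chi-squared statistic, not a p-value.

```python
from statistics import variance


def is_valid(scores, threshold):
    """A question counts as valid only if every repeated response meets the threshold."""
    return all(score >= threshold for score in scores)


def cronbach_alpha(rows):
    """Cronbach's alpha with repetitions as items (columns) and questions as cases (rows)."""
    k = len(rows[0])  # number of repetitions per question
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical GQS scores for one chatbot: 5 questions x 3 repetitions (not study data).
hypothetical_scores = [
    [5, 5, 5],
    [4, 4, 5],
    [3, 4, 3],
    [5, 4, 5],
    [2, 2, 3],
]

# Low threshold: all three scores >= 4. High threshold: all three scores equal 5.
low_valid = sum(is_valid(row, 4) for row in hypothetical_scores)
high_valid = sum(is_valid(row, 5) for row in hypothetical_scores)
alpha = cronbach_alpha(hypothetical_scores)

# Hypothetical valid/invalid counts for two chatbots (rows) at one threshold.
hypothetical_table = [[3, 2], [1, 4]]
chi2 = chi_squared_statistic(hypothetical_table)

print(low_valid, high_valid, round(alpha, 3), round(chi2, 3))
```

In practice a statistics package would normally supply the chi-squared p-value; the hand-rolled statistic here only shows what is being compared across chatbots.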

Conclusion: The results indicate that the Claude AI chatbot demonstrated superior validity and reliability compared to ChatGPT and Google Gemini, whereas Bing was found to be less reliable. These findings underscore the need for authorities to establish strict guidelines to ensure the accuracy of medical information provided by AI chatbots.

Source
http://dx.doi.org/10.1111/edt.13000

Similar Publications

Background: Food addiction has been increasingly recognized as a contributing factor to obesity and eating disorders. Compulsive eating, characterized by an uncontrollable urge to consume food despite adverse consequences, shares behavioral similarities with substance addiction. This study aims to adapt the Brief Measure of Eating Compulsivity (MEC) into Turkish and evaluate its validity and reliability in the adolescent population.

Background: Sarcomas are rare cancers that comprise a heterogeneous group of tumors, with more than 70 histological subtypes, and they affect both sexes across all age groups. Despite tailored treatments, the high metastatic potential of sarcomas remains a major factor in poor patient survival, as metastasis is often the leading cause of death.

Background: Ecological momentary assessment (EMA) is increasingly being incorporated into intervention studies to acquire a more fine-grained and ecologically valid assessment of change. The added utility of including relatively burdensome EMA measures in a clinical trial hinges on several psychometric assumptions, including that these measures are (1) reliable, (2) related to but not redundant with conventional self-report measures (convergent and discriminant validity), (3) sensitive to intervention-related change, and (4) associated with a clinically relevant criterion of improvement (criterion validity) above conventional self-report measures (incremental validity).

Objective: This study aimed to evaluate the reliability, validity, and sensitivity to change of conventional self-report versus EMA measures of rumination improvement.

Translation, Adaptation, and Validation of the Young Spine Questionnaire for the Italian Children.

Pediatr Phys Ther

September 2025

Department of Medicine and Health Science, University of Trieste, 34100 Trieste, Italy (Dr Policastro and Goos); Institute for Maternal and Child Health IRCCS Burlo Garofolo, 34137 Trieste, Italy (Casalaz and Sartori); Departmental Faculty of Medicine and Surgery, Saint Camillus International Univer

Purpose: Low back and neck pain are increasing worldwide, even in children. However, Italy lacks validated tools for the assessment of children and adolescents with spine disorders. The Young Spine Questionnaire (YSQ) seems to be an appropriate option.

This study demonstrates the successful fabrication of nanostructured Langmuir-Blodgett (LB) films combining the conjugated copolymer poly(9,9-dioctylfluorene-co-3,4-ethylenedioxythiophene) (PDOF-co-PEDOT) with spherical and triangular silver nanoparticles (AgNP). The LB technique allowed precise control over the molecular arrangement and distribution of the nanoparticles at the air-water interface, resulting in compact, reproducible and structurally ordered nanocomposite films. The structural and morphological properties of the interfacial monolayers and LB films were investigated using surface pressure-area isotherms, Brewster angle microscopy, polarization modulation infrared reflection-absorption spectroscopy (PM-IRRAS) and quartz crystal microbalance.
