Article Abstract

Background: Generative artificial intelligence (AI) systems are increasingly deployed in clinical pharmacy; yet, systematic evaluation of their efficacy, limitations, and risks across diverse practice scenarios remains limited.

Objective: This study aims to quantitatively evaluate and compare the performance of 8 mainstream generative AI systems across 4 core clinical pharmacy scenarios (medication consultation, medication education, prescription review, and case analysis with pharmaceutical care) using a multidimensional framework.

Methods: Forty-eight clinically validated questions were selected via stratified sampling from real-world sources (eg, hospital consultations, clinical case banks, and national pharmacist training databases). Three researchers simultaneously tested 8 generative AI systems (ERNIE Bot, Doubao, Kimi, Qwen, GPT-4o, Gemini-1.5-Pro, Claude-3.5-Sonnet, and DeepSeek-R1) using standardized prompts within a single day (February 20, 2025). A double-blind scoring design was used, with 6 experienced clinical pharmacists (≥5 years of experience) evaluating the AI responses across 6 dimensions: accuracy, rigor, applicability, logical coherence, conciseness, and universality. Each dimension was scored from 0 to 10 against predefined criteria (eg, a 3-point deduction for inaccuracy and a 2-point deduction for incomplete rigor). Statistical analysis used one-way ANOVA with Tukey Honestly Significant Difference (HSD) post hoc testing, and intraclass correlation coefficients (ICC; 2-way random model) for interrater reliability. Qualitative thematic analysis identified recurrent errors and limitations.
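The interrater reliability measure named in the Methods can be illustrated with a short sketch. The abstract specifies only a "2-way random model"; the code below assumes the single-rater, absolute-agreement form often written ICC(2,1), which may differ from the variant the authors computed. The function name `icc_2_1` and the score matrix are hypothetical and are not study data.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one row per rated response, with one
    score per rater in each row (all rows the same length).
    """
    n = len(ratings)      # number of rated responses
    k = len(ratings[0])   # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Partition the total sum of squares into responses, raters, and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Made-up scores (not from the study): 6 responses rated by 4 raters.
example_scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example_scores), 2))
```

Because this form penalizes systematic differences between raters as well as random disagreement, a rater who scores consistently lower than the others reduces the coefficient even when the rank order of responses is shared.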

Results: DeepSeek-R1 (DeepSeek) achieved the highest overall performance (mean composite score: medication consultation 9.4, SD 1.0; case analysis 9.3, SD 1.0), significantly outperforming the other systems in complex tasks (P<.05). Critical limitations were observed across models. High-risk decision errors were common: 75% omitted critical contraindications (eg, ethambutol in optic neuritis). Localization was lacking: 90% erroneously recommended macrolides for drug-resistant Mycoplasma pneumoniae in China's high-resistance setting, while only DeepSeek-R1 aligned with updated American Academy of Pediatrics (AAP) guidelines for pediatric doxycycline. Complex reasoning deficits were also evident: only Claude-3.5-Sonnet detected a gender-diagnosis contradiction (prostatic hyperplasia in a female patient), and no model identified diazepam's 7-day prescription limit. Interrater consistency was lowest for conciseness in case analysis (ICC=0.70), reflecting evaluator disagreement on complex outputs. ERNIE Bot (Baidu) consistently underperformed (case analysis: 6.8, SD 1.5; P<.001 vs DeepSeek-R1).
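The between-model comparisons above rest on one-way ANOVA. As a minimal sketch of that test, the code below computes the F statistic and its degrees of freedom from per-model score lists using only the Python standard library. The function name `one_way_anova_f` and the scores are hypothetical illustrations, not study data. Obtaining a P value or Tukey HSD comparisons additionally requires the F and studentized range distributions, for which a statistics package such as SciPy (`scipy.stats.f_oneway`, `scipy.stats.tukey_hsd`) would typically be used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of score lists, one list per model.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up scores (not from the study) for three hypothetical models.
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f_stat, df_b, df_w)
```

A large F indicates that the spread between model means is large relative to the spread of scores within each model; the post hoc step then identifies which specific pairs of models differ.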

Conclusions: While generative AI shows promise as a pharmacist assistance tool, significant limitations, including high-risk errors (eg, contraindication omissions), inadequate localization, and complex reasoning gaps, preclude autonomous clinical decision-making. Performance stratification highlights DeepSeek-R1's current advantage, but all systems require optimization in dynamic knowledge updating, complex scenario reasoning, and output interpretability. Future deployment must prioritize human oversight (human-AI co-review), ethical safeguards, and continuous evaluation frameworks.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12288765
DOI: http://dx.doi.org/10.2196/76128

Publication Analysis

Top Keywords

clinical pharmacy: 12
generative artificial: 8
artificial intelligence: 8
intelligence systems: 8
generative systems: 8
case analysis: 8
clinical: 5
comparative analysis: 4
generative: 4
analysis generative: 4

Similar Publications

Background: Amrubicin monotherapy has been used in Japan for patients with refractory, relapsed, small cell lung cancer (SCLC). However, the clinical guidelines do not specify a recommended initial dose for elderly patients. This retrospective study aimed to explore the appropriate initial dose of amrubicin for elderly patients with refractory, relapsed SCLC.


The miniaturization of separation platforms marks a transformative shift in analytical science, merging microfabrication, automation, and intelligent data integration to meet rising demands for portability, sustainability, and precision. This review critically synthesizes recent technological advances reshaping the field, from microinjection and preconcentration modules to compact, high-sensitivity detection systems including ultraviolet-visible (UV/Vis), fluorescence (FL), electrochemical detection (ECD), and mass spectrometry (MS). The integration of microcontrollers, AI-enhanced calibration routines, and IoT-enabled feedback loops has led to the rise of self-regulating analytical devices capable of real-time decision-making and autonomous operation.


Canine Mdr1 Knockout MDCK Cells Reliably Estimate Human Small Intestinal Permeability and Fraction Absorbed.

Mol Pharm

September 2025

Johnson & Johnson, Translational PK/PD & Investigational Toxicology, Spring House, Pennsylvania 19002, United States.

Human intestinal permeability is a key determinant of the oral fraction absorbed of active pharmaceutical ingredients (APIs). This study evaluated the ability of an in-house canine Mdr1 (cMdr1) knockout (KO) Madin-Darby Canine Kidney (MDCK) cell line to correlate apparent permeability with human small intestinal permeability. Apparent permeability values of 16 reference compounds with high, medium, or low permeabilities were measured in the in-house cMdr1 KO MDCK protocol under pH gradient (6.


Background & Aims: Pregnancy can be a complex and risk-filled event for women with inflammatory bowel disease (IBD). High-quality studies in this population are lacking, with limited data on medications approved to treat IBD during pregnancy. For patients, limited knowledge surrounding pregnancy impacts pregnancy rates, medication adherence, and outcomes.


To optimize the deployment of Generative Artificial Intelligence in health care, it's essential for health care professionals (HCPs) to understand these technologies' capabilities and constraints. This study explores HCPs' initial impressions and experiences using ChatGPT, a Generative Pre-trained Transformer, in Pediatric Critical Care Units (PICUs). By conducting focus groups with a diverse set of HCPs, we aimed to assess their awareness, utilization, perceived benefits, and concerns about incorporating ChatGPT into their PICUs.
