Publications by Michael A Pfeffer

Importance: Limited qualitative studies exist evaluating ambient artificial intelligence (AI) scribe tools. Such studies can provide deeper insights into ambient AI implementations by capturing lived experiences.

Objective: To evaluate physician perspectives on ambient AI scribes.

Clinical trials informed framework for real world clinical implementation and deployment of artificial intelligence applications.

Jacqueline G You, Tina Hernandez-Boussard, Michael A Pfeffer, Adam Landman, Rebecca G Mishuris

NPJ Digit Med

February 2025

With rapidly evolving artificial intelligence solutions, healthcare organizations need an implementation roadmap. A "clinical trials" informed approach can promote safe and impactful implementation of artificial intelligence. This framework includes four phases: (1) Safety; (2) Efficacy; (3) Effectiveness and comparison to an existing standard; and (4) Monitoring.

Development of secure infrastructure for advancing generative artificial intelligence research in healthcare at an academic medical center.

Madelena Y Ng, Jarrod Helzer, Michael A Pfeffer, Tina Seto, Tina Hernandez-Boussard

J Am Med Inform Assoc

March 2025

Background: Generative AI, particularly large language models (LLMs), holds great potential for improving patient care and operational efficiency in healthcare. However, the use of LLMs is complicated by regulatory concerns around data security and patient privacy. This study aimed to develop and evaluate a secure infrastructure that allows researchers to safely leverage LLMs in healthcare while ensuring HIPAA compliance and promoting equitable AI.

Ambient artificial intelligence scribes: utilization and impact on documentation time.

Stephen P Ma, April S Liang, Shreya J Shah, Margaret Smith, Yejin Jeong, Michael A Pfeffer

J Am Med Inform Assoc

February 2025

Objectives: To quantify utilization and impact on documentation time of a large language model-powered ambient artificial intelligence (AI) scribe.

Materials and Methods: This prospective quality improvement study was conducted at a large academic medical center with 45 physicians from 8 ambulatory disciplines over 3 months. Utilization and documentation times were derived from electronic health record (EHR) use measures.

Ambient artificial intelligence scribes: physician burnout and perspectives on usability and documentation burden.

Shreya J Shah, Anna Devon-Sand, Stephen P Ma, Yejin Jeong, Trevor Crowell, Michael A Pfeffer

J Am Med Inform Assoc

February 2025

Objective: This study evaluates the pilot implementation of ambient AI scribe technology to assess physician perspectives on usability and the impact on physician burden and burnout.

Materials and Methods: This prospective quality improvement study was conducted at Stanford Health Care with 48 physicians over a 3-month period. Outcome measures included burden, burnout, usability, and perceived time savings.

Individual and community socioeconomic status and receipt of influenza vaccines among adult primary care patients in a large academic health system: 2017-2019.

Sae Takada, Un Young Chung, Philippe Bourgois, O Kenrik Duru, Lillian Gelberg, Michael A Pfeffer

Heliyon

December 2024

Introduction: Influenza causes significant mortality and morbidity in the U.S., yet less than half of adults receive influenza vaccination.

Perspectives on Artificial Intelligence-Generated Responses to Patient Messages.

Jiyeong Kim, Michael L Chen, Shawheen J Rezaei, April S Liang, Susan M Seav, Michael A Pfeffer

JAMA Netw Open

October 2024

Testing and Evaluation of Health Care Applications of Large Language Models: A Systematic Review.

Suhana Bedi, Yutong Liu, Lucy Orr-Ewing, Dev Dash, Sanmi Koyejo, Michael A Pfeffer

JAMA

January 2025

Importance: Large language models (LLMs) can assist in various health care activities, but current evaluation approaches may not adequately identify the most useful application areas.

Objective: To summarize existing evaluations of LLMs in health care in terms of 5 components: (1) evaluation data type, (2) health care task, (3) natural language processing (NLP) and natural language understanding (NLU) tasks, (4) dimension of evaluation, and (5) medical specialty.

Data Sources: A systematic search of PubMed and Web of Science was performed for studies published between January 1, 2022, and February 19, 2024.

Development of Secure Infrastructure for Advancing Generative AI Research in Healthcare at an Academic Medical Center.

Madelena Y Ng, Jarrod Helzer, Michael A Pfeffer, Tina Seto, Tina Hernandez-Boussard

Res Sq

September 2024

The increasing interest in leveraging generative AI models in healthcare necessitates secure infrastructure at academic medical centers. Without an all-encompassing secure system, researchers may create their own insecure microprocesses, risking the exposure of protected health information (PHI) to the public internet or its inadvertent incorporation into AI model training. To address these challenges, our institution implemented a secure pathway to the Azure OpenAI Service using our own private OpenAI instance which we fully control to facilitate high-throughput, secure LLM queries.

The Need for Continuous Evaluation of Artificial Intelligence Prediction Algorithms.

Nigam H Shah, Michael A Pfeffer, Marzyeh Ghassemi

JAMA Netw Open

September 2024

Artificial Intelligence-Generated Draft Replies to Patient Inbox Messages.

Patricia Garcia, Stephen P Ma, Shreya Shah, Margaret Smith, Yejin Jeong, Michael A Pfeffer

JAMA Netw Open

March 2024

Importance: The emergence and promise of generative artificial intelligence (AI) represent a turning point for health care. Rigorous evaluation of generative AI deployment in clinical practice is needed to inform strategic decision-making.

Objective: To evaluate the implementation of a large language model used to draft responses to patient messages in the electronic inbox.

Factor XI Inhibition for the Prevention of Catheter-Associated Thrombosis in Patients With Cancer Undergoing Central Line Placement: A Phase 2 Clinical Trial.

Michael A Pfeffer, Tia C L Kohs, Helen H Vu, Kelley R Jordan, Jenny Si Han Wang

Arterioscler Thromb Vasc Biol

January 2024

Background: Despite the ubiquitous utilization of central venous catheters in clinical practice, their use commonly provokes thromboembolism. No prophylactic strategy has shown sufficient efficacy to justify routine use. Coagulation factors FXI (factor XI) and FXII (factor XII) represent novel targets for device-associated thrombosis, which may mitigate bleeding risk.

Balancing Innovation and Cybersecurity in Medical Schools and Their Related Academic Health Systems.

Michael Halaas, Michael A Pfeffer, Laura Weiss Roberts

Acad Med

November 2023

Evaluating the predictive ability of natural language processing in identifying tertiary/quaternary cases in prioritization workflows for interhospital transfer.

Timothy Lee, Paul J Lukac, Sitaram Vangala, Kamran Kowsari, Vu Vu, Michael A Pfeffer

JAMIA Open

October 2023

Objectives: Tertiary and quaternary (TQ) care refers to complex cases requiring highly specialized health services. Our study aimed to compare the ability of a natural language processing (NLP) model to an existing human workflow in predictively identifying TQ cases for transfer requests to an academic health center.

Materials and Methods: Data on interhospital transfers were queried from the electronic health record for the 6-month period from July 1, 2020 to December 31, 2020.

Creation and Adoption of Large Language Models in Medicine.

Nigam H Shah, David Entwistle, Michael A Pfeffer

JAMA

September 2023

Importance: There is increased interest in and potential benefits from using large language models (LLMs) in medicine. However, by simply wondering how the LLMs and the applications powered by them will reshape medicine instead of getting actively involved, the agency in shaping how these tools can be used in medicine is lost.

Observations: Applications powered by LLMs are increasingly used to perform medical tasks without the underlying language model being trained on medical records and without verifying their purported benefit in performing those tasks.

The Stanford Medicine data science ecosystem for clinical and translational research.

Alison Callahan, Euan Ashley, Somalee Datta, Priyamvada Desai, Todd A Ferris, Michael A Pfeffer

JAMIA Open

October 2023

Objective: To describe the infrastructure, tools, and services developed at Stanford Medicine to maintain its data science ecosystem and research patient data repository for clinical and translational research.

Materials and Methods: The data science ecosystem, dubbed the Stanford Data Science Resources (SDSR), includes infrastructure and tools to create, search, retrieve, and analyze patient data, as well as services for data deidentification, linkage, and processing to extract high-value information from healthcare IT systems. Data are made available via self-service and concierge access, on HIPAA-compliant secure computing infrastructure supported by in-depth user training.

The shaky foundations of large language models and foundation models for electronic health records.

Michael Wornow, Yizhe Xu, Rahul Thapa, Birju Patel, Ethan Steinberg, Michael A Pfeffer

NPJ Digit Med

July 2023

The success of foundation models such as ChatGPT and AlphaFold has spurred significant interest in building similar models for electronic medical records (EMRs) to improve patient care and hospital operations. However, recent hype has obscured critical gaps in our understanding of these models' capabilities. In this narrative review, we examine 84 foundation models trained on non-imaging EMR data (i.

Psychological toxicity in classical hematology.

Michael A Pfeffer, Kylee Martens, Thomas Kartika, Hannah McMurry, Sven Olson

Eur J Haematol

October 2023

Although considered "benign," mild blood count abnormalities, genetic factors imparting inconsequential thrombotic risk, and low-risk premalignant blood disorders can have significant psychological and financial impact on our patients. Several studies have demonstrated that patients with noncancerous conditions have increased levels of anxiety with distress similar to those with malignancy. Additionally, referral to a classical hematologist can be a daunting process for many patients due to uncertainties surrounding the reason for referral or misconstrued beliefs in a cancer diagnosis ascribed to the pairing of oncology and hematology in medical practice.

Considerations in the reliability and fairness audits of predictive models for advance care planning.

Jonathan Lu, Amelia Sattler, Samantha Wang, Ali Raza Khaki, Alison Callahan, Michael A Pfeffer

Front Digit Health

September 2022

Multiple reporting guidelines for artificial intelligence (AI) models in healthcare recommend that models be audited for reliability and fairness. However, there is a gap in operational guidance for performing reliability and fairness audits in practice. Following guideline recommendations, we conducted a reliability audit of two models based on model performance and calibration as well as a fairness audit based on summary statistics, subgroup performance and subgroup calibration.

Assessment of Adherence to Reporting Guidelines by Commonly Used Clinical Prediction Models From a Single Vendor: A Systematic Review.

Jonathan H Lu, Alison Callahan, Birju S Patel, Keith E Morse, Dev Dash, Michael A Pfeffer

JAMA Netw Open

August 2022

Importance: Various model reporting guidelines have been proposed to ensure clinical prediction models are reliable and fair. However, no consensus exists about which model details are essential to report, and commonalities and differences among reporting guidelines have not been characterized. Furthermore, how well documentation of deployed models adheres to these guidelines has not been studied.

Lower Severe Acute Respiratory Syndrome Coronavirus 2 Viral Shedding Following Coronavirus Disease 2019 Vaccination Among Healthcare Workers in Los Angeles, California.

Paul C Adamson, Michael A Pfeffer, Valerie A Arboleda, Omai B Garner, Annabelle de St Maurice

Open Forum Infect Dis

November 2021

Among 880 healthcare workers with a positive severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) test, 264 (30.0%) infections were identified following receipt of at least 1 vaccine dose. Median SARS-CoV-2 cycle threshold values were highest among individuals receiving 2 vaccine doses, corresponding to lower viral shedding.

SARS-CoV-2 Infection after Vaccination in Health Care Workers in California.

Jocelyn Keehner, Lucy E Horton, Michael A Pfeffer, Christopher A Longhurst, Robert T Schooley

N Engl J Med

May 2021
