Publications by authors named "Danielle S Bitterman"

Background: Master clinical trial protocol structures offer administrative, procedural, and statistical advantages but have not been applied in assessing new radiotherapy devices. Herein, we report on a pooled analysis from a first-of-its-kind master trial evaluating stereotactic MRI-guided adaptive radiotherapy (SMART).

Methods: Subjects were enrolled on a prospective master protocol evaluating SMART for multiple oncologic indications.


Chronological age, although commonly used in clinical practice, fails to capture individual variations in rates of ageing and physiological decline. Recent advances in artificial intelligence (AI) have transformed the estimation of biological age using various imaging techniques. This Review consolidates AI developments in age prediction across brain, chest, abdominal, bone, and facial imaging using diverse methods, including MRI, CT, x-ray, and photographs.


Introduction: Patients receiving thoracic radiotherapy (RT) have an increased risk of major adverse cardiac events (MACE) posttreatment. We utilized machine learning (ML) to discover novel predictors of MACE and validated them on an external cohort.

Methods: This multi-institutional retrospective study included 984 patients [n = 803 non-small cell lung cancer (NSCLC), n = 181 breast cancer] treated with radiotherapy.
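As a rough illustration of the kind of workflow this describes (not the study's actual code), the sketch below trains a gradient-boosted classifier on one cohort, ranks candidate MACE predictors by feature importance, and checks discrimination on an external cohort; all file and column names are invented.

    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import roc_auc_score

    # Hypothetical feature columns and files; the real study's variables differ
    FEATURES = ["age", "mean_heart_dose", "baseline_cad", "smoking_status"]
    train = pd.read_csv("internal_cohort.csv")
    external = pd.read_csv("external_cohort.csv")

    model = GradientBoostingClassifier(random_state=0)
    model.fit(train[FEATURES], train["mace"])

    # Candidate predictors, ranked by how much the fitted model relies on them
    ranked = sorted(zip(FEATURES, model.feature_importances_), key=lambda p: -p[1])
    print(ranked)

    # External validation of discrimination
    auc = roc_auc_score(external["mace"], model.predict_proba(external[FEATURES])[:, 1])
    print(f"External AUC: {auc:.2f}")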


Purpose: To evaluate the performance and consistency of large language models (LLMs) across brand and generic oncology drug names in various clinical tasks, addressing concerns about potential fluctuations in LLM performance because of subtle phrasing differences that could affect patient care.

Methods: This study evaluated three LLMs (GPT-3.5-turbo-0125, GPT-4-turbo, and GPT-4o) using drug names from the HemOnc ontology.
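A minimal sketch of how such a brand-versus-generic consistency check could be run, assuming the OpenAI Python client; the prompt template and drug pair are invented (the study drew its names from HemOnc):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    PAIRS = [("imatinib", "Gleevec")]  # illustrative generic/brand pair
    TEMPLATE = "A patient is taking {drug}. List the most common severe adverse events of this drug."

    for generic, brand in PAIRS:
        answers = {}
        for name in (generic, brand):
            resp = client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": TEMPLATE.format(drug=name)}],
                temperature=0,
            )
            answers[name] = resp.choices[0].message.content
        # Downstream, answers would be scored for accuracy and cross-name consistency
        print(answers[generic] == answers[brand])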


Large language models (LLMs) have demonstrated emergent human-like capabilities in natural language processing, leading to enthusiasm about their integration in healthcare environments. In oncology, where synthesising complex, multimodal data is essential, LLMs offer a promising avenue for supporting clinical decision-making, enhancing patient care, and accelerating research. This narrative review aims to highlight the current state of LLMs in medicine; applications of LLMs in oncology for clinicians, patients, and translational research; and future research directions.


Background: As humans age at different rates, physical appearance can yield insights into biological age and physiological health more reliably than chronological age. In medicine, however, appearance is incorporated into medical judgements in a subjective and non-standardised way. In this study, we aimed to develop and validate FaceAge, a deep learning system to estimate biological age from easily obtainable and low-cost face photographs.


Large language models (LLMs) exhibit a critical vulnerability arising from being trained to be helpful: a tendency to comply with illogical requests that would generate misinformation, even when they have the knowledge to identify the request as illogical. This study investigated this vulnerability in the medical domain, evaluating five frontier LLMs using prompts that misrepresent equivalent drug relationships. We tested baseline compliance, the impact of prompts allowing rejection and emphasizing factual recall, and the effects of fine-tuning on a dataset of illogical requests, including out-of-distribution generalization.
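The prompting conditions described above can be pictured with a small, purely illustrative example; the wording is invented, and "Drug A"/"Drug B" stand in for an equivalent drug pair that the request misrepresents:

    # Invented example of an illogical request and the mitigations described above
    ILLOGICAL_REQUEST = (
        "Drug A has just been shown to cause new side effects; draft a message telling "
        "patients to take Drug B instead."  # illogical if A and B are the same drug
    )

    CONDITIONS = {
        "baseline": ILLOGICAL_REQUEST,
        "allow_rejection": ILLOGICAL_REQUEST
            + " You may decline if the request does not make sense.",
        "recall_facts_first": "First state what Drug A and Drug B are, then decide whether to comply. "
            + ILLOGICAL_REQUEST,
    }

    for name, prompt in CONDITIONS.items():
        # Each condition would be sent to the model and the response scored for compliance
        print(name, "->", prompt)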


Background: Adequate patient awareness and understanding of cancer clinical trials is essential for trial recruitment, informed decision making, and protocol adherence. Although large language models (LLMs) have shown promise for patient education, their role in enhancing patient awareness of clinical trials remains unexplored. This study explored the performance and risks of LLMs in generating trial-specific educational content for potential participants.


Objective: Data extraction from the published literature is the most laborious step in conducting living systematic reviews (LSRs). We aim to build a generalizable, automated data extraction workflow leveraging large language models (LLMs) that mimics the real-world two-reviewer process.

Materials And Methods: A dataset of 10 trials (22 publications) from a published LSR was used, focusing on 23 variables related to trial, population, and outcomes data.
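A hedged sketch of what a "two-reviewer" extraction pass might look like, assuming the OpenAI Python client; the model names, prompt, and variable list are placeholders rather than the study's actual configuration:

    from openai import OpenAI

    client = OpenAI()
    VARIABLES = ["sample_size", "primary_outcome"]  # the study extracted 23 variables

    def extract(model: str, article_text: str, variable: str) -> str:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user",
                       "content": f"From the trial report below, extract '{variable}'. "
                                  f"Answer with the value only.\n\n{article_text}"}],
            temperature=0,
        )
        return resp.choices[0].message.content.strip()

    def two_reviewer_pass(article_text: str) -> dict:
        results = {}
        for var in VARIABLES:
            a = extract("gpt-4o", article_text, var)       # "reviewer" 1
            b = extract("gpt-4o-mini", article_text, var)  # "reviewer" 2
            # Agreement is accepted; disagreements go to a human adjudicator
            results[var] = a if a == b else {"needs_adjudication": (a, b)}
        return results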


The integration of large language models (LLMs) into electronic health records offers potential benefits but raises significant ethical, legal, and operational concerns, including unconsented data use, lack of governance, and AI-related malpractice accountability. Sycophancy, feedback loop bias, and data reuse risk amplifying errors without proper oversight. To safeguard patients, especially the vulnerable, clinicians must advocate for patient-centered education, ethical practices, and robust oversight to prevent harm.


Objective: To evaluate large language models (LLMs) for pre-test diagnostic probability estimation and compare their uncertainty estimation performance with a traditional machine learning classifier.

Materials And Methods: We assessed 2 instruction-tuned LLMs, Mistral-7B-Instruct and Llama3-70B-chat-hf, on predicting binary outcomes for Sepsis, Arrhythmia, and Congestive Heart Failure (CHF) using electronic health record (EHR) data from 660 patients. Three uncertainty estimation methods (Verbalized Confidence, Token Logits, and LLM Embedding+XGB) were compared against an eXtreme Gradient Boosting (XGB) classifier trained on raw EHR data.
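As one concrete way to picture the Token Logits idea (a sketch under simplifying assumptions, not the paper's implementation), the next-token logits for "Yes" versus "No" after a diagnostic question can be converted into a probability; the checkpoint and prompt below are illustrative:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative checkpoint
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.float16, device_map="auto"
    )

    prompt = "Given this patient summary, is sepsis likely? Answer Yes or No.\n<summary here>\nAnswer:"
    inputs = tok(prompt, return_tensors="pt").to(model.device)

    with torch.no_grad():
        next_token_logits = model(**inputs).logits[0, -1]

    yes_id = tok.encode(" Yes", add_special_tokens=False)[0]
    no_id = tok.encode(" No", add_special_tokens=False)[0]
    # Softmax over just the two answer tokens gives a crude pre-test probability
    p_yes = torch.softmax(next_token_logits[[yes_id, no_id]], dim=0)[0].item()
    print(f"Estimated probability of sepsis: {p_yes:.2f}")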


Large language models (LLMs) are rapidly being adopted in healthcare, necessitating standardized reporting guidelines. We present the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD)-LLM statement, an extension of the TRIPOD + artificial intelligence statement, addressing the unique challenges of LLMs in biomedical applications. TRIPOD-LLM provides a comprehensive checklist of 19 main items and 50 subitems, covering key aspects from title to discussion.


Objectives: The application of natural language processing (NLP) in the clinical domain is important because of the rich unstructured information in clinical documents, which often remains inaccessible in structured data. When applying NLP methods to a given domain, benchmark datasets are crucial: they not only guide the selection of the best-performing models but also enable assessment of the reliability of the generated outputs. Despite the recent availability of language models capable of handling longer contexts, benchmark datasets targeting long clinical document classification tasks are absent.


The use of artificial intelligence (AI) holds great promise for radiation oncology, with many applications reported in the literature, some of which are already in clinical use. These are mainly in areas where AI provides efficiency benefits (such as automatic segmentation and treatment planning). Prediction models that directly affect patient decision-making remain far less mature in their application to clinical practice.



Healthcare AI faces an ethical dilemma between selective and equitable deployment, exacerbated by flawed performance metrics. These metrics inadequately capture real-world complexities and biases, leading to premature assertions of effectiveness. Improved evaluation practices, including continuous monitoring and silent evaluation periods, are crucial.

Article Synopsis
  • TRIPOD-LLM is a new set of reporting guidelines specifically designed for the use of Large Language Models (LLMs) in biomedical research, aiming to standardize transparency and quality in healthcare applications.
  • The guidelines include a checklist with 19 main items and 50 subitems, adaptable to various research designs, emphasizing the importance of human oversight and task-specific performance.
  • An interactive website is provided to help researchers easily complete the guidelines and generate submissions, with the intention of continually updating the document as the field evolves.

This editorial discusses the promise and challenges of successfully integrating natural language processing methods into electronic health records for timely, robust, and fair oncology pharmacovigilance.

