Publications by authors named "Felix Busch"

Objectives: To evaluate the potential of large language models (LLMs) to generate sequence-level brain MRI protocols.

Materials And Methods: This retrospective study employed a dataset of 150 brain MRI cases derived from local imaging request forms. Reference protocols were established by two neuroradiologists.
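
As a rough illustration of the task above (not the study's actual pipeline), the sketch below asks a general-purpose LLM to propose sequences for an indication taken from an imaging request form; the model name, prompt wording, and example indication are assumptions.

```python
# Hedged sketch: ask a general-purpose LLM to propose brain MRI sequences
# for a clinical indication from an imaging request form.
# Model name, prompt, and example indication are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

indication = "45-year-old with new-onset seizures, rule out structural lesion"

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model choice
    messages=[
        {
            "role": "system",
            "content": (
                "You are a neuroradiology assistant. Given a clinical "
                "indication, list the brain MRI sequences you would protocol, "
                "one per line, without further commentary."
            ),
        },
        {"role": "user", "content": indication},
    ],
)

print(response.choices[0].message.content)
```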


Purpose: To develop and validate MRSegmentator, a cross-modality deep learning model for multiorgan segmentation of MRI scans.

Materials And Methods: This retrospective study trained MRSegmentator on 1,200 manually annotated UK Biobank Dixon MRI sequences (50 participants), 221 in-house abdominal MRI sequences (177 patients), and 1,228 CT scans from the TotalSegmentator-CT dataset. A human-in-the-loop annotation workflow leveraged cross-modality transfer learning from an existing CT segmentation model to segment 40 anatomic structures.
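
As a loosely related downstream sketch (not MRSegmentator's own code or API), per-organ volumes can be derived from a multiorgan label map of the kind such a model outputs; the file name and the label-to-organ mapping below are hypothetical.

```python
# Hedged sketch: per-organ volumes from a multiorgan segmentation mask.
# Assumes a NIfTI label map where each anatomic structure has an integer label;
# the file name and the label->organ mapping are hypothetical examples.
import nibabel as nib
import numpy as np

seg = nib.load("segmentation.nii.gz")          # hypothetical output file
labels = np.asarray(seg.dataobj).astype(int)   # integer label map
voxel_volume_ml = np.prod(seg.header.get_zooms()[:3]) / 1000.0  # mm^3 -> mL

organ_names = {1: "liver", 2: "spleen", 3: "right kidney"}  # example subset

for label, name in organ_names.items():
    volume = np.count_nonzero(labels == label) * voxel_volume_ml
    print(f"{name}: {volume:.1f} mL")
```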


Recent advancements in large language models (LLMs) offer potential benefits in healthcare, particularly in processing extensive patient records. However, existing benchmarks do not fully assess LLMs' capability in handling real-world, lengthy clinical data. We present a benchmark comprising 20 detailed fictional patient cases across various diseases, with each case containing 5,090 to 6,754 words.
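
A benchmark of this kind can be scored with a simple loop over cases and questions; the sketch below assumes a made-up case layout and a stand-in `query_llm` function rather than the benchmark's actual data format or evaluation harness.

```python
# Hedged sketch: accuracy of an LLM on long fictional patient cases.
# The case/question structure and the query_llm stand-in are assumptions,
# not the benchmark's actual data format or evaluation code.
cases = [
    {
        "record": "…full multi-thousand-word fictional patient record…",
        "questions": [
            {"question": "Which chemotherapy regimen was started in 2021?",
             "options": ["A) R-CHOP", "B) FOLFOX", "C) ABVD"],
             "answer": "A"},
        ],
    },
]

def query_llm(prompt: str) -> str:
    """Stand-in for the model under test; returns the chosen option letter."""
    return "A"  # dummy answer so the sketch runs end to end

correct = total = 0
for case in cases:
    for q in case["questions"]:
        prompt = (f"{case['record']}\n\nQuestion: {q['question']}\n"
                  + "\n".join(q["options"]) + "\nAnswer with a single letter.")
        if query_llm(prompt).strip().upper().startswith(q["answer"]):
            correct += 1
        total += 1

print(f"Accuracy: {correct}/{total}")
```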


Efficient processing of radiology reports for monitoring disease progression is crucial in oncology. Although large language models (LLMs) show promise in extracting structured information from medical reports, privacy concerns limit their clinical implementation. This study evaluates the feasibility and accuracy of two of the most recent Llama models for generating structured lymphoma progression reports from cross-sectional imaging data in a privacy-preserving, real-world clinical setting.
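
In a privacy-preserving setting like the one described, a common pattern is to query a locally hosted Llama model and request structured output; the sketch below assumes a local Ollama server, and the model tag, prompt, and report text are illustrative rather than taken from the study.

```python
# Hedged sketch: structured lymphoma progression extraction with a locally
# hosted Llama model via Ollama's HTTP API (model tag, prompt, and report
# text are illustrative assumptions; no data leaves the local machine).
import requests

report = "CT chest/abdomen: previously enlarged axillary nodes have regressed..."

prompt = (
    "Extract a JSON object with the keys 'overall_response' "
    "(complete/partial/stable/progressive) and 'target_lesions' "
    "from this radiology report:\n" + report
)

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": prompt, "stream": False},
    timeout=120,
)
print(resp.json()["response"])  # in practice this output would be parsed and validated
```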


Rationale And Objectives: Large language models (LLMs) show promise for generating patient-friendly radiology reports, but the performance of open-source versus proprietary LLMs needs assessment. The aim of this study was to compare open-source and proprietary LLMs in generating patient-friendly radiology reports from chest CTs, using quantitative readability metrics and qualitative assessments by radiologists.

Materials And Methods: Fifty chest CT reports were processed by seven LLMs: three open-source models (Llama-3-70b, Mistral-7b, Mixtral-8x7b) and four proprietary models (GPT-4, GPT-3.
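
For the quantitative readability side of such a comparison, standard metrics can be computed per report; the sketch below uses the textstat package on two made-up text snippets, which is an assumption rather than the study's exact tooling.

```python
# Hedged sketch: readability metrics of the kind used to compare original
# and LLM-simplified report text (package choice and snippets are assumptions).
import textstat

original = ("Bilateral ground-glass opacities with subpleural consolidation, "
            "compatible with organizing pneumonia.")
simplified = ("Both lungs show hazy areas near their outer edges, which can "
              "be a sign of a type of lung inflammation.")

for name, text in [("original", original), ("simplified", simplified)]:
    print(name,
          "| Flesch Reading Ease:", textstat.flesch_reading_ease(text),
          "| Flesch-Kincaid Grade:", textstat.flesch_kincaid_grade(text))
```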


Background & Aims: The rapid advancement of large language models (LLMs) has generated interest in their potential integration in clinical workflows. However, their effectiveness in interpreting complex (imaging) reports remains underexplored and has at times yielded suboptimal results. This study aims to assess the capability of state-of-the-art LLMs to classify liver lesions based solely on textual descriptions from MRI reports, challenging the models to interpret nuanced medical language and diagnostic criteria.


Importance: The successful implementation of artificial intelligence (AI) in health care depends on its acceptance by key stakeholders, particularly patients, who are the primary beneficiaries of AI-driven outcomes.

Objectives: To survey hospital patients to investigate their trust, concerns, and preferences toward the use of AI in health care and diagnostics and to assess the sociodemographic factors associated with patient attitudes.

Design, Setting, And Participants: This cross-sectional study developed and implemented an anonymous quantitative survey between February 1 and November 1, 2023, using a nonprobability sample at 74 hospitals in 43 countries.


Background: Chronic back pain (CBP) affects over 80 million people in Europe, contributing to substantial healthcare costs and disability. Understanding modifiable risk factors, such as muscle composition, may aid in prevention and treatment. This study investigates the association between lean muscle mass (LMM) and intermuscular adipose tissue (InterMAT) with CBP using noninvasive whole-body magnetic resonance imaging (MRI).


The integration of large language models (LLMs) into health care offers tremendous opportunities to improve medical practice and patient care. Besides being susceptible to biases and threats common to all artificial intelligence (AI) systems, LLMs pose unique cybersecurity risks that must be carefully evaluated before these AI models are deployed in health care. LLMs can be exploited in several ways, including through malicious attacks, privacy breaches, and unauthorized manipulation of patient data.


Accurate medical decision-making is critical for both patients and clinicians. Patients often struggle to interpret their symptoms, determine their severity, and select the right specialist. Simultaneously, clinicians face challenges in integrating complex patient data to make timely, accurate diagnoses.


Objectives: Large language models (LLMs) have shown potential in biomedical applications, leading to efforts to fine-tune them on domain-specific data. However, the effectiveness of this approach remains unclear. This study aims to critically evaluate the performance of biomedically fine-tuned LLMs against their general-purpose counterparts across a range of clinical tasks.


This study aims to investigate the feasibility, usability, and effectiveness of a Retrieval-Augmented Generation (RAG)-powered Patient Information Assistant (PIA) chatbot for pre-CT information counseling compared to the standard physician consultation and informed consent process. This prospective comparative study included 86 patients scheduled for CT imaging between November and December 2024. Patients were randomly assigned either to the PIA group (n = 43), which received pre-CT information via the PIA chat app, or to the control group (n = 43), which received a standard physician-led consultation.
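
The retrieval step behind a RAG chatbot of this kind can be sketched in a few lines; the snippet below uses simple TF-IDF retrieval over a toy knowledge base to assemble a grounded prompt, and the snippets, question, and wiring are assumptions rather than the PIA's actual implementation.

```python
# Hedged sketch of RAG retrieval: find the most relevant pre-CT information
# snippets for a patient question and build a grounded prompt.
# Knowledge snippets, question, and prompt wording are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

knowledge_base = [
    "Iodinated contrast agents can rarely cause allergic-like reactions.",
    "Patients should report impaired kidney function before contrast-enhanced CT.",
    "Metformin may need to be paused around contrast administration per local policy.",
    "A CT scan of the abdomen typically takes only a few minutes.",
]

question = "Do I need to stop my diabetes medication before the contrast CT?"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(knowledge_base)
scores = cosine_similarity(vectorizer.transform([question]), doc_vectors)[0]
top_passages = [knowledge_base[i] for i in scores.argsort()[::-1][:2]]

prompt = ("Answer the patient's question using only the passages below.\n\n"
          + "\n".join(f"- {p}" for p in top_passages)
          + f"\n\nQuestion: {question}")
print(prompt)  # this prompt would then be sent to the generation model
```

In a full assistant, the TF-IDF step would typically be replaced by dense embeddings, and the assembled prompt would be passed to the generation model rather than printed.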


Background: The introduction of large language models (LLMs) into clinical practice promises to improve patient education and empowerment, thereby personalizing medical care and broadening access to medical knowledge. Despite the popularity of LLMs, there is a significant gap in systematized information on their use in patient care. Therefore, this systematic review aims to synthesize current applications and limitations of LLMs in patient care.


Rationale And Objectives: Training convolutional neural networks (CNNs) requires large labeled datasets, which can be very labor-intensive to prepare. Radiology reports contain a great deal of potentially useful information for such tasks. However, they are often unstructured and cannot be used directly for training.
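
One way to turn free-text reports into training labels is sketched below; a simple keyword rule stands in for the report-structuring step described above, and the reports and label logic are illustrative assumptions.

```python
# Hedged sketch: deriving weak training labels for a CNN from free-text
# radiology reports. A simple keyword rule stands in for the structuring step;
# reports and label logic are illustrative assumptions.
import re

reports = [
    "Large right-sided pneumothorax. Chest tube recommended.",
    "No pneumothorax. Lungs are clear.",
]

def label_pneumothorax(report: str) -> int:
    """Return 1 if pneumothorax is described, 0 if explicitly negated/absent."""
    negated = re.search(r"\bno (evidence of )?pneumothorax\b", report, re.I)
    present = re.search(r"\bpneumothorax\b", report, re.I)
    return int(bool(present) and not bool(negated))

labels = [label_pneumothorax(r) for r in reports]
print(labels)  # -> [1, 0]; these weak labels could then supervise a CNN
```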


Background: High-quality translations of radiology reports are essential for optimal patient care. Because of limited availability of human translators with medical expertise, large language models (LLMs) are a promising solution, but their ability to translate radiology reports remains largely unexplored.

Purpose: To evaluate the accuracy and quality of various LLMs in translating radiology reports across high-resource languages (English, Italian, French, German, and Chinese) and low-resource languages (Swedish, Turkish, Russian, Greek, and Thai).


Autonomous Medical Evaluation for Guideline Adherence (AMEGA) is a comprehensive benchmark designed to evaluate large language models' adherence to medical guidelines across 20 diagnostic scenarios spanning 13 specialties. It provides an evaluation framework and methodology for assessing models' capabilities in medical reasoning, differential diagnosis, treatment planning, and guideline adherence, using open-ended questions that mirror real-world clinical interactions. In total, the benchmark comprises 135 questions and 1,337 weighted scoring elements designed to assess comprehensive medical knowledge.
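
Scoring against weighted elements can be illustrated with a small sketch; the elements, weights, and matching rule below are invented for the example and are not AMEGA's actual content or grading code.

```python
# Hedged sketch: weighted scoring of a free-text answer against scoring
# elements, in the spirit of the weighted elements described above.
# Elements, weights, and the matching rule are illustrative assumptions.
scoring_elements = [
    {"element": "orders troponin", "weight": 2.0},
    {"element": "obtains ECG", "weight": 2.0},
    {"element": "gives aspirin", "weight": 1.0},
]

model_answer = "I would obtain an ECG and serial troponin measurements."

def element_met(element: str, answer: str) -> bool:
    """Toy matcher: an element counts if its key term appears in the answer."""
    keyword = element.split()[-1]          # e.g. "troponin", "ECG", "aspirin"
    return keyword.lower() in answer.lower()

achieved = sum(e["weight"] for e in scoring_elements
               if element_met(e["element"], model_answer))
total = sum(e["weight"] for e in scoring_elements)
print(f"score: {achieved}/{total}")        # -> score: 4.0/5.0
```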


Purpose: Large language models (LLMs) promise to streamline radiology reporting. With the release of OpenAI's GPT-4o (Generative Pre-trained Transformer 4 omni), which processes not only text but also speech, multimodal LLMs might now also be used as medical speech recognition software for radiology reporting in multiple languages. This proof-of-concept study investigates the feasibility of using GPT-4o for automated voice-to-text transcription of radiology reports in English and German.
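
A minimal transcription call illustrating the voice-to-text step might look like the sketch below; the audio file name is hypothetical and the model identifier is a stand-in assumption (the study itself used GPT-4o's speech capabilities).

```python
# Hedged sketch: automated voice-to-text transcription of a dictated report
# via OpenAI's transcription endpoint. The audio file name is hypothetical and
# the model identifier is a stand-in assumption, not the study's setup.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

with open("dictated_report_de.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",   # stand-in speech-to-text model for this sketch
        file=audio_file,
    )

print(transcript.text)
```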

Article Synopsis
  • Advances in large language models (LLMs) have led to numerous commercial and open-source models, but there has been no real-world comparison of OpenAI's GPT-4 against these models for extracting information from radiology reports.
  • The study aimed to compare GPT-4 with several leading open-source LLMs in extracting relevant findings from chest radiograph reports, using datasets from ImaGenome and Massachusetts General Hospital.
  • Results showed that GPT-4 slightly outperformed the best open-source model, Llama 2-70B, in terms of accuracy scores, with both showing strong performance in extracting findings from the reports.

Structured reporting (SR) has long been a goal in radiology to standardize and improve the quality of radiology reports. Despite evidence that SR reduces errors, enhances comprehensiveness, and increases adherence to guidelines, its widespread adoption has been limited. Recently, large language models (LLMs) have emerged as a promising solution to automate and facilitate SR.


Purpose: To quantitatively and qualitatively evaluate and compare the performance of leading large language models (LLMs), including proprietary models (GPT-4, GPT-3.5 Turbo, Claude-3-Opus, and Gemini Ultra) and open-source models (Mistral-7b and Mixtral-8x7b), in simplifying 109 interventional radiology reports.

Methods: Qualitative performance was assessed using a five-point Likert scale for accuracy, completeness, clarity, clinical relevance, and naturalness, as well as error rates, including trust-breaking and post-therapy misconduct errors.


Background: The successful integration of artificial intelligence (AI) in healthcare depends on the global perspectives of all stakeholders. This study aims to answer the research question: What are the attitudes of medical, dental, and veterinary students towards AI in education and practice, and what are the regional differences in these perceptions?

Methods: An anonymous online survey was developed based on a literature review and expert panel discussions. The survey assessed students' AI knowledge, attitudes towards AI in healthcare, current state of AI education, and preferences for AI teaching.

Article Synopsis
  • The EU's AI Act is the first detailed legal framework focused on artificial intelligence, with particular implications for healthcare.
  • Existing regulations like the Medical Device Regulation do not specifically address medical AI applications, making the AI Act crucial for this sector.
  • The commentary highlights key elements of the AI Act, providing clear references to specific chapters for better understanding.