A Language Vision Model Approach for Automated Tumor Contouring in Radiation Oncology.

Bioengineering (Basel)

Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University, Baltimore, MD 21287, USA.

Published: July 2025


Citations

20

Article Abstract

Lung cancer is the leading cause of cancer-related mortality worldwide. Tumor delineation, a crucial step in radiation therapy, is complex and requires expertise that is often unavailable in resource-limited settings. Artificial intelligence (AI), particularly with advances in deep learning (DL) and natural language processing (NLP), offers potential solutions but is hampered by high false positive rates. The Oncology Contouring Copilot (OCC) system was developed to leverage oncologist expertise, expressed as textual descriptions, for precise tumor contouring, with the aim of making oncological workflows more efficient by combining the strengths of AI with human oversight. OCC first identifies nodule candidates from CT scans. It then uses Language Vision Models (LVMs) such as GPT-4V, together with clinical descriptive texts, to reduce false positives, merging textual and visual data to automate tumor delineation and to bring the knowledge of experienced domain experts into oncology care. Deploying OCC resulted in a 35.0% reduction in the false discovery rate, a 72.4% decrease in false positives per scan, and an F1-score of 0.652 across our dataset used for unbiased evaluation. OCC advances oncology care through the use of the latest LVMs, improving contouring results by (1) streamlining oncology treatment workflows by optimizing tumor delineation and reducing manual processes; (2) offering a scalable and intuitive framework for reducing false positives in radiotherapy planning using LVMs; (3) introducing novel medical language vision prompt techniques that minimize LVM hallucinations, supported by an ablation study; and (4) conducting a comparative analysis of LVMs that highlights their potential for addressing medical language vision challenges.
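The three reported metrics can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' evaluation code: the `DetectionCounts` container, the function names, and all counts are hypothetical. The abstract does not say whether the reported reductions are relative or absolute; the sketch assumes relative reductions, comparing nodule candidates before and after a false-positive filtering stage.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    """Candidate-level counts aggregated over a set of scans (hypothetical container)."""
    true_positives: int
    false_positives: int
    false_negatives: int
    num_scans: int


def false_discovery_rate(c: DetectionCounts) -> float:
    """FDR = FP / (TP + FP): the share of reported candidates that are wrong."""
    reported = c.true_positives + c.false_positives
    return c.false_positives / reported if reported else 0.0


def false_positives_per_scan(c: DetectionCounts) -> float:
    """Average number of false-positive candidates per scan."""
    return c.false_positives / c.num_scans


def f1_score(c: DetectionCounts) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * c.true_positives + c.false_positives + c.false_negatives
    return 2 * c.true_positives / denom if denom else 0.0


def relative_reduction(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after` (assumed meaning of 'reduction')."""
    return (before - after) / before


# Made-up counts for illustration only; these are NOT the paper's data.
before = DetectionCounts(true_positives=90, false_positives=400,
                         false_negatives=10, num_scans=100)
after = DetectionCounts(true_positives=80, false_positives=100,
                        false_negatives=20, num_scans=100)

fdr_drop = relative_reduction(false_discovery_rate(before),
                              false_discovery_rate(after))
fpps_drop = relative_reduction(false_positives_per_scan(before),
                               false_positives_per_scan(after))

print(f"FDR reduction: {fdr_drop:.1%}")
print(f"FP/scan reduction: {fpps_drop:.1%}")
print(f"F1 after filtering: {f1_score(after):.3f}")
```

In this toy example the filter also discards some true nodules (true positives fall from 90 to 80), which is why F1 is worth reporting alongside the two false-positive measures: a filter could drive false positives to zero simply by rejecting every candidate.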

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12383427
DOI: http://dx.doi.org/10.3390/bioengineering12080835
