Article Abstract

Introduction: Vision language models (VLMs) combine image analysis capabilities with large language models (LLMs). Because they are multimodal, VLMs offer a clinical advantage over image classification models for the diagnosis of optic disc swelling: they allow clinical context to be considered. In this study, we compare the performance of non-specialty-trained VLMs with different prompts in the classification of optic disc swelling on fundus photographs.

Methods: A diagnostic test accuracy study was conducted using an open-source dataset. Five prompts of increasing context were used with each of five VLMs (Llama 3.2-vision, LLaVA-Med, LLaVA, GPT-4o, and DeepSeek-4V), resulting in 25 prompt-model pairs. The performance of the VLMs in classifying photographs with and without optic disc swelling was measured using Youden's index (YI), F1 score, and accuracy rate.
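The three measures named in the Methods can be computed from a 2x2 confusion table. The following is a minimal sketch using the standard textbook definitions, not the authors' code. The `Confusion` class, the function names, and the example counts are hypothetical, and "swollen" is assumed to be the positive class.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 confusion table; 'swollen' is assumed to be the positive class."""
    tp: int  # swollen discs labelled swollen
    fp: int  # normal discs labelled swollen
    tn: int  # normal discs labelled normal
    fn: int  # swollen discs labelled normal


def youden_index(c: Confusion) -> float:
    """Youden's index = sensitivity + specificity - 1."""
    sensitivity = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    specificity = c.tn / (c.tn + c.fp) if (c.tn + c.fp) else 0.0
    return sensitivity + specificity - 1.0


def f1_score(c: Confusion) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0."""
    denom = 2 * c.tp + c.fp + c.fn
    return 2 * c.tp / denom if denom else 0.0


def accuracy(c: Confusion) -> float:
    """Fraction of all classified images that were labelled correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical counts for illustration only (not taken from the study).
example = Confusion(tp=50, fp=10, tn=90, fn=50)
print(f"YI={youden_index(example):.3f}  "
      f"F1={f1_score(example):.3f}  "
      f"accuracy={accuracy(example):.1%}")
```

In a setup like the one described, one such table would be built per prompt-model pair from the valid responses only. How invalid responses were handled is not stated in the abstract.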

Results: A total of 779 images of normal optic discs and 295 images of swollen discs were obtained from an open-source image database. Among the 25 prompt-model pairs, valid response rates ranged from 7.8% to 100% (median 93.6%). Diagnostic performance ranged from 0.00 to 0.231 for YI (median 0.042), from 0.00 to 0.716 for F1 score (median 0.401), and from 27.5% to 70.5% for accuracy rate (median 58.8%). The best-performing prompt-model pair was GPT-4o with a prompt combining role-playing, Chain-of-Thought, and few-shot prompting. Averaged across prompts, Llama 3.2-vision performed best (average YI 0.181). There was no consistent relationship between the amount of information given in the prompt and model performance.
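To put the accuracy figures in context, it can help to see what a constant answer would score on a set with this class balance. The sketch below is plain arithmetic on the reported image counts (779 normal, 295 swollen), assuming every image receives a valid response. It is not a result from the study, and the `metrics` helper is hypothetical.

```python
def metrics(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float, float]:
    """Return (Youden's index, F1, accuracy) for one 2x2 confusion table."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    acc = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity + specificity - 1.0, f1, acc


# Class counts as reported in the Results.
N_NORMAL, N_SWOLLEN = 779, 295

# Two trivial strategies: answer "swollen" every time, or "normal" every time.
baselines = {
    "always swollen": metrics(tp=N_SWOLLEN, fp=N_NORMAL, tn=0, fn=0),
    "always normal": metrics(tp=0, fp=0, tn=N_NORMAL, fn=N_SWOLLEN),
}

for name, (yi, f1, acc) in baselines.items():
    print(f"{name}: YI={yi:.3f} F1={f1:.3f} accuracy={acc:.1%}")
# always swollen: YI=0.000 F1=0.431 accuracy=27.5%
# always normal: YI=0.000 F1=0.000 accuracy=72.5%
```

Both constant strategies give YI = 0 while their accuracies differ widely, which illustrates why a prevalence-independent measure such as YI is useful alongside accuracy on an imbalanced image set.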

Conclusions: Non-specialty-trained VLMs could classify photographs of swollen and normal optic discs better than chance, with performance varying by model. Increasing prompt complexity did not consistently improve performance. Specialty-specific VLMs may be necessary to improve ophthalmic image analysis performance.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12415036
DOI: http://dx.doi.org/10.3389/fdgth.2025.1660887

Similar Publications

Neuroretinitis (NR) is characterised by optic disc oedema associated with macular exudates in a star-shaped pattern. Several aetiologies of NR have been described, with cat-scratch disease being the most common. However, despite thorough investigations, one-quarter of cases are classified as idiopathic neuroretinitis (INR), in which visual prognosis is generally good.

A 22-year-old woman had an 8-year history of progressive bilateral vision loss and of diabetes mellitus. Her mother had diabetes and two first cousins had severe congenital deafness. On examination, her visual acuities were 6/36 bilaterally, with absent colour vision and gross optic disc pallor.

This systematic review examines the potential association between semaglutide, a glucagon-like peptide-1 (GLP-1) receptor agonist, and the development of non-arteritic anterior ischemic optic neuropathy (NAION). Nine studies were included, consisting of retrospective cohort analyses, case series, and pharmacovigilance reports. Findings across the literature were inconsistent, with some studies reporting an increased risk while others found no significant association.