Optical Coherence Tomography has become a common imaging technique that enables a non-invasive and detailed visualization of the retina and allows for the identification of various diseases. Through the advancement of technology, the volume and complexity of OCT data have rendered manual analysis infeasible, creating the need for automated means of detection. This study investigates the ability of state-of-the-art object detection models, including the latest YOLO versions (from v8 to v12), YOLO-World, YOLOE, and RT-DETR, to accurately detect pathological biomarkers in two retinal OCT datasets. The AROI dataset focuses on fluid detection in Age-related Macular Degeneration, while the OCT5k dataset contains a wide range of retinal pathologies. The experiments performed show that YOLOv12 offers the best balance between detection accuracy and computational efficiency, while YOLOE manages to consistently outperform all other models across both datasets and most classes, particularly in detecting pathologies that cover a smaller area. This work provides a comprehensive benchmark of the capabilities of state-of-the-art object detection for medical applications, specifically for identifying retinal pathologies from OCT scans, offering insights and a starting point for the development of future automated solutions for analysis in a clinical setting.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12293458 | PMC |
| http://dx.doi.org/10.3390/diagnostics15141823 | DOI Listing |
IEEE Trans Pattern Anal Mach Intell
September 2025
Generalized visual grounding tasks, including Generalized Referring Expression Comprehension (GREC) and Segmentation (GRES), extend the classical visual grounding paradigm by accommodating multi-target and non-target scenarios. Specifically, GREC focuses on accurately identifying all referential objects at the coarse bounding box level, while GRES aims to achieve fine-grained pixel-level perception. However, existing approaches typically treat these tasks independently, overlooking the benefits of jointly training GREC and GRES to ensure consistent multi-granularity predictions and streamline the overall process.
IEEE Trans Image Process
September 2025
Camouflaged object detection (COD) aims to discover objects that are seamlessly embedded in the environment. Existing COD methods have made significant progress by typically representing features in a discrete way with arrays of pixels. However, limited by discrete representation, these methods need to align features of different scales during decoding, which causes some subtle discriminative clues to become blurred.
Int J Comput Assist Radiol Surg
September 2025
School of Life and Environmental Sciences, Guilin University of Electronic Technology, Guilin, China.
Objective: Cataract surgery is among the most frequently performed procedures worldwide. Accurate, real-time segmentation of the cornea and surgical instruments is vital for intraoperative guidance and surgical education. However, most existing deep learning-based segmentation methods depend on pixel-level annotations, which are time-consuming and limit practical deployment.
Ann Stat
February 2025
Department of Statistics, Pennsylvania State University.
Random objects are complex non-Euclidean data taking values in general metric spaces, possibly devoid of any underlying vector space structure. Such data are becoming increasingly abundant with the rapid advancement in technology. Examples include probability distributions, positive semidefinite matrices and data on Riemannian manifolds.
Neural Netw
August 2025
Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing, Hunan Normal University, Changsha, 410081, China.
The key challenges in kidney tumor segmentation include unpredictable location, high similarity among objects, and variability in boundaries. Existing approaches mostly handle these challenges from an object-agnostic perspective or a single decoupling perspective, which limits their ability to address all the aforementioned challenges. To tackle these problems, we propose a Dual-perspective Decoupling Network (DDNet), which consists of the Dual-perspective Decoupling Module (DDM) and the Edge Refinement Module (ERM).