The digitization of histology slides has revolutionized pathology, providing massive datasets for cancer diagnosis and research. Contrastive self-supervised and vision-language models have been shown to effectively mine large pathology datasets to learn discriminative representations. On the other hand, generative models, capable of synthesizing realistic and diverse images, present a compelling solution to unique problems in pathology that involve synthesizing images: overcoming annotated data scarcity, enabling privacy-preserving data sharing, and performing inherently generative tasks, such as virtual staining.
We developed a deep learning Pathomics image analysis workflow to generate spatial Tumor-TIL maps to visualize and quantify the abundance and spatial distribution of tumor infiltrating lymphocytes (TILs) in colon cancer. Colon cancer and lymphocyte detection in hematoxylin and eosin (H&E) stained whole slide images (WSIs) has revealed complex immuno-oncologic interactions that form TIL-rich and TIL-poor tumor habitats, which are unique in each patient sample. We compute Tumor%, total lymphocyte%, and TILs% as the proportion of the colon cancer microenvironment occupied by intratumoral lymphocytes for each WSI.
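As a rough illustration of the per-slide percentages mentioned in this abstract, the sketch below computes them from patch-level boolean maps. The abstract does not give the exact denominators, so the choices here are assumptions: Tumor% and lymphocyte% are taken relative to all tissue patches, and TILs% relative to tumor patches. The function name and the three input masks are hypothetical and are not taken from the authors' workflow.

```python
import numpy as np

def tumor_til_percentages(tissue, tumor, lymph):
    """Summarize patch-level maps of one slide as percentages.

    All three inputs are 2D boolean arrays of the same shape, one entry per patch.
    Denominators are assumptions (see lead-in), not the authors' definitions.
    """
    tissue = np.asarray(tissue, dtype=bool)
    # Only count tumor and lymphocyte calls that fall on tissue.
    tumor = np.asarray(tumor, dtype=bool) & tissue
    lymph = np.asarray(lymph, dtype=bool) & tissue

    n_tissue = int(tissue.sum())
    n_tumor = int(tumor.sum())
    if n_tissue == 0:
        raise ValueError("slide contains no tissue patches")

    # Intratumoral lymphocytes: patches flagged as both tumor and lymphocyte.
    til = tumor & lymph
    return {
        "tumor_pct": 100.0 * n_tumor / n_tissue,
        "lymph_pct": 100.0 * int(lymph.sum()) / n_tissue,
        "til_pct": 100.0 * int(til.sum()) / n_tumor if n_tumor else 0.0,
    }
```

With a toy 2x4 grid in which half of the tissue patches are tumor and two of those four also contain lymphocytes, this returns 50% tumor and 50% TILs under the assumed denominators.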
Objective: This project demonstrates the feasibility of connecting medical imaging data and features, SARS-CoV-2 genome variants, with clinical data in the National Clinical Cohort Collaborative (N3C) repository to accelerate integrative research on detection, diagnosis, and treatment of COVID-19-related morbidities. The N3C curated a rich collection of aggregated and de-identified electronic health records (EHR) data of over 18 million patients, including 7.5 million COVID-positive patients, seen at hospitals across the United States.
IEEE Open J Eng Med Biol
September 2024
In the medical diagnostics domain, pathology and histology are pivotal for the precise identification of diseases. Digital histopathology, enhanced by automation, facilitates the efficient analysis of the massive number of biopsy images produced on a daily basis, streamlining the evaluation process. This study focuses on Stain Color Normalization (SCN) within a Whole-Slide Image (WSI) cohort, aiming to reduce batch biases.
Conf Comput Vis Pattern Recognit Workshops
June 2024
Estimating the uncertainty of a neural network is crucial in providing transparency and trustworthiness. In this paper, we focus on uncertainty estimation for digital pathology prediction models. To explore the large amount of unlabeled data in digital pathology, we propose to adopt a novel learning method that can fully exploit unlabeled data.
Proc Mach Learn Res
July 2023
Multiplex Immunohistochemistry (mIHC) is a cost-effective and accessible method for in situ labeling of multiple protein biomarkers in a tissue sample. By assigning a different stain to each biomarker, it allows the visualization of different types of cells within the tumor vicinity for downstream analysis. However, detecting different types of stains in a given mIHC image is a challenging problem, especially when the number of stains is high.
IEEE Winter Conf Appl Comput Vis
January 2024
To achieve high-quality results, diffusion models must be trained on large datasets. This can be notably prohibitive for models in specialized domains, such as computational pathology. Conditioning on labeled data is known to help in data-efficient model training.
Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit
June 2023
In digital pathology, the spatial context of cells is important for cell classification, cancer diagnosis and prognosis. To model such complex cell context, however, is challenging. Cells form different mixtures, lineages, clusters and holes.
Background: The immune microenvironment impacts tumor growth, invasion, metastasis, and patient survival and may provide opportunities for therapeutic intervention in pancreatic ductal adenocarcinoma (PDAC). Although never studied as a potential modulator of the immune response in most cancers, Keratin 17 (K17), a biomarker of the most aggressive (basal) molecular subtype of PDAC, is intimately involved in the histogenesis of the immune response in psoriasis, basal cell carcinoma, and cervical squamous cell carcinoma. Thus, we hypothesized that K17 expression could also impact the immune cell response in PDAC, and that uncovering this relationship could provide insight to guide the development of immunotherapeutic opportunities to extend patient survival.
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics.
Nat Methods
February 2024
Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers.
NPJ Precis Oncol
January 2024
Digital pathology has seen a proliferation of deep learning models in recent years, but many models are not readily reusable. To address this challenge, we developed WSInfer: an open-source software ecosystem designed to streamline the sharing and reuse of deep learning models for digital pathology. The increased access to trained models can augment research on the diagnostic, prognostic, and predictive capabilities of digital pathology.
Comput Methods Programs Biomed
September 2023
Background And Objective: Histopathology is the gold standard for diagnosis of many cancers. Recent advances in computer vision, specifically deep learning, have facilitated the analysis of histopathology images for many tasks, including the detection of immune cells and microsatellite instability. However, it remains difficult to identify optimal models and training configurations for different histopathology classification tasks due to the abundance of available architectures and the lack of systematic evaluations.
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers.
Bioinformatics
April 2023
Motivation: Deep learning has recently attained excellent results in digital pathology. A challenge with its use is that high-quality, representative training datasets are required to build robust models. Data annotation in the domain is labor intensive and demands a substantial time commitment from expert pathologists.
IEEE Open J Eng Med Biol
January 2023
Histopathologic evaluation of Hematoxylin & Eosin (H&E) stained slides is essential for disease diagnosis, revealing tissue morphology, structure, and cellular composition. Variations in staining protocols and equipment result in images with color nonconformity. Although pathologists compensate for color variations, these disparities introduce inaccuracies in computational whole slide image (WSI) analysis, accentuating data domain shift and degrading generalization.
Background: Deep learning methods have demonstrated remarkable performance in pathology image analysis, but they are computationally very demanding. The aim of our study is to reduce their computational cost to enable their use with large tissue image datasets.
Methods: We propose a method called Network Auto-Reduction (NAR) that simplifies a Convolutional Neural Network (CNN) by reducing the network to minimize the computational cost of making a prediction.
Bioinformatics
July 2022
Motivation: Whole slide tissue images contain detailed data on the sub-cellular structure of cancer. Quantitative analyses of these data can lead to novel biomarkers for better cancer diagnosis and prognosis and can improve our understanding of cancer mechanisms. Such analyses are challenging to execute because of the size and complexity of whole slide image data and the relatively limited volume of training data for machine learning methods.
Tumor-infiltrating lymphocytes (TILs) have been established as a robust prognostic biomarker in breast cancer, with emerging utility in predicting treatment response in the adjuvant and neoadjuvant settings. In this study, the role of TILs in predicting overall survival and progression-free interval was evaluated in two independent cohorts of breast cancer from the Cancer Genome Atlas (TCGA BRCA) and the Carolina Breast Cancer Study (UNC CBCS). We utilized machine learning and computer vision algorithms to characterize TIL infiltrates in digital whole-slide images (WSIs) of breast cancer stained with hematoxylin and eosin (H&E).
Comput Methods Programs Biomed
June 2022
Background And Objective: Deep learning methods have demonstrated remarkable performance in pathology image analysis, but they require a large amount of annotated training data from expert pathologists. The aim of this study is to minimize the data annotation need in these analyses.
Methods: Active learning (AL) is an iterative approach to training deep learning models.
The role of tumor infiltrating lymphocytes (TILs) as a biomarker to predict disease progression and clinical outcomes has generated tremendous interest in translational cancer research. We present an updated and enhanced deep learning workflow to classify 50x50 um tiled image patches (100x100 pixels at 20x magnification) as TIL positive or negative based on the presence of 2 or more TILs in gigapixel whole slide images (WSIs) from the Cancer Genome Atlas (TCGA). This workflow generates TIL maps to study the abundance and spatial distribution of TILs in 23 different types of cancer.
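The patch geometry and labeling rule stated in this abstract can be illustrated with a small sketch: a 50 um patch at 0.5 um per pixel (implied by 100x100 pixels per 50x50 um) spans 100 pixels, and a patch counts as TIL positive when it holds 2 or more TILs. The abstract describes a deep learning classifier; the code below is not that classifier. It only applies the stated labeling criterion to a hypothetical list of lymphocyte centroid coordinates, and all function and parameter names are assumptions made for illustration.

```python
import numpy as np

PATCH_UM = 50.0   # patch side length in micrometers, from the abstract
MIN_TILS = 2      # a patch is TIL positive with 2 or more TILs

def patch_size_px(mpp):
    """Patch side length in pixels for a given resolution (micrometers per pixel)."""
    return int(round(PATCH_UM / mpp))

def til_map(centroids_px, slide_w, slide_h, mpp=0.5):
    """Boolean TIL map from hypothetical lymphocyte centroids given as (x, y) pixels.

    mpp=0.5 is inferred from "100x100 pixels" per "50x50 um" in the abstract.
    """
    size = patch_size_px(mpp)
    cols = -(-slide_w // size)  # ceiling division so edge patches are kept
    rows = -(-slide_h // size)
    counts = np.zeros((rows, cols), dtype=int)
    for x, y in centroids_px:
        counts[int(y) // size, int(x) // size] += 1
    # Apply the labeling rule: positive when the patch holds at least MIN_TILS cells.
    return counts >= MIN_TILS
```

For a 300x300 pixel region this yields a 3x3 map in which only patches containing at least two centroids are marked positive.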