PathLDM: Text conditioned Latent Diffusion Model for Histopathology.

Srikar Yellapragada , Alexandros Graikos , Prateek Prasanna , Tahsin Kurc , Joel Saltz , Dimitris Samaras

IEEE Winter Conf Appl Comput Vis

Stony Brook University.

Published: January 2024

To achieve high-quality results, diffusion models must be trained on large datasets. This requirement can be prohibitive for models in specialized domains, such as computational pathology. Conditioning on labeled data is known to help in data-efficient model training. Histopathology reports, which are rich in clinical information, are therefore an ideal choice of guidance for a histopathology generative model. In this paper, we introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images. Leveraging the contextual information provided by pathology text reports, our approach fuses image and textual data to enhance the generation process. By utilizing GPT's capabilities to distill and summarize complex text reports, we establish an effective conditioning mechanism. Through strategic conditioning and necessary architectural enhancements, we achieve a state-of-the-art (SoTA) FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor, which scores an FID of 30.1.
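
For readers unfamiliar with the metric quoted above, FID (Fréchet Inception Distance) is the Fréchet distance between two Gaussians fitted to feature embeddings of real and generated images; lower is better. The abstract does not describe how the authors computed it, so the following is only a minimal sketch of the standard formula, assuming the features (conventionally taken from an Inception-v3 network) have already been extracted into NumPy arrays. The function names `frechet_distance` and `fid_from_features` are illustrative and are not taken from the paper.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    mu1 = np.asarray(mu1, dtype=float)
    mu2 = np.asarray(mu2, dtype=float)
    sigma1 = np.atleast_2d(np.asarray(sigma1, dtype=float))
    sigma2 = np.atleast_2d(np.asarray(sigma2, dtype=float))
    diff = mu1 - mu2
    # tr((S1 S2)^(1/2)) equals the sum of the square roots of the eigenvalues
    # of S1 @ S2. For positive semi-definite covariances these eigenvalues are
    # real and non-negative up to round-off, so clip small negative values.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


def fid_from_features(real_feats, fake_feats):
    """FID from two (n_samples, n_features) arrays of precomputed features.

    Assumption: feature extraction happened elsewhere; this function only
    fits the two Gaussians and compares them.
    """
    real_feats = np.asarray(real_feats, dtype=float)
    fake_feats = np.asarray(fake_feats, dtype=float)
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_f = np.cov(fake_feats, rowvar=False)
    return frechet_distance(mu_r, sigma_r, mu_f, sigma_f)
```

Calling `fid_from_features(real, fake)` on two feature arrays returns a non-negative score that approaches zero as the two feature distributions coincide. Scores depend on the feature extractor and sample count, so values are only comparable when computed under the same protocol.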

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11131586
DOI: http://dx.doi.org/10.1109/wacv57701.2024.00510
