Citations

20

Article Abstract

To achieve high-quality results, diffusion models must be trained on large datasets. This can be notably prohibitive for models in specialized domains, such as computational pathology. Conditioning on labeled data is known to help in data-efficient model training. Therefore, histopathology reports, which are rich in valuable clinical information, are an ideal choice as guidance for a histopathology generative model. In this paper, we introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images. Leveraging the rich contextual information provided by pathology text reports, our approach fuses image and textual data to enhance the generation process. By utilizing GPT's capabilities to distill and summarize complex text reports, we establish an effective conditioning mechanism. Through strategic conditioning and necessary architectural enhancements, we achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
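The FID figures quoted in the abstract are Fréchet distances between Gaussian fits to the feature statistics of real and generated images. The sketch below shows only that distance computation, under stated assumptions: the function names `frechet_distance` and `feature_stats` are hypothetical, the features are random placeholders rather than embeddings from an Inception-style network, and nothing here reproduces the authors' evaluation pipeline.

```python
import numpy as np


def feature_stats(features):
    """Mean vector and covariance matrix of an (n_samples, n_dims) feature array."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2).

    d^2 = ||mu1 - mu2||^2 + Tr(sigma1) + Tr(sigma2) - 2 Tr((sigma1 sigma2)^(1/2))
    """
    diff = mu1 - mu2
    # The trace of the matrix square root equals the sum of the square roots
    # of the eigenvalues of sigma1 @ sigma2. For positive semi-definite inputs
    # those eigenvalues are real and non-negative up to numerical noise, so we
    # drop tiny imaginary parts and clip small negative values.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)


# Placeholder features (assumption): in practice these would be network
# embeddings of real and generated histopathology images.
rng = np.random.default_rng(0)
real_features = rng.normal(size=(500, 8))
generated_features = rng.normal(loc=0.5, size=(500, 8))

score = frechet_distance(*feature_stats(real_features), *feature_stats(generated_features))
```

Lower values mean the two feature distributions are closer, which is the sense in which 7.64 improves on 30.1.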

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11131586
DOI: http://dx.doi.org/10.1109/wacv57701.2024.00510

Similar Publications

Spatial transcriptomics enables the study of gene expression within the spatial context of tissues, offering valuable insights into tissue organization and function. However, technical limitations can result in large missing regions of data, which hinder accurate downstream analyses and biological interpretation. To address these challenges, we propose a diffusion model for spatial transcriptomics data completion, a framework with three key features.

Ethnic identity refers to how individuals perceive and experience themselves in the context of social groups, racial background, or culture (Phinney & Ong, 2007). Ethnic identity is positively associated with psychological well-being (Rivas-Drake et al.

Mitigating spectral bias in neural operators via high-frequency scaling for physical systems.

Neural Netw

August 2025

Division of Applied Mathematics, Brown University, Providence, RI, 02912, USA; Pacific Northwest National Laboratory, Richland, WA, 99354, USA.

Neural operators have emerged as powerful surrogates for modeling complex physical problems. However, they suffer from spectral bias, which makes them oblivious to the high-frequency modes present in multiscale physical systems. As a result, they tend to produce over-smoothed solutions, which is particularly problematic when modeling turbulence and systems with intricate patterns and sharp gradients, such as multi-phase flow systems.

DiffRaman: A conditional latent denoising diffusion probabilistic model for enhancing bacterial identification via Raman spectra generation under limited data.

Anal Chim Acta

October 2025

State Key Laboratory of Precision Measurement Technology and Instruments, Tsinghua University, Beijing, 100084, China.

Raman spectroscopy has attracted significant attention in various biochemical detection fields, especially in the rapid identification of pathogenic bacteria. The integration of this technology with deep learning to facilitate automated bacterial Raman spectroscopy diagnosis has emerged as a key focus in recent research. However, the diagnostic performance of existing deep learning methods largely depends on a sufficient dataset, and in scenarios where there is a limited availability of Raman spectroscopy data, it is inadequate to fully optimize the numerous parameters of deep neural networks.

Towards Generic Abdominal Multi-Organ Segmentation with multiple partially labeled datasets.

Comput Med Imaging Graph

September 2025

Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, China.

An increasing number of publicly available datasets have facilitated the exploration of building universal medical segmentation models. Existing approaches address the partial-labeling problem of each dataset by harmonizing labels across datasets and independently focusing on the labeled foreground regions. However, significant challenges persist, particularly in the form of cross-site domain shifts and the limited utilization of partially labeled datasets.
