Finite mixtures of matrix variate Poisson-log normal distributions for three-way count data.

Anjali Silva, Xiaoke Qin, Steven J Rothstein, Paul D McNicholas, Sanjeena Subedi

Bioinformatics

School of Mathematics and Statistics, Carleton University, Ottawa, ON K1S 5B6, Canada.

Published: May 2023

Motivation: Three-way data structures, characterized by three entities (the units, the variables and the occasions), are frequent in biological studies. In RNA sequencing, three-way data structures are obtained when high-throughput transcriptome sequencing data are collected for n genes across p conditions at r occasions. Matrix variate distributions offer a natural way to model three-way data, and mixtures of matrix variate distributions can be used to cluster three-way data. Clustering of gene expression data is carried out as a means of discovering gene co-expression networks.

Results: In this work, a mixture of matrix variate Poisson-log normal distributions is proposed for clustering read counts from RNA sequencing. By considering the matrix variate structure, full information on the conditions and occasions of the RNA sequencing dataset is simultaneously considered, and the number of covariance parameters to be estimated is reduced. We propose three different frameworks for parameter estimation: a Markov chain Monte Carlo-based approach, a variational Gaussian approximation-based approach, and a hybrid approach. Various information criteria are used for model selection. The models are applied to both real and simulated data, and we demonstrate that the proposed approaches can recover the underlying cluster structure in both cases. In simulation studies where the true model parameters are known, our proposed approach shows good parameter recovery.
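
To make the matrix variate structure concrete, the following is a minimal sketch of sampling from a single matrix variate Poisson-log normal component and of the covariance parameter saving mentioned above. It is an illustration under stated assumptions, not the authors' implementation: the function names and the symbols `M` (r x p mean of the latent log-rates), `Phi` (r x r covariance across occasions) and `Omega` (p x p covariance across conditions) are chosen here for illustration, and the sketch omits any library-size normalization as well as the mixture and estimation steps.

```python
import numpy as np

def sample_mvpln(n, M, Phi, Omega, rng):
    """Draw n count matrices (each r x p) from one hypothetical component.

    Latent layer: theta ~ matrix variate normal with mean M, row
    covariance Phi and column covariance Omega.
    Observed layer: Y[i, j] | theta ~ Poisson(exp(theta[i, j])).
    """
    A = np.linalg.cholesky(Phi)      # Phi = A A^T
    B = np.linalg.cholesky(Omega)    # Omega = B B^T
    Z = rng.standard_normal((n,) + M.shape)
    theta = M + A @ Z @ B.T          # broadcasts over the n draws
    return rng.poisson(np.exp(theta))

def covariance_param_counts(r, p):
    """Free covariance entries: separable (Phi, Omega) vs. one rp x rp matrix."""
    structured = r * (r + 1) // 2 + p * (p + 1) // 2
    unstructured = (r * p) * (r * p + 1) // 2
    return structured, unstructured

rng = np.random.default_rng(0)
Y = sample_mvpln(5, np.zeros((2, 3)), np.eye(2), np.eye(3), rng)
print(Y.shape)                        # (5, 2, 3): n genes, r occasions, p conditions
print(covariance_param_counts(2, 3))  # (9, 21)
```

With r = 2 occasions and p = 3 conditions, the separable form needs 3 + 6 = 9 covariance entries (before any identifiability constraint on the scale shared by the two matrices) against 21 for an unstructured 6 x 6 covariance, and the gap widens quickly as r and p grow.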

Availability And Implementation: The GitHub R package for this work is available at https://github.com/anjalisilva/mixMVPLN and is released under the open source MIT license.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10159656
DOI: http://dx.doi.org/10.1093/bioinformatics/btad167

Top Keywords: matrix variate, three-way data, rna sequencing, mixtures matrix, variate poisson-log, poisson-log normal, normal distributions, data, data structures, conditions occasions

Similar Publications

Mode-wise principal subspace pursuit and matrix spiked covariance model.

J R Stat Soc Series B Stat Methodol

February 2025

Departments of Biostatistics & Bioinformatics and Computer Science, Duke University, Durham, NC, USA.

Runshi Tang, Ming Yuan, Anru R Zhang

This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a class of matrix-variate spiked covariance models that serve as inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two steps: Average Subspace Capture (ASC) and Alternating Projection.

Process evaluation of quality of precancerous cervical lesion screening program in selected public health centers in Addis Ababa, Ethiopia.

J Cancer Policy

March 2025

Institute of Health, Jimma University, Jimma, Ethiopia.

Mikael Abraham, Tilahun Fufa, Asrat Arja, Yesuneh Tefera Mekasha, Gemmechu Hasen

Cervical cancer is the second most prevalent disease among Ethiopian women of reproductive age and a serious gynecological malignancy affecting women regionally. About 3235 deaths and 4648 new cases are reported nationwide each year. Despite the severity of the disease, precancerous cervical screening programs in settings with limited resources face many difficulties, such as a lack of medical supplies and equipment, poorly trained healthcare workers, a heavy workload for current staff, low professional compliance, and insufficient support from medical facilities.

Bayesian thresholded modeling for integrating brain node and network predictors.

Biostatistics

December 2024

Department of Biostatistics, Yale University, 300 George St, New Haven, CT 06511, United States.

Zhe Sun, Wanwan Xu, Tianxi Li, Jian Kang, Gregorio Alanis-Lobato

Progress in neuroscience has provided unprecedented opportunities to advance our understanding of brain alterations and their correspondence to phenotypic profiles. With data collected from various imaging techniques, studies have integrated different types of information, including brain structure, function, and metabolism. More recently, an emerging way to categorize imaging traits is through a metric hierarchy, including localized node-level measurements and interactive network-level metrics.

[Geographical origin authentication of Gongju at different spatial scales based on hyperspectral technology].

Zhongguo Zhong Yao Za Zhi

November 2024

National Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences Beijing 100700, China.

Xue Guo, Rui-Bin Bai, Hui Wang, Wei-Wen Li, Ling Dong

Gongju (Chrysanthemum morifolium) is one of the five major medicinal Chrysanthemum varieties included in the Chinese Pharmacopoeia. In recent years, its cultivation areas have changed significantly, resulting in mixed quality of the medicinal herbs. In this study, Gongju samples cultivated in Anhui, Yunnan, Chongqing, and other places were selected as research objects.

Distributional uniformity quantification in heterogeneous prepared dishes combined the hyperspectral imaging technology with Moran's I: A case study of pizza.

Food Chem

February 2025

Agricultural Product Processing and Storage Lab, School of Food and Biological Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China; International Joint Research Laboratory of Intelligent Agriculture and Agri-products Processing (Jiangsu University), Jiangsu Education Department, Zhenji

Peipei Gao, Wenlong Li, Sulafa B H Hashim, Jing Liang, Jialong Xu

Quality detection is critical in the development of prepared dishes, with distributional uniformity playing a significant role. This study used hyperspectral imaging (HSI) and Moran's I to quantify distributional uniformity, using pizza as a case study. Pizza ingredients' spectra were collected, pre-processed with Detrended Fluctuation Analysis (DFA), Savitzky-Golay (SG) and Standard Normal Variate (SNV), and down-scaled with Principal Component Analysis (PCA).
