Feature selection with vector-symbolic architectures: a case study on microbial profiles of shotgun metagenomic samples of colorectal cancer.

Brief Bioinform

Center for Computational Life Sciences, Lerner Research Institute, Cleveland Clinic, 9500 Euclid Avenue, Cleveland, OH 44195, United States.

Published: March 2025


Article Abstract

The steadily decreasing cost of next-generation sequencing has led to a significant increase in the number of microbiome-related studies, providing invaluable information for understanding host-microbiome interactions and their relation to disease. A common approach in metagenomics is to determine the composition of samples in terms of the types and abundances of the microbial species that populate them, and then to apply feature selection techniques to identify microbes whose profiles differentiate samples under different conditions. Here, we propose a novel backward variable selection method based on the hyperdimensional computing (HDC) paradigm, which takes inspiration from how the human brain classifies concepts and encodes features as vectors in a high-dimensional space. We validated our method on public metagenomic samples collected from patients affected by colorectal cancer in a case/control scenario and compared it with other state-of-the-art feature selection methods, obtaining promising results.

Author Summary: Characterizing the microbial composition of metagenomic samples is crucial for identifying potential biomarkers that can distinguish between healthy and diseased states. However, the high dimensionality and complexity of metagenomic data make accurate feature selection challenging. Our backward variable selection method, based on the HDC paradigm, offers a promising approach to overcoming these challenges. By reducing the feature space while preserving essential information, the method enhances the ability to detect critical microbial signatures associated with diseases such as colorectal cancer, which could lead to more precise diagnostic tools.
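The abstract does not spell out how the HDC-based backward selection works, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes bipolar hypervectors, a record-style encoding (each feature ID vector is bound to a quantized-value vector and the results are summed), class prototypes compared by dot product, and a greedy stopping rule. The dimensionality, the number of levels, the toy data, and all function names (`random_hv`, `encode`, `accuracy`, `backward_select`, `make_toy_data`, `run_demo`) are hypothetical choices made for illustration.

```python
import random

D = 512      # hypervector dimensionality (assumed; HDC work often uses ~10,000)
LEVELS = 4   # assumed number of quantization levels for values in [0, 1]


def random_hv(rng):
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def encode(sample, features, id_hvs, level_hvs):
    """Bind each feature ID to its value level, then bundle (sum) the results."""
    acc = [0] * D
    for f in features:
        level = min(int(sample[f] * LEVELS), LEVELS - 1)
        id_hv, lv_hv = id_hvs[f], level_hvs[level]
        for i in range(D):
            acc[i] += id_hv[i] * lv_hv[i]
    return acc


def accuracy(features, train, test, id_hvs, level_hvs):
    """Build one prototype per class from `train`; return accuracy on `test`."""
    protos = {}
    for x, y in train:
        hv = encode(x, features, id_hvs, level_hvs)
        proto = protos.setdefault(y, [0] * D)
        for i in range(D):
            proto[i] += hv[i]
    hits = 0
    for x, y in test:
        hv = encode(x, features, id_hvs, level_hvs)
        # Predict the class whose prototype has the largest dot product.
        pred = max(protos, key=lambda c: sum(a * b for a, b in zip(hv, protos[c])))
        hits += pred == y
    return hits / len(test)


def backward_select(n_features, train, test, id_hvs, level_hvs, min_features=1):
    """Greedily drop the feature whose removal keeps accuracy highest."""
    selected = list(range(n_features))
    best = accuracy(selected, train, test, id_hvs, level_hvs)
    while len(selected) > min_features:
        # Score every subset obtained by dropping exactly one feature.
        scored = [
            (accuracy([g for g in selected if g != f], train, test, id_hvs, level_hvs), f)
            for f in selected
        ]
        top_acc, drop = max(scored, key=lambda t: t[0])
        if top_acc < best:
            break  # every possible removal lowers accuracy, so stop
        selected.remove(drop)
        best = top_acc
    return selected, best


def make_toy_data(rng, n_per_class=20, n_features=5):
    """Synthetic case/control data: only feature 0 carries class signal."""
    data = []
    for label in ("control", "case"):
        for _ in range(n_per_class):
            x = [rng.random() for _ in range(n_features)]
            x[0] = rng.uniform(0.8, 1.0) if label == "case" else rng.uniform(0.0, 0.2)
            data.append((x, label))
    rng.shuffle(data)
    return data


def run_demo(seed=0):
    rng = random.Random(seed)
    n_features = 5
    id_hvs = [random_hv(rng) for _ in range(n_features)]
    level_hvs = [random_hv(rng) for _ in range(LEVELS)]
    data = make_toy_data(rng, n_features=n_features)
    train, test = data[:30], data[30:]
    return backward_select(n_features, train, test, id_hvs, level_hvs)


selected, acc = run_demo()
print("retained features:", selected, "accuracy:", acc)
```

On this toy data one would expect the informative feature to tend to survive elimination, although that is not guaranteed for every seed. The sketch scores candidate subsets on the same held-out split it reports, which is optimistic; a real analysis would more plausibly use cross-validation and actual species abundance profiles.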

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12018301
DOI: http://dx.doi.org/10.1093/bib/bbaf177
