Deciphering the proteome of Escherichia coli K-12: Integrating transcriptomics and machine learning to annotate hypothetical proteins.

Comput Struct Biotechnol J

Institute for Biological Interfaces 5 (IBG-5), Biotechnology and Microbial Genetics, Karlsruhe Institute of Technology (KIT), Hermann-von-Helmholtz-Platz 1, Eggenstein-Leopoldshafen 76344, Germany.

Published: July 2025


Article Abstract

Omics technologies have led to the discovery of a vast number of proteins that are expressed but have no functional annotation, the so-called hypothetical proteins (HPs). Even in the best-studied model organism, Escherichia coli K-12, over 2% of the proteome remains uncharacterized. This knowledge gap is even wider for microbial dark matter. However, knowing the functions of proteins is crucial for elucidating cellular and metabolic processes and for harnessing biotechnological potential. Here, we employed machine learning to decipher the transcriptional regulatory network of E. coli K-12, together with other tools, to assign functions to uncharacterized HPs. As proof of concept, we further provide experimental validation of the predicted functions for three HP-encoding genes by analyzing the growth patterns of deletion mutants compared to the wild type, as well as their transcriptional responses to specific conditions. This study demonstrates that combining big omics data with artificial intelligence and experimental controls is a powerful approach to illuminating functional dark matter.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12356324
DOI: http://dx.doi.org/10.1016/j.csbj.2025.07.036
