Synthetic data-augmented machine learning approaches for tailor-made microbial conversion of methane to phytoene.

Bioresour Technol

School of Environmental Engineering, University of Seoul, 163 Seoulsiripdae-ro, Dongdaemun-gu, Seoul 02504, Republic of Korea.

Published: December 2025


Article Abstract

Metabolic engineering has become a critical tool for biosynthesizing valuable compounds, yet its progress is frequently constrained by labor-intensive, trial-and-error methods. Here, a machine learning (ML)-assisted predictive framework, enhanced with a synthetic data generation method, was developed to systematically optimize the metabolic pathway for phytoene biosynthesis from methane in the non-model methanotroph Methylocystis sp. MJC1. To balance metabolic flux and maximize phytoene biosynthesis, three key genes (dxs, crtE, and crtB) involved in the methylerythritol 4-phosphate (MEP) and carotenoid pathways were targeted for modulation. These genes were expressed under promoters of systematically varied strengths, creating a diverse experimental dataset that was used to train ML models. ML algorithms, including deep neural networks (DNN) and support vector machines (SVM), predicted the promoter-gene combinations expected to maximize phytoene production. To overcome the data limitations inherent in working with non-model organisms, conditional tabular generative adversarial networks (CTGAN) were used to generate synthetic data that improved DNN prediction accuracy. Experimental validation confirmed that the ML-guided engineered strain exhibited a 2.2-fold improvement in phytoene production and a 1.5-fold increase in phytoene content compared to the base strain, demonstrating successful pathway optimization. This study shows that integrating ML-driven predictive frameworks with metabolic engineering enables rapid, efficient, and precise optimization of microbial bioconversion processes that use methane as a sustainable feedstock.
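The design loop described in the abstract (build a subset of promoter-gene combinations, fit a model, then rank the untested combinations) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the promoter strength values, the `toy_titer` function standing in for wet-lab measurements, and the simple kernel-regression surrogate used in place of the DNN and SVM models are all assumptions made for the example. The CTGAN augmentation step is not shown.

```python
from itertools import product
import math

GENES = ("dxs", "crtE", "crtB")
# Hypothetical relative promoter strengths (weak, medium, strong); not from the paper.
STRENGTHS = (0.2, 0.6, 1.0)


def toy_titer(x):
    """Placeholder for a measured phytoene titer; invented purely for illustration."""
    dxs, crte, crtb = x
    return 10.0 - 8.0 * (dxs - 0.6) ** 2 - 4.0 * (crte - 1.0) ** 2 - 4.0 * (crtb - 1.0) ** 2


def kernel_predict(x, train, bandwidth=0.3):
    """Nadaraya-Watson regression: Gaussian-weighted mean of the training titers."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * bandwidth ** 2))
        for xi, _ in train
    ]
    return sum(w * y for w, (_, y) in zip(weights, train)) / sum(weights)


# Full combinatorial design space: 3 strength levels for each of 3 genes = 27 designs.
level_combos = list(product(range(len(STRENGTHS)), repeat=len(GENES)))
design_space = [tuple(STRENGTHS[i] for i in combo) for combo in level_combos]

# Pretend only a balanced subset of 9 designs was built and measured.
train = [
    (x, toy_titer(x))
    for combo, x in zip(level_combos, design_space)
    if sum(combo) % 3 == 0
]

# Rank the untested designs by predicted titer and pick the top candidate.
measured = {x for x, _ in train}
candidates = [x for x in design_space if x not in measured]
best = max(candidates, key=lambda x: kernel_predict(x, train))

print(dict(zip(GENES, best)), round(kernel_predict(best, train), 2))
```

In a real workflow the surrogate would be replaced by the trained DNN or SVM, and the top-ranked designs would be built and measured to validate the prediction.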

Source
http://dx.doi.org/10.1016/j.biortech.2025.133160
