Both machine learning and mechanistic modelling approaches have been used independently with great success in systems biology. Machine learning excels in deriving statistical relationships and quantitative prediction from data, while mechanistic modelling is a powerful approach to capture knowledge and infer causal mechanisms underpinning biological phenomena. Importantly, the strengths of one are the weaknesses of the other, which suggests that substantial gains can be made by combining machine learning with mechanistic modelling, a field referred to as Scientific Machine Learning (SciML).
Rice tillering is an important agronomic trait regulated by plant genetic and environmental factors. However, the role and mechanism of the root microbiota in modulating rice tillering have not been explored. Here, we examined the root microbiota composition and tiller numbers of 182 genome-sequenced rice varieties grown under field conditions and uncovered a significant correlation between root microbiota composition and rice tiller number.
Penalized factorial regression offers a computationally attractive alternative to kernel and deep learning methods for prediction of genotype by environment interactions. For two representative data sets on wheat and maize, prediction accuracies were comparable, while computing requirements and time were clearly lower. A longstanding challenge in plant breeding and genetics is the prediction of yield for new environments in the presence of genotype by environment interaction ( ).
Sci Total Environ
March 2025
Certain tree species can reach ages of centuries, whereas the lifespans of species like apple are markedly shorter. The latter is caused by negative plant-soil feedback that results in microbiome changes. We hypothesized that tree species with a long lifespan will be able to avoid such negative feedback and their root-associated microbiomes will be similar in trees of different ages.
BMC Bioinformatics
February 2025
Background: Protein large language models (LLMs) have been used to extract representations of enzyme sequences to predict their function, which is encoded by enzyme commission (EC) numbers. However, a comprehensive comparison of different LLMs for this task is still lacking, leaving questions about their relative performance. Moreover, protein sequence alignments (e.
Nucleic Acids Res
November 2024
Many plant transcription factors (TFs) are multifunctional and regulate growth and development in more than one tissue. These TFs can generally associate with different protein partners depending on the tissue type, thereby regulating tissue-specific target gene sets. However, how interaction specificity is ensured is still largely unclear.
Comput Struct Biotechnol J
December 2024
Protein engineering increasingly relies on machine learning models to computationally pre-screen promising novel candidates. Although machine learning approaches have proven effective, their performance on prospective screening data leaves room for improvement; prediction accuracy can vary greatly from one protein variant to the next. So far, it is unclear what characterizes variants that are associated with large prediction error.
Genes of the family PHOSPHATIDYLETHANOLAMINE-BINDING PROTEINS (PEBP) have been intensely studied in plants for their role in cell (re)programming and meristem differentiation. Recently, sporadic reports of the presence of a new type of PEBP in plants became available, highly similar to the YY-PEBPs of prokaryotes. A comprehensive investigation of their spread, origin, and function revealed conservation across the plant kingdom.
Genome Biol
August 2024
Background: Polyploidy is widely recognized as a significant evolutionary force in the plant kingdom, contributing to the diversification of plants. One of the notable features of allopolyploidy is the occurrence of homoeologous exchange (HE) events between the subgenomes, causing changes in genomic composition, gene expression, and phenotypic variations. However, the role of HE in plant adaptation and domestication remains unclear.
Increasing natural resistance and resilience in plants is key for ensuring food security within a changing climate. Breeders improve these traits by crossing cultivars with their wild relatives and introgressing specific alleles through meiotic recombination. However, some genomic regions are devoid of recombination especially in crosses between divergent genomes, limiting the combinations of desirable alleles.
Salinity stress constrains lateral root (LR) growth and severely affects plant growth. Auxin signaling regulates LR formation, but the molecular mechanism by which salinity affects root auxin signaling and whether salt induces other pathways that regulate LR development remain unknown. In Arabidopsis thaliana, the auxin-regulated transcription factor LATERAL ORGAN BOUNDARY DOMAIN 16 (LBD16) is an essential player in LR development under control conditions.
The Arabidopsis thaliana transcription factor BRANCHED1 (BRC1) plays a pivotal role in the control of shoot branching as it integrates environmental and endogenous signals that influence axillary bud growth. Despite its remarkable activity as a growth inhibitor, the mechanisms by which BRC1 promotes bud dormancy are largely unknown. We determined the genome-wide BRC1 binding sites in vivo and combined these with transcriptomic data and gene co-expression analyses to identify bona fide BRC1 direct targets.
Phytoplasmas are pathogenic bacteria that reprogram plant host development for their own benefit. Previous studies have characterized a few different phytoplasma effector proteins that destabilize specific plant transcription factors. However, these are only a small fraction of the potential effectors used by phytoplasmas; therefore, the molecular mechanisms through which phytoplasmas modulate their hosts require further investigation.
Mol Biol Evol
September 2023
Polyploidy is recurrent across the tree of life and known as an evolutionary driving force in plant diversification and crop domestication. How polyploid plants adapt to various habitats has been a fundamental question that remained largely unanswered. Brassica napus is a major crop cultivated worldwide, resulting from allopolyploidy between unknown accessions of diploid B.
Many studies have demonstrated the utility of machine learning (ML) methods for genomic prediction (GP) of various plant traits, but a clear rationale for choosing ML over conventionally used, often simpler parametric methods, is still lacking. Predictive performance of GP models might depend on a plethora of factors including sample size, number of markers, population structure and genetic architecture. Here, we investigate which problem and dataset characteristics are related to good performance of ML methods for genomic prediction.
Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input.
A highly specialized function for individual LTPs for different products from the same terpenoid biosynthesis pathway is described and the function of an LTP GPI anchor is studied. Sesquiterpenes produced in glandular trichomes of the medicinal plant Tanacetum parthenium (feverfew) accumulate in the subcuticular extracellular space. Transport of these compounds over the plasma membrane is presumably mediated by specialized membrane transporters, but it is still not clear how these hydrophobic compounds are subsequently transported over the hydrophilic cell wall.
Phytochemistry
November 2022
Plant monoterpenes are challenging compounds, since they often act as solvents, and thus have both phytotoxic and antimicrobial properties. In this study an approach is developed to identify and characterize enzymes that can detoxify monoterpenoids, and thus would protect both plants and microbial production systems from these compounds. Plants respond to the presence of monoterpenes by expressing glycosyltransferases (UGTs), which conjugate the monoterpenoids into glycosides.
Strigolactones (SLs) are rhizosphere signalling molecules and phytohormones. The biosynthetic pathway of SLs in tomato has been partially elucidated, but the structural diversity in tomato SLs predicts that additional biosynthetic steps are required. Here, root RNA-seq data and co-expression analysis were used for SL biosynthetic gene discovery.
Meiotic recombination is a biological process of key importance in breeding, to generate genetic diversity and develop novel or agronomically relevant haplotypes. In crop tomato, recombination is curtailed as manifested by linkage disequilibrium decay over a longer distance and reduced diversity compared with wild relatives. Here, we compared domesticated and wild populations of tomato and found an overall conserved recombination landscape, with local changes in effective recombination rate in specific genomic regions.
Sesquiterpene synthases (STSs) catalyze the formation of a large class of plant volatiles called sesquiterpenes. While thousands of putative STS sequences from diverse plant species are available, only a small number of them have been functionally characterized. Sequence identity-based screening for desired enzymes, often used in biotechnological applications, is difficult to apply here as STS sequence similarity is strongly affected by species.
Cucumis melo (melon or muskmelon) is an important crop in the family of the Cucurbitaceae. Melon is cross-pollinated and was domesticated at several locations throughout its breeding history, resulting in a highly diverse genetic structure in the germplasm. Yet, the relations among the groups and cultivars are still incompletely understood.
Prediction of growth-related complex traits is highly important for crop breeding. Photosynthesis efficiency and biomass are direct indicators of overall plant performance and therefore even minor improvements in these traits can result in significant breeding gains. Crop breeding for complex traits has been revolutionized by technological developments in genomics and phenomics.
Bioinformatics
December 2020
Motivation: As the number of experimentally solved protein structures rises, it becomes increasingly appealing to use structural information for predictive tasks involving proteins. Due to the large variation in protein sizes, folds and topologies, an attractive approach is to embed protein structures into fixed-length vectors, which can be used in machine learning algorithms aimed at predicting and understanding functional and physical properties. Many existing embedding approaches are alignment based, which is both time-consuming and ineffective for distantly related proteins.