Machine learning in RNA structure prediction: Advances and challenges.

Biophys J

Department of Physics and Institute of Data Science and Informatics, University of Missouri, Columbia, Missouri; Department of Biochemistry, University of Missouri, Columbia, Missouri.

Published: September 2024


Article Abstract

RNA molecules play a crucial role in various biological processes, with their functionality closely tied to their structures. The remarkable advancements in machine learning techniques for protein structure prediction have shown promise in the field of RNA structure prediction. In this perspective, we discuss the advances and challenges encountered in constructing machine learning-based models for RNA structure prediction. We explore topics including model building strategies, specific challenges involved in predicting RNA secondary (2D) and tertiary (3D) structures, and approaches to these challenges. In addition, we highlight the advantages and challenges of constructing RNA language models. Given the rapid advances of machine learning techniques, we anticipate that machine learning-based models will serve as important tools for predicting RNA structures, thereby enriching our understanding of RNA structures and their corresponding functions.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11393687
DOI: http://dx.doi.org/10.1016/j.bpj.2024.01.026

Similar Publications

Systematic analyses uncover plasma proteins linked to incident cardiovascular diseases.

Protein Cell

August 2025

Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China.

Cardiovascular disease (CVD) research is hindered by limited comprehensive analyses of plasma proteome across disease subtypes. Here, we systematically investigated the associations between plasma proteins and cardiovascular outcomes in 53,026 UK Biobank participants over a 14-year follow-up. Association analyses identified 3,089 significant associations involving 892 unique protein analytes across 13 CVD outcomes.

Maximizing theoretical and practical storage capacity in single-layer feedforward neural networks.

Front Comput Neurosci

August 2025

Department of Biomedical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, United States.

Artificial neural networks are limited in the number of patterns that they can store and accurately recall, with capacity constraints arising from factors such as network size, architectural structure, pattern sparsity, and pattern dissimilarity. Exceeding these limits leads to recall errors, eventually leading to catastrophic forgetting, which is a major challenge in continual learning. In this study, we characterize the theoretical maximum memory capacity of single-layer feedforward networks as a function of these parameters.

In this contribution, Molecular Electron Density Theory (MEDT) is employed to investigate the (3 + 2) cycloaddition reaction between ()--methyl--(2-furyl)-nitrone 1 and but-2-ynedioic acid 2. DFT calculations at the M06-2X-D3/6-311+G(d,p) level of theory under solvent-free conditions at room temperature show that this reaction proceeds with CA3-Z diastereoselectivity, with the formation of the CA3-Z cycloadduct being both thermodynamically and kinetically more favoured than the CA4-Z one. Reactivity parameters obtained from CDFT calculations reveal that compound 1 predominantly behaves as a nucleophile with moderate electrophilic features, in contrast to compound 2, which demonstrates strong electrophilicity and limited nucleophilic ability.

Structure and function of the topsoil microbiome in Chinese terrestrial ecosystems.

Front Microbiol

August 2025

State Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences, Lanzhou, China.

While soil microorganisms underpin terrestrial ecosystem functioning, how their functional potential adapts across environmental gradients remains poorly understood, particularly for ubiquitous taxa. Employing a comprehensive metagenomic approach across China's six major terrestrial ecosystems (41 topsoil samples, 0-20 cm depth), we reveal a counterintuitive pattern: oligotrophic environments (deserts, karst) harbor microbiomes with significantly greater metabolic pathway diversity (KEGG) compared to resource-rich ecosystems. We provide a systematic catalog of key functional genes governing biogeochemical cycles in these soils, identifying: 6 core CAZyme genes essential for soil organic carbon (SOC) decomposition and biosynthesis; 62 nitrogen (N)-cycling genes (KOs) across seven critical enzymatic clusters; 15 sulfur (S)-cycling genes (KOs) within three key enzymatic clusters.

Gene mutation estimations via mutual information and Ewens sampling based CNN & machine learning algorithms.

J Appl Stat

February 2025

Department of Mathematics and State Key Laboratory of Novel Software Technology, Nanjing University, Nanjing, People's Republic of China.

We estimate gene mutation rates by developing convolutional neural network (CNN) and machine learning algorithms based on mutual information and Ewens sampling. More precisely, we develop a systematic methodology by constructing a CNN. We also develop two machine learning algorithms to study protein production with target gene sequences and protein structures.
