Methods Mol Biol
July 2025
Most sequence-based structure prediction and experimental structure determination research has focused on soluble proteins, with relatively less attention paid to transmembrane proteins. With high-quality methods now available for predicting the structures of soluble proteins, there is a need to fill the gaps in the prediction of transmembrane proteins, motivated by the fact that they are the primary targets of many drug design efforts. We provide a systematic survey of machine learning methods that predict various topological properties of transmembrane proteins, including helical and beta-barrel proteins.
Protein function prediction from sequence, structure, gene expression profiles, and published literature is needed to understand all biological processes. Natural language processing (NLP) of biological text and large language model (LLM)-based encoding of sequence and structure open powerful paths to rapid function annotation and novel training models. In this survey, we review the available models for function prediction, especially the NLP- and LLM-based models.
Recent studies have shown that the three-dimensional architecture of bacterial chromatin plays an important role in gene expression regulation. However, genome topological organization in Mycobacterium tuberculosis, the etiologic agent of tuberculosis, remains unknown. In addition, the exact mechanism of differential pathogenesis in the canonical strains H37Rv and H37Ra remains poorly understood in terms of their raw sequences.
Methods Mol Biol
July 2024
Differentially expressed genes in a cellular context may be co-regulated by the same transcription factor (TF). However, in the absence of concurrent transcription factor binding data, such interactions are difficult to detect, especially at the single-cell expression level. Motif enrichment in such genes can be used to gain insight into differential expression caused by shared upstream TFs.
Comput Biol Chem
October 2024
Spontaneous mutations are evolutionary engines as they generate variants for the evolutionary downstream processes that give rise to speciation and adaptation. Single nucleotide mutations (SNM) are the most abundant type of mutations among them. Here, we perform a meta-analysis to quantify the influence of selected global genomic parameters (genome size, genomic GC content, genomic repeat fraction, number of coding genes, gene count, and strand bias in prokaryotes) and local genomic features (local GC content, repeat content, CpG content and the number of SNM at CpG islands) on spontaneous SNM rates across the tree of life (prokaryotes, unicellular eukaryotes, multicellular eukaryotes) using wild-type sequence data in two different taxon classification systems.
Moonlighting proteins are multifunctional, single-polypeptide chains capable of performing multiple autonomous functions. Most moonlighting proteins have been discovered through work unrelated to their multifunctionality. We believe that prediction of moonlighting proteins from first principles, that is, using sequence, predicted structure, evolutionary profiles, and global gene expression profiles, for only one functional class of proteins in a single organism at a time will significantly advance our understanding of multifunctional proteins.
Whole-genome sequencing (WGS) provides a comprehensive tool to analyze bacterial genomes for genotype-phenotype correlations, single-nucleotide variant (SNV) diversity, and their evolution and transmission. Several online pipelines and standalone tools are available for WGS analysis of the Mycobacterium tuberculosis complex (MTBC). While they facilitate the processing of WGS data with minimal user expertise, they are either too general, providing little insight into bacterium-specific issues such as gene variations, INDEL/synonymous/PE-PPE (IDP family), and drug resistance from sample data, or are limited to specific objectives, such as drug resistance.
Comput Struct Biotechnol J
August 2022
Recognition of pathogen-derived nucleic acids by host cells is an effective host strategy to detect pathogenic invasion and trigger immune responses. In the context of pathogen-specific pharmacology, there is a growing interest in mapping the interactions between pathogen-derived nucleic acids and host proteins. Insight into the principles of the structural and immunological mechanisms underlying such interactions and their roles in host defense is necessary to guide therapeutic intervention.
Malignancies that develop from the mucosal epithelium of the upper aerodigestive tract are known as head and neck squamous cell carcinomas (HNSCC). Heterogeneity, late-stage diagnosis, and a high recurrence rate are major hurdles in the treatment of head and neck cancers. Presently, the biomarkers available for diagnosis and prognosis of HNSCC are based on smoking as the major risk habit.
Sequence-based prediction of DNA-binding residues in a protein is a widely studied problem for which machine learning methods with continuously improving predictive power have been developed. Concatenated rows within a sliding window of a Position Specific Substitution Matrix (PSSM) of the protein are currently used as the primary feature set in almost all methods for predicting DNA-binding residues. Here we report that these evolutionary profiles are powerful only for identifying conserved binding sites and fall short for residue positions that undergo binding-to-non-binding transitions in closely related proteins.
The highly conserved HOX homeodomain (HD) transcription factors (TFs) establish the identity of different body parts along the antero-posterior axis of bilaterian animals. Segment diversification and the morphogenesis of different structures are achieved by generating precise patterns of HOX expression along the antero-posterior axis and by the ability of different HOX TFs to instruct unique and specific transcriptional programs. However, HOX binding properties in vitro, characterised by the recognition of similar AT-rich binding sequences, do not account for the ability of different HOX TFs to instruct segment-specific transcriptional programs.
Front Med (Lausanne)
July 2021
This scoping review aims to identify the various areas and current status of the application of artificial intelligence (AI) for aiding individuals with cleft lip and/or palate. Cleft lip and/or palate contributes significantly toward the global burden on the healthcare system. Artificial intelligence is a technology that can help individuals with cleft lip and/or palate, especially those in areas with limited access to receive adequate care.
The tRNA methyltransferase 5 (Trm5) enzyme is an S-adenosyl methionine (AdoMet)-dependent methyltransferase that methylates the G37 nucleotide of the tRNA at the N1 atom. The free form of the Trm5 enzyme has three intrinsically disordered regions, which are highly flexible and lack stable three-dimensional structures. These regions, also called disorder-to-order transition (DOT) regions, gain ordered structures upon complex formation with tRNA.
J Biomol Struct Dyn
October 2022
Intrinsically disordered regions (IDRs) in proteins are characterized by their flexibilities and low complexity regions, which lack unique 3D structures in solution. IDRs play a significant role in signaling, regulation, and binding multiple partners, including DNA, RNA, and proteins. Although various experiments have shown the role of disordered regions in binding with RNA, a detailed computational analysis is required to understand their binding and recognition mechanism.
Single-cell transcriptomics data, when combined with in situ hybridization patterns of specific genes, can help in recovering the spatial information lost during cell isolation. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) consortium conducted a crowd-sourced competition known as the DREAM Single Cell Transcriptomics Challenge (SCTC) to predict the masked locations of single cells from sets of 60, 40, and 20 genes out of 84 in situ gene patterns known in the Drosophila embryo. We applied a genetic algorithm (GA) to predict the most important genes that carry positional and proximity information of the single-cell origins, in combination with the base distance mapping algorithm DistMap.
Sepsis is a systemic inflammatory disorder induced by a dysregulated immune response to infection resulting in dysfunction of multiple critical organs, including the intestines. Previous studies have reported contrasting results regarding the abilities of exosomes circulating in the blood of sepsis mice and patients to either promote or suppress inflammation. Little is known about how the gut epithelial cell-derived exosomes released in the intestinal luminal space during sepsis affect mucosal inflammation.
Inform Med Unlocked
June 2020
The COVID-19 pandemic is a serious and global public health concern. It is now well known that COVID-19 cases may result in mild symptoms leading to patient recovery. However, severity of infection, fatality rates, and treatment responses across different countries, age groups, and demographic groups suggest that the nature of infection is diverse, and a timely investigation of the same is needed for evolving sound treatment and preventive strategies.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
This study investigated the potential role of a nitrogen-fixing early coloniser, Alnus nepalensis D. Don (alder), in driving the changes in soil bacterial communities during secondary succession. We found that bacterial diversity was positively associated with alder growth during the course of ecosystem development.
Int J Biol Macromol
May 2020
Aminoacyl tRNA synthetase (AARS) plays an important role in transferring each amino acid to its cognate tRNA. Specifically, tyrosyl tRNA synthetase (TyrRS) is involved in various functions including protection from DNA damage due to oxidative stress, protein synthesis and cell signaling and can be an attractive target for controlling the pathogens by early inhibition of translation. TyrRS has two disordered regions, which lack a stable 3D structure in solution, and are involved in tRNA synthetase catalysis and stability.
Stroke causes behavioral deficits in multiple cognitive domains and there is a growing interest in predicting patient performance from neuroimaging data using machine learning techniques. Here, we investigated a deep learning approach based on convolutional neural networks (CNNs) for predicting the severity of language disorder from 3D lesion images from magnetic resonance imaging (MRI) in a heterogeneous sample of stroke patients. CNN performance was compared to that of conventional (shallow) machine learning methods, including ridge regression (RR) on the images' principal components and support vector regression.
Sequence-based DNA-binding protein (DBP) prediction is a widely studied biological problem. Sliding windows over the rows of position specific substitution matrices (PSSMs) predict DNA-binding residues well on known DBPs, but the same models cannot be applied to unequally sized protein sequences. PSSM summaries representing column averages and their amino-acid-wise versions have been effectively used for the task, but it remains unclear whether these features carry all of the PSSM's predictive power, traditionally harnessed for binding site predictions.