Article Abstract

Small proteins (≤100 amino acids) play important roles across all life forms, from unicellular bacteria to higher organisms. In this study, we developed SProtFP, a machine learning-based method for functional annotation of prokaryotic small proteins into selected functional categories. SProtFP uses independent artificial neural networks (ANNs), trained on a combination of physicochemical descriptors, to classify small proteins as antitoxin type 2, bacteriocin, DNA-binding, metal-binding, ribosomal protein, RNA-binding, type 1 toxin or type 2 toxin proteins. We also trained a model for identifying small open reading frame (smORF)-encoded antimicrobial peptides (AMPs). Comprehensive benchmarking of SProtFP revealed an average area under the receiver operating characteristic curve (ROC-AUC) of 0.92 during 10-fold cross-validation, and ROC-AUCs of 0.94 and 0.93 on held-out balanced and imbalanced test sets, respectively. Using our method to annotate bacterial isolates from the human gut microbiome, we identified thousands of remote homologs of known small protein families and assigned putative functions to uncharacterized proteins. This highlights the utility of SProtFP for large-scale functional annotation of microbiome datasets, especially where sequence homology is low. SProtFP is freely available at http://www.nii.ac.in/sprotfp.html and can be combined with genome annotation tools such as ProsmORF-pred to uncover the functional repertoire of novel small proteins in bacteria.
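To make the terms in the abstract concrete, here is a minimal Python sketch of three ideas it mentions: the ≤100 amino acid size cutoff, a physicochemical descriptor vector, and the ROC-AUC metric. It is not the SProtFP implementation. The abstract does not list the descriptors used, so the features chosen here (length, mean Kyte-Doolittle hydropathy, fractions of positively and negatively charged residues) and the function names `is_small_protein`, `descriptors` and `roc_auc` are illustrative assumptions.

```python
# Illustrative sketch only; NOT the SProtFP feature set or code.

MAX_SMALL_PROTEIN_LEN = 100  # size cutoff stated in the abstract

# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def is_small_protein(seq):
    """True if the sequence is non-empty and at most 100 residues long."""
    return 0 < len(seq) <= MAX_SMALL_PROTEIN_LEN


def descriptors(seq):
    """Hypothetical descriptor vector:
    [length, mean hydropathy, fraction K/R, fraction D/E].
    Residues outside the 20 standard amino acids contribute 0 hydropathy."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    mean_hydropathy = sum(KD.get(aa, 0.0) for aa in seq) / n
    frac_positive = sum(seq.count(aa) for aa in "KR") / n
    frac_negative = sum(seq.count(aa) for aa in "DE") / n
    return [n, mean_hydropathy, frac_positive, frac_negative]


def roc_auc(labels, scores):
    """ROC-AUC computed as the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

In a pipeline of the kind the abstract describes, vectors like these would be passed to one binary classifier per functional category, and `roc_auc` would be evaluated on each cross-validation fold and on the held-out test sets.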

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11704790
DOI: http://dx.doi.org/10.1093/nargab/lqae186
