sORF-encoded peptides (SEPs) are proteins of fewer than 100 amino acids encoded by small open reading frames (sORFs), and they play important roles in various life activities. Analysis of known SEPs showed that the use of non-canonical initiation codons is more common among SEPs. However, current analysis of SEP sequences relies mainly on bioinformatics prediction, and most predictions take AUG as the start site, which may not be entirely correct for SEPs. Here, chemical labeling was used to systematically analyze the N-terminal sequences of SEPs and so define their start sites accurately. By comparison, we found that dimethylation and guanidinylation are more efficient than acetylation. ACN precipitation and heating precipitation performed better in SEP enrichment. As an N-terminal peptide enrichment material, Hexadhexaldehyde was superior to CNBr-activated agarose and NHS-activated agarose. Combining these methods, we identified 128 SEPs with 131 N-terminal sequences. Two-thirds of these are novel N-terminal sequences, and most of them start between the 11th and 31st amino acids of the original sequence. Some of the novel N-termini were produced by proteolysis or signal peptide removal. For some SEPs, the translation start sites were corrected to non-AUG start codons. One novel start codon was validated using GFP-tag vectors. These results demonstrate that chemical labeling approaches are beneficial for identifying the start codons of sORFs and the true N-termini of their encoded peptides, which helps to better characterize SEPs.
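The mapping step described above, placing an experimentally identified N-terminal peptide on the annotated SEP sequence and inspecting the codon at that position, can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function names, the toy sequences, and the three-way labelling are all assumptions made for the example. The near-cognate set is simply the nine codons that differ from AUG by one nucleotide. The label is only a hint, since a non-AUG or "other" codon at the mapped position cannot by itself distinguish alternative initiation from proteolysis or signal peptide removal.

```python
from typing import Optional, Tuple

# Codons differing from ATG (AUG) by exactly one nucleotide, in DNA alphabet.
NEAR_COGNATE = frozenset(
    {"CTG", "GTG", "TTG", "ACG", "AAG", "AGG", "ATC", "ATT", "ATA"}
)


def locate_n_terminus(annotated_protein: str, n_term_peptide: str) -> Optional[int]:
    """Return the 1-based residue position where the peptide begins, or None."""
    idx = annotated_protein.find(n_term_peptide)
    return None if idx < 0 else idx + 1


def codon_at(cds: str, position: int) -> str:
    """Return the codon encoding the residue at a 1-based protein position."""
    start = 3 * (position - 1)
    return cds[start:start + 3].upper()


def classify_start(
    cds: str, annotated_protein: str, n_term_peptide: str
) -> Tuple[Optional[int], Optional[str], str]:
    """Map an observed N-terminal peptide and label the codon found there.

    Hypothetical labels: "AUG", "near-cognate", "other", or "not found".
    """
    pos = locate_n_terminus(annotated_protein, n_term_peptide)
    if pos is None:
        return None, None, "not found"
    codon = codon_at(cds, pos)
    if codon == "ATG":
        label = "AUG"
    elif codon in NEAR_COGNATE:
        label = "near-cognate"
    else:
        label = "other"
    return pos, codon, label


# Toy data, invented for illustration only.
TOY_PROTEIN = "MKTLLAVGSPQR"
TOY_CDS = "ATGAAAACCCTGCTGGCGGTGGGCAGCCCGCAGCGT"

if __name__ == "__main__":
    for peptide in ("MKT", "LLAVG", "AVGSP"):
        print(peptide, classify_start(TOY_CDS, TOY_PROTEIN, peptide))
```

In practice one would also need to handle initiator methionine removal and peptides that match more than one position, both of which this sketch ignores.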
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11602987 | PMC |
http://dx.doi.org/10.1016/j.mcpro.2024.100860 | DOI Listing |
Planta
September 2025
Department of Biology, University of Naples Federico II, Via Cinthia 26, 80126, Naples, Italy.
The first complete plastid genome of the critically endangered species Valeriana trinervis was sequenced, assembled and compared with other published Valeriana plastomes. In this study, we assembled the plastid genome of the critically endangered, endemic species Valeriana trinervis (= Centranthus trinervis) and compared it with all published plastomes of Valeriana. We found differences in the inverted repeat boundaries and in the type and abundance of repeats, but also similarities in codon usage and microsatellite numbers.
Int J Biol Macromol
September 2025
Supercomputing Facility for Bioinformatics & Computational Biology (SCFBio) & Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, 110016, India; Department of Chemistry, Indian Institute of Technology, Delhi, 110016, India.
DNA is a dynamic molecule composed of numerous genic and regulatory elements that orchestrate cellular functions. Traditional methods often fail to provide accurate functional genome annotations because they do not effectively account for sequence variability within and across different organisms. To address this, we conducted an extensive genomic physical fingerprinting of ~4.
Genomics
August 2025
State Key Lab of Aridland Crop Science, Gansu Key Lab of Crop Improvement and Germplasm Enhancement, Lanzhou, China; Department of Crop Genetics and Breeding, College of Agronomy, Gansu Agricultural University, Lanzhou, China.
Halogeton glomeratus, a halophytic species in the Amaranthaceae family, is well adapted to extreme saline-alkaline environments. To better understand its adaptive mechanisms at the genomic level, we assembled and analyzed the complete mitochondrial genome (mitogenome) of H. glomeratus.
J Chem Inf Model
September 2025
Information Materials and Intelligent Sensing Laboratory of Anhui Province, Institutes of Physical Science and Information Technology, Anhui University, Hefei, Anhui 230601, China.
Start-loss variants occur at the start codon and can disrupt the normal translation initiation process, potentially resulting in the production of abnormal protein isoforms. Although numerous computational methods have been developed to aid the large-scale interpretation of genetic variants, they often show limited predictive accuracy for start-loss variants. A significant limitation of most of these methods is their dependence on manually curated features, which restricts their ability to predict variants that have not yet been studied and characterized.
BMC Bioinformatics
August 2025
Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800, Kongens Lyngby, Denmark.
Background: Accurate identification of translation initiation sites is essential for the proper translation of mRNA into functional proteins. In eukaryotes, the choice of the translation initiation site is influenced by multiple factors, including its proximity to the 5′ end and the local start codon context. Translation initiation sites mark the transition from non-coding to coding regions.