Comput Struct Biotechnol J
July 2025
Omics technologies have led to the discovery of a vast number of proteins that are expressed but have no functional annotation - so called hypothetical proteins (HPs). Even in the best-studied model organism K-12, over 2 % of the proteome remains uncharacterized. This knowledge gap becomes even worse when looking at microbial dark matter.
View Article and Find Full Text PDFJ Mol Evol
October 2023
Protein-coding DNA sequences can be translated into completely different amino acid sequences if the nucleotide triplets used are shifted by a non-triplet amount on the same DNA strand or by translating codons from the opposite strand. Such "alternative reading frames" of protein-coding genes are a major contributor to the evolution of novel protein products. Recent studies demonstrating this include examples across the three domains of cellular life and in viruses.
View Article and Find Full Text PDFEukaryotic genomes are pervasively translated, but the properties of translated sequences outside of canonical genes are poorly understood. A new study in Cell Systems reveals a large translatome that is not under significant evolutionary constraint but is still an active part of diverse cellular systems.
View Article and Find Full Text PDFFEMS Microbiol Rev
January 2023
Annotating protein sequences according to their biological functions is one of the key steps in understanding microbial diversity, metabolic potentials, and evolutionary histories. However, even in the best-studied prokaryotic genomes, not all proteins can be characterized by classical in vivo, in vitro, and/or in silico methods-a challenge rapidly growing alongside the advent of next-generation sequencing technologies and their enormous extension of 'omics' data in public databases. These so-called hypothetical proteins (HPs) represent a huge knowledge gap and hidden potential for biotechnological applications.
View Article and Find Full Text PDFThe existence of overlapping genes (OLGs) with significant coding overlaps revolutionizes our understanding of genomic complexity. We report two exceptionally long (957 nt and 1536 nt), evolutionarily novel, translated antisense open reading frames (ORFs) embedded within annotated genes in the pathogenic Gram-negative bacterium . Both OLG pairs show sequence features consistent with being genes and transcriptional signals in RNA sequencing.
View Article and Find Full Text PDFBackground: Overlapping genes (OLGs) with long protein-coding overlapping sequences are disallowed by standard genome annotation programs, outside of viruses. Recently however they have been discovered in Archaea, diverse Bacteria, and Mammals. The biological factors underlying life's ability to create overlapping genes require more study, and may have important applications in understanding evolution and in biotechnology.
View Article and Find Full Text PDFAt least six small alternative-frame open reading frames (ORFs) overlapping well-characterized SARS-CoV-2 genes have been hypothesized to encode accessory proteins. Researchers have used different names for the same ORF or the same name for different ORFs, resulting in erroneous homological and functional inferences. We propose standard names for these ORFs and their shorter isoforms, developed in consultation with the Coronaviridae Study Group of the International Committee on Taxonomy of Viruses.
View Article and Find Full Text PDFMany prokaryotic RNAs are transcribed from loci outside of annotated protein coding genes. Across bacterial species hundreds of short open reading frames antisense to annotated genes show evidence of both transcription and translation, for instance in ribosome profiling data. Determining the functional fraction of these protein products awaits further research, including insights from studies of molecular interactions and detailed evolutionary analysis.
View Article and Find Full Text PDFRibosome profiling (RIBO-Seq) has improved our understanding of bacterial translation, including finding many unannotated genes. However, protocols for RIBO-Seq and corresponding data analysis are not yet standardized. Here, we analyzed 48 RIBO-Seq samples from nine studies of K12 grown in lysogeny broth medium and particularly focused on the size-selection step.
View Article and Find Full Text PDFAntisense transcription is well known in bacteria. However, translation of antisense RNAs is typically not considered, as the implied overlapping coding at a DNA locus is assumed to be highly improbable. Therefore, such overlapping genes are systematically excluded in prokaryotic genome annotation.
View Article and Find Full Text PDFMol Biol Evol
August 2020
Biosystems
November 2019
The genetic code and its evolution have been studied by many different approaches. One approach is to compare the properties of the standard genetic code (SGC) to theoretical alternative codes in order to determine how optimal it is and from this infer whether or not it is likely that it has undergone a selective evolutionary process. Many different properties have been studied in this way in the literature.
View Article and Find Full Text PDFOnly a few overlapping gene pairs are known in the best-analyzed bacterial model organism Escherichia coli. Automatic annotation programs usually annotate only one out of six reading frames at a locus, allowing only small overlaps between protein-coding sequences. However, both RNAseq and RIBOseq show signals corresponding to non-trivially overlapping reading frames in antisense to annotated genes, which may constitute protein-coding genes.
View Article and Find Full Text PDFIn the past, short protein-coding genes were often disregarded by genome annotation pipelines. Transcriptome sequencing (RNAseq) signals outside of annotated genes have usually been interpreted to indicate either ncRNA or pervasive transcription. Therefore, in addition to the transcriptome, the translatome (RIBOseq) of the enteric pathogen Escherichia coli O157:H7 strain Sakai was determined at two optimal growth conditions and a severe stress condition combining low temperature and high osmotic pressure.
View Article and Find Full Text PDF