Article Abstract

Ribosome profiling (Ribo-seq) has proven transformative for our understanding of the human genome and proteome by illuminating thousands of non-canonical sites of ribosome translation outside of the currently annotated coding sequences (CDSs). A conservative estimate suggests that at least 7,000 non-canonical open reading frames (ORFs) are translated, which, at first glance, has the potential to expand the number of human protein-coding sequences by 30%, from ∼19,500 annotated CDSs to over 26,000. Yet, additional scrutiny of these ORFs has raised numerous questions about what fraction of them truly produce a protein product and what fraction of those can be understood as proteins according to the conventional understanding of the term. A further complication is that published estimates of the number of non-canonical ORFs vary widely, by around 30-fold, from several thousand to several hundred thousand. Together, this research has left the genomics and proteomics communities both excited by the prospect of new coding regions in the human genome and searching for guidance on how to proceed. Here, we discuss the current state of non-canonical ORF research, databases, and interpretation, focusing on how to assess whether a given ORF can be said to be "protein-coding".

In Brief: The human genome encodes thousands of non-canonical open reading frames (ORFs) in addition to protein-coding genes. Because the field is nascent, many questions about non-canonical ORFs remain. How many exist? Do they encode proteins? What level of evidence is needed for their verification? Central to these debates has been the advent of ribosome profiling (Ribo-seq) as a method to discern genome-wide ribosome occupancy, and of immunopeptidomics as a method to detect peptides that are processed and presented by MHC molecules and not observed in traditional proteomics experiments. This article provides a synthesis of the current state of non-canonical ORF research and proposes standards for their future investigation and reporting.

Highlights: Combined use of Ribo-seq and proteomics-based methods enables optimal confidence in detecting non-canonical ORFs and their protein products. Ribo-seq can provide more sensitive detection of non-canonical ORFs, but data quality and analytical pipelines will impact results. Non-canonical ORF catalogs are diverse and span both high-stringency and low-stringency ORF nominations. A framework for standardized non-canonical ORF evidence will advance the research field.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10245706
DOI: http://dx.doi.org/10.1101/2023.05.16.541049

Similar Publications

Mitosis is a critical phase of the cell cycle and a vulnerable point where cancer cells can be disrupted, causing cell death and inhibiting tumor growth. Challenges such as drug resistance persist in clinical applications. During mitosis, mRNA translation is generally downregulated, while non-canonical translation of specific transcripts continues.

Characterization of an unusual carlavirus-like RNA from papaya (Carica papaya) lacking essential genes.

PLoS One

August 2025

Facultad de Ciencias de la Vida, Escuela Superior Politécnica del Litoral, ESPOL, Km 30.5 Vía Perimetral, Campus Gustavo Galindo, Guayaquil, Ecuador.

We report the characterization of a novel carlavirus-like RNA, provisionally named papaya defective virus 1 (PapDfV1), identified through high-throughput sequencing of papaya latex RNA. PapDfV1 contains two open reading frames (ORFs): ORF 1 encodes a 211.1 kDa replicase with 96% sequence identity to Zhejiang betaflexivirus 2 (ZhBV2), while ORF 2 exhibits a chimeric structure with regions homologous to two distinct ORFs of ZhBV2.

Non-canonical (i.e., unannotated) open reading frames (ncORFs) have until recently been omitted from reference genome annotations, despite evidence of their translation, limiting their incorporation into biomedical research.

While protein synthesis typically initiates at an optimal AUG start codon, the 5' untranslated region (5'UTR) of mRNAs harbors non-canonical start codons that result in the translation of upstream open reading frames (uORFs). However, the mechanisms underlying the selection of non-canonical start codons remain poorly understood. Structural analysis of translation pre-initiation complexes showed that the 2'OH group of the first nucleotide within start codons is monitored by 18S rRNA, allowing optimal translation initiation.

Suboptimal dengue genome leverages non-canonical translation mechanisms.

iScience

May 2025

Division of Immunology and Infectious Disease Biology, INtegrative GENomics of HOst-PathogEn (INGEN-HOPE) Laboratory, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Mall Road, Delhi 110007, India.

Dengue virus is a notable example of a vector-borne RNA virus responsible for severe hemorrhagic fever. Its compact genome necessitates reliance on the host's translational machinery for replication. This study investigates the plausible adaptive strategies employed by dengue serotypes for effective translation within the human host.
