Citations: 20

Article Abstract

The attachment of unique molecular identifiers (UMIs) to RNA molecules prior to PCR amplification and sequencing makes it possible to amplify libraries to a level sufficient to identify rare molecules, whilst simultaneously eliminating PCR bias through the identification of duplicated reads. Accurate de-duplication depends upon a pool of UMIs complex enough to allow unique labelling. In applications dealing with complex libraries, such as total RNA-seq, only a limited variety of UMIs is required, as the variation in the molecules to be sequenced is enormous. However, when sequencing a less complex library, such as small RNAs, for which there is a more limited range of possible sequences, we find that increased variation in UMIs is required, even beyond that provided in a commercial kit specifically designed for the preparation of small RNA libraries for sequencing. We show that a pool of UMIs randomly varying across eight nucleotides is not sufficiently diverse to uniquely tag the microRNAs to be sequenced. This results in over-de-duplication of reads and a marked under-estimation of the expression of the more abundant microRNAs. Whilst still arguing for the utility of UMIs, this work demonstrates the importance of their considered design to avoid errors in the estimation of gene expression in libraries derived from select regions of the transcriptome or small genomes.
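
The arithmetic behind the eight-nucleotide limit can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the authors' analysis: it assumes every UMI is drawn uniformly at random, that all molecules of a given microRNA are otherwise identical, that de-duplication collapses exact UMI matches, and that there are no sequencing errors. The function names and the molecule counts are hypothetical. A fully random UMI of length k has 4^k possible sequences (65,536 for k = 8), and when n identical molecules each receive one UMI from a pool of size N, the expected number of distinct UMIs observed is N(1 - (1 - 1/N)^n).

```python
def umi_pool_size(length: int) -> int:
    """Number of distinct UMIs for a fully random sequence of `length` nucleotides."""
    return 4 ** length


def expected_distinct_umis(n_molecules: int, pool_size: int) -> float:
    """Expected count of distinct UMIs when each molecule draws one UMI
    uniformly at random (with replacement) from a pool of `pool_size`."""
    return pool_size * (1.0 - (1.0 - 1.0 / pool_size) ** n_molecules)


def expected_fraction_retained(n_molecules: int, pool_size: int) -> float:
    """Expected fraction of true molecules that survive exact-match
    de-duplication; values well below 1 indicate over-de-duplication."""
    if n_molecules == 0:
        return 1.0
    return expected_distinct_umis(n_molecules, pool_size) / n_molecules


if __name__ == "__main__":
    pool = umi_pool_size(8)  # 4**8 = 65536 possible UMIs
    # Hypothetical copy numbers for a single microRNA species.
    for n in (100, 10_000, 100_000, 1_000_000):
        frac = expected_fraction_retained(n, pool)
        print(f"{n:>9} molecules: about {frac:.1%} counted after de-duplication")
```

Under these assumptions the de-duplicated count can never exceed the pool size, so a microRNA present at far more than 65,536 copies is necessarily under-counted, whereas a rare one is barely affected. That asymmetry is consistent with the under-estimation of the more abundant microRNAs described in the abstract.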

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7471316
DOI: http://dx.doi.org/10.1038/s41598-020-71323-0

Publication Analysis

Top Keywords

identifiers umis: 8
small rna: 8
pool umis: 8
umis required: 8
umis: 7
insufficiently complex: 4
complex unique-molecular: 4
unique-molecular identifiers: 4
umis distort: 4
small: 4

Similar Publications

Background: Non-invasive fetal HPA typing is a valuable tool for identifying pregnancies at risk of fetal and neonatal alloimmune thrombocytopenia (FNAIT). Different approaches have been developed, mainly based on real-time PCR and droplet digital PCR. These methods have a limited ability to multiplex and require replicates due to the contamination risk.

Long-read single-cell RNA sequencing using platforms such as Oxford Nanopore Technologies (ONT) enables full-length transcriptome profiling at single-cell resolution. However, high sequencing error rates, diverse library architectures, and increasing dataset scale introduce major challenges for accurately identifying cell barcodes (CBCs) and unique molecular identifiers (UMIs) - key prerequisites for reliable demultiplexing and deduplication, respectively. Existing pipelines rely on hard-coded heuristics or local transition rules that cannot fully capture this broader structural context and often fail to robustly interpret reads with indel-induced shifts, truncated segments, or non-canonical element ordering.

The spatial heterogeneity of gene expression has driven the development of diverse spatial transcriptomics technologies. Here, we present photocleavage and ligation sequencing (PCL-seq), a spatial indexing method utilizing a light-controlled DNA labeling strategy applied to tissue sections. PCL-seq employs photocleavable oligonucleotides and ligation adapters to construct transcriptional profiles of specific regions of interest (ROIs) designated via microscopically controlled photo-illumination.

In this paper, we have evaluated a targeted high-throughput massively parallel sequencing approach for detecting single nucleotide mutations or small genomic changes generated by new genomic techniques (NGT). We used unique molecular identifiers (UMIs) for the quantification of the mutant alleles and duplex sequencing to confirm a mutation on both strands to avoid polymerase chain reaction (PCR) artefacts or sequencing miscalls. We tested the approach in blinded analyses on a set of mixed NGT-modified tomato lines and identified each single nucleotide mutation or small insert/deletion (InDel) down to a 0.

Lymphocytes use somatic diversification processes to express a wide variability of antigen receptors, generating a highly diversified repertoire that is unique to each individual. The study of these repertoires is now possible with the advent of next-generation sequencing (NGS) techniques. Here we describe the "RACE Rep-Seq" methodology for high-throughput sequencing of immunoglobulin (Ig) repertoires using RNA templates.
