Article Synopsis

  • The study argues that a lack of samples for generating standardized DNA datasets limits the uptake of cancer genomics, because such datasets are needed to set up sequencing pipelines and benchmark algorithms.
  • The authors present reference call sets derived from paired tumor-normal genomic DNA samples: a breast cancer cell line that is highly heterogeneous, aneuploid, and enriched in somatic alterations, and a matched lymphoblastoid cell line.
  • These reference samples help minimize biases from technologies, assays and informatics when setting up a sequencing pipeline, and they serve as a resource for benchmarking 'tumor-only' or 'matched tumor-normal' analyses, although they are not representative of primary cancer cells from a clinical sample.

Article Abstract

The lack of samples for generating standardized DNA datasets for setting up a sequencing pipeline or benchmarking the performance of different algorithms limits the implementation and uptake of cancer genomics. Here, we describe reference call sets obtained from paired tumor-normal genomic DNA (gDNA) samples derived from a breast cancer cell line (which is highly heterogeneous, with an aneuploid genome, and enriched in somatic alterations) and a matched lymphoblastoid cell line. We partially validated both somatic mutations and germline variants in these call sets via whole-exome sequencing (WES) with different sequencing platforms and targeted sequencing with >2,000-fold coverage, spanning 82% of genomic regions with high confidence. Although the gDNA reference samples are not representative of primary cancer cells from a clinical sample, when setting up a sequencing pipeline, they not only minimize potential biases from technologies, assays and informatics but also provide a unique resource for benchmarking 'tumor-only' or 'matched tumor-normal' analyses.
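As a rough illustration of what benchmarking a pipeline against a reference call set can involve, the sketch below compares a set of variant calls with a reference set inside high-confidence regions and reports precision and recall. It is not the authors' method: the `Variant` type, the `benchmark` function, the exact-match comparison, and the interval format are all assumptions made for this example.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a full VCF entry)."""
    chrom: str
    pos: int  # 1-based position
    ref: str
    alt: str


def in_regions(v: Variant, regions: Iterable[tuple]) -> bool:
    """True if v lies in any (chrom, start, end) interval, 1-based inclusive."""
    return any(c == v.chrom and s <= v.pos <= e for c, s, e in regions)


def benchmark(calls, reference, regions):
    """Compare calls with a reference call set inside the given regions.

    Assumption: variants match only if chrom, pos, ref and alt are identical.
    Real benchmarking tools also normalize variant representation first.
    """
    regions = list(regions)
    calls_hc = {v for v in calls if in_regions(v, regions)}
    ref_hc = {v for v in reference if in_regions(v, regions)}
    tp = len(calls_hc & ref_hc)   # called and in the reference
    fp = len(calls_hc - ref_hc)   # called but absent from the reference
    fn = len(ref_hc - calls_hc)   # in the reference but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}
```

Restricting the comparison to high-confidence regions means that calls outside those regions are counted neither for nor against the pipeline, which is one common way to avoid penalizing calls where the reference itself is uncertain.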

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8532138
DOI: http://dx.doi.org/10.1038/s41587-021-00993-6
