Citations: 20

Article Abstract

Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI's Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.1-10 minutes. Testing FCS-GX on artificially fragmented genomes demonstrates sensitivity >95% for diverse contaminant species and specificity >99.93%. We used FCS-GX to screen 1.6 million GenBank assemblies and identified 36.8 Gbp of contamination (0.16% of total bases), with half from 161 assemblies. We updated assemblies in NCBI RefSeq to reduce detected contamination to 0.01% of bases. FCS-GX is available at https://github.com/ncbi/fcs/.
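
The accuracy and scale figures in the abstract can be read with a small worked example. This is a minimal sketch, not the FCS-GX evaluation code: it assumes the standard definitions of sensitivity and specificity, the function names are hypothetical, and the counts are made-up illustrations chosen only to fall inside the reported ranges. The last calculation uses the two rounded figures from the abstract (36.8 Gbp, 0.16%) to estimate the total number of bases screened.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real contaminant sequences that the screen flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of genuine (non-contaminant) sequences left unflagged."""
    return true_neg / (true_neg + false_pos)


def implied_total_gbp(contaminant_gbp: float, contaminant_fraction: float) -> float:
    """Total bases implied by a contaminant amount and its share of the total."""
    return contaminant_gbp / contaminant_fraction


# Hypothetical counts for illustration only (not data from the paper).
sens = sensitivity(true_pos=960, false_neg=40)
spec = specificity(true_neg=99_950, false_pos=50)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")

# Figures taken from the abstract: 36.8 Gbp is 0.16% of total bases.
total_gbp = implied_total_gbp(36.8, 0.0016)
print(f"implied total screened: about {total_gbp:,.0f} Gbp")
```

Because both inputs to the last calculation are rounded in the abstract, the result (on the order of 23,000 Gbp) is only an approximation of the total screened.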

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10246020
DOI: http://dx.doi.org/10.1101/2023.06.02.543519

Publication Analysis

Top Keywords

fcs-gx: 6
rapid sensitive: 4
sensitive detection: 4
detection genome: 4
contamination: 4
genome contamination: 4
contamination scale: 4
scale fcs-gx: 4
fcs-gx assembled: 4
assembled genome: 4

Similar Publications

Rapid and sensitive detection of genome contamination at scale with FCS-GX.

Genome Biol

February 2024

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Article Synopsis
  • FCS-GX, part of NCBI's Foreign Contamination Screen (FCS) tool suite, identifies and removes contaminant sequences from newly assembled genomes.
  • It screens most genomes in 0.1-10 minutes; in tests on artificially fragmented genomes, sensitivity exceeded 95% for diverse contaminant species and specificity exceeded 99.93%.
  • Screening 1.6 million GenBank assemblies identified 36.8 Gbp of contamination (0.16% of total bases), half of it from 161 assemblies; updating assemblies in NCBI RefSeq reduced detected contamination there to 0.01% of bases.