CARE: context-aware sequencing read error correction.

Bioinformatics

Department of Computer Science, Johannes Gutenberg University, Mainz 55122, Germany.

Published: May 2021


Article Abstract

Motivation: Error correction is a fundamental pre-processing step in many Next-Generation Sequencing (NGS) pipelines, in particular for de novo genome assembly. However, existing error correction methods either suffer from high false-positive rates, because they break reads into independent k-mers, or do not scale efficiently to large numbers of sequencing reads and complex genomes.

Results: We present CARE, an alignment-based, scalable error correction algorithm for Illumina data that uses the concept of minhashing. Minhashing allows for efficient similarity search within large sequencing read collections, which enables fast computation of high-quality multiple alignments. Sequencing errors are corrected by detailed inspection of the corresponding alignments. Our performance evaluation shows that CARE generates significantly fewer false-positive corrections than state-of-the-art tools (Musket, SGA, BFC, Lighter, Bcool, Karect) while maintaining a competitive number of true positives. When used prior to assembly, it can achieve superior de novo assembly results for a number of real datasets. CARE is also the first multiple sequence alignment-based error corrector that is able to process a human genome Illumina NGS dataset in only 4 h on a single workstation using GPU acceleration.
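
To illustrate how minhashing can narrow a similarity search to a small set of candidate reads, the following is a minimal Python sketch. It is not CARE's implementation (which is written in C++ and CUDA/C++); the function names, the k-mer length of 8, the use of 4 hash functions, and the BLAKE2b-based hash are all assumptions chosen for illustration. Each read is summarized by the smallest hash value of its k-mers under each hash function, and reads that share a value in any hash table become candidates for a later alignment step.

```python
import hashlib
from collections import defaultdict


def kmers(read, k):
    """Return the set of all k-mers (substrings of length k) in a read."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def hash_kmer(kmer, seed):
    """Hash a k-mer to a 64-bit integer; the seed selects the hash function."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(read, k=8, num_hashes=4):
    """Minhash signature: the minimum k-mer hash under each hash function."""
    ks = kmers(read, k)
    if not ks:
        raise ValueError("read is shorter than k")
    return tuple(min(hash_kmer(x, seed) for x in ks) for seed in range(num_hashes))


def build_index(reads, k=8, num_hashes=4):
    """Build one hash table per hash function: minhash value -> read ids."""
    tables = [defaultdict(list) for _ in range(num_hashes)]
    for read_id, read in enumerate(reads):
        signature = minhash_signature(read, k, num_hashes)
        for table, value in zip(tables, signature):
            table[value].append(read_id)
    return tables


def candidates(query, tables, k=8, num_hashes=4):
    """Return ids of reads sharing at least one minhash value with the query."""
    signature = minhash_signature(query, k, num_hashes)
    found = set()
    for table, value in zip(tables, signature):
        found.update(table.get(value, ()))
    return found


if __name__ == "__main__":
    # Hypothetical toy reads; real inputs would come from a FASTQ file.
    reads = [
        "ACGTACGTTAGCCGATAGCTAGGCTA",
        "ACGTACGTTAGCCGATAGCTAGGCTA",
        "TTTTTTTTTTTTTTTTTTTTTTTTTT",
    ]
    tables = build_index(reads)
    print(sorted(candidates(reads[0], tables)))
```

In a sketch like this, two reads are more likely to share a minhash value the more k-mers they have in common, so candidate lookup costs a few table accesses instead of a comparison against every read. The candidates would then be aligned to the query read, and the resulting multiple alignment inspected to decide on corrections.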

Availability and Implementation: CARE is open-source software written in C++ (CPU version) and in CUDA/C++ (GPU version). It is licensed under GPLv3 and can be downloaded at https://github.com/fkallen/CARE.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btaa738
