UrQt: an efficient software for the Unsupervised Quality trimming of NGS data.

BMC Bioinformatics

Université de Lyon; Université Lyon 1; CNRS; UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, 43 bd du 11 novembre 1918, Villeurbanne cedex, 69622, France.

Published: April 2015


Article Abstract

Background: Quality control is a necessary step of any Next Generation Sequencing analysis. Although customary, this step still requires manual intervention to empirically choose tuning parameters according to various quality statistics. Moreover, current quality control procedures that provide a "good quality" data set are not optimal and discard many informative nucleotides. To address these drawbacks, we present a new quality control method, implemented in the UrQt software, for Unsupervised Quality trimming of Next Generation Sequencing reads.

Results: Our trimming procedure relies on a well-defined probabilistic framework to detect the best segmentation of a read into a segment of informative nucleotides framed by two segments of unreliable nucleotides. Our software requires only one user-friendly parameter, the minimal quality threshold (phred score) needed to consider a nucleotide informative, and this parameter is independent of both the experiment and the quality of the data. The procedure is implemented in C++ as efficient, parallelized software with a low memory footprint. We tested the performance of UrQt against the best-known trimming programs on seven RNA and DNA sequencing experiments and demonstrated its optimality in the resulting tradeoff between the number of trimmed nucleotides and the quality objective.

Conclusions: By finding the best segmentation to delimit a segment of good-quality nucleotides, UrQt greatly increases the number of reads and nucleotides that can be retained for a given quality objective. UrQt source files, binary executables for different operating systems, and documentation are freely available (under the GPLv3) at https://lbbe.univ-lyon1.fr/-UrQt-.html.
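The segmentation idea described in the Results can be illustrated with a minimal Python sketch. This is not UrQt's probabilistic model, which the abstract does not detail. As a stand-in, it keeps the contiguous segment that maximizes the sum of (phred score minus threshold), a simple maximum-scoring-segment heuristic. The function names `phred_to_error_prob` and `best_segment` and the example quality values are hypothetical, chosen only for illustration.

```python
def phred_to_error_prob(q):
    """Standard phred definition: Q = -10 * log10(p), so p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def best_segment(quals, threshold):
    """Return the half-open interval (start, end) of the contiguous segment
    that maximizes sum(q - threshold) over the read.

    Simplified stand-in for a likelihood-based segmentation (assumption):
    bases below the threshold score negatively, bases above it positively.
    Returns (0, 0) when no segment has a positive score.
    """
    best_sum = 0
    best = (0, 0)
    cur_sum = 0
    cur_start = 0
    for i, q in enumerate(quals):
        # Restart the candidate segment once its running score is non-positive.
        if cur_sum <= 0:
            cur_sum = 0
            cur_start = i
        cur_sum += q - threshold
        if cur_sum > best_sum:
            best_sum = cur_sum
            best = (cur_start, i + 1)
    return best


# Hypothetical read: poor quality at both ends, one mediocre base in the middle.
quals = [2, 5, 30, 35, 32, 15, 31, 3, 2]
start, end = best_segment(quals, threshold=20)
trimmed = quals[start:end]
```

With a threshold of 20, the sketch retains positions 2 through 6 of the example read, keeping the single mediocre base because the high-quality bases around it outweigh it. This mirrors the goal stated in the abstract of retaining as many informative nucleotides as possible for a given quality objective, though UrQt's actual criterion may differ.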


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4450468
DOI: http://dx.doi.org/10.1186/s12859-015-0546-8

