Background: Curated databases of genetic variants assist clinicians and researchers in interpreting genetic variation. Yet, these databases contain some misclassified variants. It is unclear whether variant misclassification is abating as these databases rapidly grow and implement new guidelines.
Newborn screening (NBS) is a population-based program with a goal of reducing the burden of disease for conditions with significant clinical impact on neonates. Screening tests were originally developed and implemented one at a time, but newer methods have allowed the use of multiplex technologies to expand additions more rapidly to standard panels. Recent improvements in next-generation sequencing are also evolving rapidly from first focusing on individual genes, then panels, and finally all genes as encompassed by whole exome and genome sequencing.
Genome sequencing is enabling precision medicine: tailoring treatment to the unique constellation of variants in an individual's genome. The impact of recurrent pathogenic variants is often understood; however, there is a long tail of rare genetic variants that are uncharacterized. The problem of uncharacterized rare variation is especially acute when it occurs in genes of known clinical importance with functionally consequential variants and associated mechanisms.
Short-chain acyl-CoA dehydrogenase deficiency (SCADD) is a rare autosomal recessive disorder of β-oxidation caused by pathogenic variants in the gene. Analyte testing for SCADD in blood and urine, including newborn screening (NBS) using tandem mass spectrometry (MS/MS) on dried blood spots (DBSs), is complicated by the presence of two relatively common variants (c.625G>A and c.
Public health newborn screening (NBS) programs provide population-scale ascertainment of rare, treatable conditions that require urgent intervention. Tandem mass spectrometry (MS/MS) is currently used to screen newborns for a panel of rare inborn errors of metabolism (IEMs). The NBSeq project evaluated whole-exome sequencing (WES) as an innovative methodology for NBS.
Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes.
Genome sequencing identifies a vast number of genetic variants. Predicting these variants' molecular and clinical effects is one of the preeminent challenges in human genetics. Accurate prediction of the impact of genetic variants improves our understanding of how genetic information is conveyed to molecular and cellular functions, and is an essential step towards precision medicine.
We present a computational model for predicting mutational impact on enzymatic activity of human acid α-glucosidase (GAA), an enzyme associated with Pompe disease. Using a model that combines features specific to GAA with other general evolutionary and physiochemical features, we made blind predictions of enzymatic activity relative to wildtype human GAA for >300 GAA mutants, as part of the Critical Assessment of Genome Interpretation 5 GAA challenge. We found that gene-specific features can improve the performance of existing impact prediction tools that mostly rely on general features for pathogenicity prediction.
The integrative analysis of high-throughput reporter assays, machine learning, and profiles of epigenomic chromatin state in a broad array of cells and tissues has the potential to significantly improve our understanding of noncoding regulatory element function and its contribution to human disease. Here, we report results from the CAGI 5 regulation saturation challenge where participants were asked to predict the impact of nucleotide substitution at every base pair within five disease-associated human enhancers and nine disease-associated promoters. A library of mutations covering all bases was generated by saturation mutagenesis and altered activity was assessed in a massively parallel reporter assay (MPRA) in relevant cell lines.
N Engl J Med
December 2016
Background: Severe combined immunodeficiency (SCID) is characterized by arrested T-lymphocyte production and by B-lymphocyte dysfunction, which result in life-threatening infections. Early diagnosis of SCID through population-based screening of newborns can aid clinical management and help improve outcomes; it also permits the identification of previously unknown factors that are essential for lymphocyte development in humans.
Methods: SCID was detected in a newborn before the onset of infections by means of screening of T-cell-receptor excision circles, a biomarker for thymic output.
We determined the NMR structure of a highly aromatic (13%) protein of unknown function, Aq1974 from Aquifex aeolicus (PDB ID: 5SYQ). The unusual sequence of this protein has a tryptophan content five times the normal (six tryptophan residues of 114 or 5.2% while the average tryptophan content is 1.
Proc Natl Acad Sci U S A
July 2015
Experimental and computational folding studies of Proteins L & G and NuG2 typically find that sequence differences determine which of the two hairpins is formed in the transition state ensemble (TSE). However, our recent work on Protein L finds that its TSE contains both hairpins, compelling a reassessment of the influence of sequence on the folding behavior of the other two homologs. We characterize the TSEs for Protein G and NuG2b, a triple mutant of NuG2, using ψ analysis, a method for identifying contacts in the TSE.
We demonstrate the ability of simultaneously determining a protein's folding pathway and structure using a properly formulated model without prior knowledge of the native structure. Our model employs a natural coordinate system for describing proteins and a search strategy inspired by the observation that real proteins fold in a sequential fashion by incrementally stabilizing nativelike substructures or "foldons." Comparable folding pathways and structures are obtained for the twelve proteins recently studied using atomistic molecular dynamics simulations [K.
Proc Natl Acad Sci U S A
October 2012
Motivated by the relationship between the folding mechanism and the native structure, we develop a unified approach for predicting folding pathways and tertiary structure using only the primary sequence as input. Simulations begin from a realistic unfolded state devoid of secondary structure and use a chain representation lacking explicit side chains, rendering the simulations many orders of magnitude faster than molecular dynamics simulations. The multiple round nature of the algorithm mimics the authentic folding process and tests the effectiveness of sequential stabilization (SS) as a search strategy wherein 2° structural elements add onto existing structures in a process of progressive learning and stabilization of structure found in prior rounds of folding.
Protein Sci
January 2012
Template-based methods for predicting protein structure provide models for a significant portion of the protein but often contain insertions or chain ends (InsEnds) of indeterminate conformation. The local structure prediction "problem" entails modeling the InsEnds onto the rest of the protein. A well-known limit involves predicting loops of ≤12 residues in crystal structures.
We studied the temperature dependence of the structural relaxation in poly(vinyl acetate) near the glass transition temperature with single molecule spectroscopy from Tg-1 K to Tg+12 K. The temperature dependence of the observed relaxation times matches results from bulk experiments; the observed relaxation times are, however, 80-fold slower than those from bulk experiments at the same temperature. We attribute this factor to the size of the probe molecule.