We have used multiple sequencing approaches to sequence the genome of a volunteer from Saudi Arabia. We use the resulting data to generate a de novo assembly of the genome, and use different computational approaches to refine the assembly. As a consequence, we provide a contiguous assembly of the complete genome of an individual from Saudi Arabia for all chromosomes except chromosome Y, and label this assembly KSA001.
Currently, different sequencing platforms are used to generate plant genomes, and no workflow has been properly developed to optimize time, cost, and assembly quality. We present LeafGo, a complete de novo plant genome workflow that starts from tissue and produces genomes with modest laboratory and bioinformatic resources in approximately 7 days using one long-read sequencing technology. LeafGo is optimized with ten different plant species, three of which are used to generate high-quality chromosome-level assemblies without any scaffolding technologies.
Periodic fever with aphthous stomatitis, pharyngitis, and cervical adenitis (PFAPA) is a relatively common autoinflammatory condition that primarily affects children. Although tendencies were reported for this syndrome, genetic variations influencing risk and disease progression are poorly understood. In this study, we performed next-generation sequencing for 82 unrelated PFAPA patients and identified a frameshift variant in the CARD8 gene (CARD8-FS).
Background: Macular corneal dystrophy (MCD) is a rare autosomal recessive disorder that is characterized by progressive corneal opacity that starts in early childhood and ultimately progresses to blindness in early adulthood. The aim of this study was to identify the cause of MCD in a black South African family with two affected sisters.
Methods: A multigenerational South African Sotho-speaking family with type I MCD was studied using whole exome sequencing.
MicroRNAs are short non-coding RNAs that regulate gene expression at the post-transcriptional level and play key roles in heart development and cardiovascular diseases. Here, we have characterized the expression and distribution of microRNAs across eight cardiac structures (left and right ventricles, apex, papillary muscle, septum, left and right atrium and valves) in rat, Beagle dog and cynomolgus monkey using microRNA sequencing. Conserved microRNA signatures enriched in specific heart structures across these species were identified for cardiac valve (miR-let-7c, miR-125b, miR-127, miR-199a-3p, miR-204, miR-320, miR-99b, miR-328 and miR-744) and myocardium (miR-1, miR-133b, miR-133a, miR-208b, miR-30e, miR-499-5p, miR-30e*).
BMC Res Notes
January 2012
Background: With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General-purpose computing on graphics processing units (GPGPU) extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy-efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence.
Nucleic Acids Res
August 2011
Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs.
Chromatin immunoprecipitation identifies specific interactions between genomic DNA and proteins, advancing our understanding of gene-level and chromosome-level regulation. Based on chromatin immunoprecipitation experiments using validated antibodies, we define the genome-wide distributions of 19 histone modifications, one histone variant, and eight chromatin-associated proteins in Caenorhabditis elegans embryos and L3 larvae. Cluster analysis identified five groups of chromatin marks with shared features: Two groups correlate with gene repression, two with gene activation, and one with the X chromosome.
Nat Struct Mol Biol
January 2011
This paper introduces DANGLE, a new algorithm that employs Bayesian inference to estimate the likelihood of all possible values of the backbone dihedral angles phi and psi for each residue in a query protein, based on observed chemical shifts and the conformational preferences of each amino acid type. The method provides robust estimates of phi and psi within realistic boundary ranges, an indication of the degeneracy in the relationship between shift measurements and conformation at each site, and faithful secondary structure state assignments. When a simple degeneracy-based filtering procedure is applied, DANGLE offers an ideal compromise between accuracy and coverage when compared with other shift-based dihedral angle prediction methods.