Comput Struct Biotechnol J
February 2023
Current single-cell visualisation techniques project high dimensional data into 'map' views to identify high-level structures such as cell clusters and trajectories. New tools are needed to allow traversal of the high dimensionality of single-cell data to explore the single-cell local neighbourhood. StarmapVis is a convenient web application displaying an interactive downstream analysis of single-cell expression or spatial transcriptomic data.
Human induced pluripotent stem cell (iPSC) lines are a powerful tool for studying development and disease, but the considerable phenotypic variation between lines makes it challenging to replicate key findings and integrate data across research groups. To address this issue, we sub-cloned candidate human iPSC lines and deeply characterized their genetic properties using whole genome sequencing, their genomic stability upon CRISPR-Cas9-based gene editing, and their phenotypic properties including differentiation to commonly used cell types. These studies identified KOLF2.
Transposable elements (TEs) regulate diverse biological processes, from early development to cancer. Expression of young TEs is difficult to measure with next-generation, single-cell sequencing technologies because their highly repetitive nature means that short complementary DNA reads cannot be unambiguously mapped to a specific locus. Single CELl LOng-read RNA-sequencing (CELLO-seq) combines long-read single cell RNA-sequencing with computational analyses to measure TE expression at unique loci.
Mitochondrial energy production and function rely on optimal concentrations of the essential redox-active lipid, coenzyme Q (CoQ). CoQ deficiency results in mitochondrial dysfunction associated with increased mitochondrial oxidative stress and a range of pathologies. What drives CoQ deficiency in many of these pathologies is unknown, just as there currently is no effective therapeutic strategy to overcome CoQ deficiency in humans.
BMC Genomics
December 2019
Background: Read alignment and transcript assembly are the core of RNA-seq analysis for transcript isoform discovery. Nonetheless, current tools are not designed to be scalable for analysis of full-length bulk or single cell RNA-seq (scRNA-seq) data. The previous version of our cloud-based tool Falco only focuses on RNA-seq read counting, but does not allow for more flexible steps such as alignment and read assembly.
Read alignment is an important step in RNA-seq analysis, as its result forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads that should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads.
We have previously reported a subpopulation of mesenchymal stromal cells (MSCs) within the platelet-derived growth factor receptor-alpha (PDGFRα)/CD90 co-expressing cardiac interstitial and adventitial cell fraction. Here we further characterise PDGFRα/CD90-expressing cardiac MSCs (PDGFRα+ cMSCs) and use human telomerase reverse transcriptase (hTERT) over-expression to increase the ability of cMSCs to repair the heart after induced myocardial infarction. hTERT over-expression in PDGFRα+ cardiac MSCs (hTERT+ PDGFRα+ cMSCs) modulates genes related to cell differentiation, proliferation, survival and angiogenesis.
Nucleic Acids Res
January 2018
Although successful in identifying new cataract-linked genes, the previous version of the database iSyTE (integrated Systems Tool for Eye gene discovery) was based on expression information on just three mouse lens stages and was functionally limited to visualization by only UCSC-Genome Browser tracks. To increase its efficacy, here we provide an enhanced iSyTE version 2.0 (URL: http://research.
Comput Struct Biotechnol J
July 2017
This review examines two important aspects that are central to modern big data bioinformatics analysis - software scalability and validity. We argue that not only are the issues of scalability and validation common to all big data bioinformatics analyses, they can be tackled by conceptually related methodological approaches, namely divide-and-conquer (scalability) and multiple executions (validation). Scalability is defined as the ability for a program to scale based on workload.
Motivation: DNA binding proteins such as chromatin remodellers, transcription factors (TFs), histone modifiers and co-factors often bind cooperatively to activate or repress their target genes in a cell type-specific manner. Nonetheless, the precise role of cooperative binding in defining cell-type identity is still largely uncharacterized.
Results: Here, we collected and analyzed 214 public datasets representing chromatin immunoprecipitation followed by sequencing (ChIP-Seq) of 104 DNA binding proteins in embryonic stem cell (ESC) lines.
Nucleic Acids Res
May 2017
Bioinformatics
March 2017
The pace of disease gene discovery is still much slower than expected, even with the use of cost-effective DNA sequencing and genotyping technologies. It is increasingly clear that many inherited heart diseases have a more complex polygenic aetiology than previously thought. Understanding the role of gene-gene interactions, epigenetics, and non-coding regulatory regions is becoming increasingly critical in predicting the functional consequences of genetic mutations identified by genome-wide association studies and whole-genome or exome sequencing.
Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles.