JMIR Public Health Surveill
April 2024
Background: The decline in global child mortality is an important public health achievement, yet child mortality remains disproportionally high in many low-income countries like Guinea-Bissau. The persisting high mortality rates necessitate targeted research to identify vulnerable subgroups of children and formulate effective interventions.
Objective: This study aimed to discover subgroups of children at an elevated risk of mortality in the urban setting of Bissau, Guinea-Bissau, West Africa.
Nearly all diseases are caused by different combinations of exposures. Yet, most epidemiological studies focus on estimating the effect of a single exposure on a health outcome. We present the Causes of Outcome Learning approach (CoOL), which seeks to discover combinations of exposures that lead to an increased risk of a specific outcome in parts of the population.
Identification of individuals at risk of developing disease comorbidities represents an important task in tackling the growing personal and societal burdens associated with chronic diseases. We employed machine learning techniques to investigate to what extent data from longitudinal, nationwide Danish health registers can be used to predict individuals at high risk of developing type 2 diabetes (T2D) comorbidities. Leveraging logistic regression-, random forest- and gradient boosting models and register data spanning hospitalizations, drug prescriptions and contacts with primary care contractors from >200,000 individuals newly diagnosed with T2D, we predicted five-year risk of heart failure (HF), myocardial infarction (MI), stroke (ST), cardiovascular disease (CVD) and chronic kidney disease (CKD).
We established a catalog of the mouse gut metagenome comprising ∼2.6 million nonredundant genes by sequencing DNA from fecal samples of 184 mice. To secure high microbiome diversity, we used mouse strains of diverse genetic backgrounds, from different providers, kept in different housing laboratories and fed either a low-fat or high-fat diet.
Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates.
Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences.
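The idea of binning co-abundant genes can be illustrated with a toy sketch. This is not the authors' algorithm: it is a simple greedy grouping by Pearson correlation, and the function names, the example profiles and the 0.9 threshold are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(vx * vy)


def bin_coabundant(profiles, threshold=0.9):
    """Greedily group genes whose profiles correlate with a seed gene.

    profiles: dict mapping gene name -> list of abundances, one per sample.
    threshold: hypothetical correlation cutoff, not a value from the paper.
    """
    unassigned = list(profiles)
    bins = []
    while unassigned:
        seed = unassigned.pop(0)  # next unbinned gene seeds a new bin
        members = [seed]
        remaining = []
        for gene in unassigned:
            if pearson(profiles[seed], profiles[gene]) >= threshold:
                members.append(gene)
            else:
                remaining.append(gene)
        unassigned = remaining
        bins.append(members)
    return bins


# Made-up abundances across four samples (illustrative only).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.0],
}
bins = bin_coabundant(profiles)
```

Genes whose abundance rises and falls together across samples end up in the same bin, which is the intuition behind treating each bin as a candidate biological entity.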
Background: There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors.
Nucleic Acids Res
July 2013
MetaRanker 2.0 is a web server for prioritization of common and rare frequency genetic variation data. Based on heterogeneous data sets including genetic association data, protein-protein interactions, large-scale text-mining data, copy number variation data and gene expression experiments, MetaRanker 2.
Congenital hypogonadotropic hypogonadism (CHH) and its anosmia-associated form (Kallmann syndrome [KS]) are genetically heterogeneous. Among the >15 genes implicated in these conditions, mutations in FGF8 and FGFR1 account for ~12% of cases; notably, KAL1 and HS6ST1 are also involved in FGFR1 signaling and can be mutated in CHH. We therefore hypothesized that mutations in genes encoding a broader range of modulators of the FGFR1 pathway might contribute to the genetics of CHH as causal or modifier mutations.
J Struct Funct Genomics
March 2012
Phosphoglycerate kinase (PGK) is indispensable during glycolysis for anaerobic glucose degradation and energy generation. Here we present comprehensive structure analysis of two putative PGKs from Bacillus anthracis str. Sterne and Campylobacter jejuni in the context of their structural homologs.
Circ Cardiovasc Genet
October 2011
Background: Network-based approaches may leverage genome-wide association (GWA) analysis by testing for the aggregate association across several pathway members. We aimed to examine if networks of genes that represent experimentally determined protein-protein interactions (PPIs) are enriched in genes associated with risk of coronary heart disease (CHD).
Methods and Results: Genome-wide association analyses of approximately 700,000 single-nucleotide polymorphisms in 899 incident CHD cases and 1823 age- and sex-matched controls within the Nurses' Health and the Health Professionals Follow-up Studies were used to assign genewise P values.
Meta-analyses of large-scale association studies typically proceed solely within one data type and do not exploit the potential complementarities in other sources of molecular evidence. Here, we present an approach to combine heterogeneous data from genome-wide association (GWA) studies, protein-protein interaction screens, disease similarity, linkage studies, and gene expression experiments into a multi-layered evidence network which is used to prioritize the entire protein-coding part of the genome identifying a shortlist of candidate genes. We report specifically results on bipolar disorder, a genetically complex disease where GWA studies have only been moderately successful.