Background: Identity-by-descent (IBD), which describes recent genetic co-ancestry between pairs of genomes, is a fundamental concept in population genomics. It has been used to estimate genetic relatedness, detect selection signals, and understand population demography. The IBD detection method demonstrates high accuracy in inferring IBD segments between haploid genomes, including , and is widely used in malaria genomic surveillance.
Major advances in sequencing approaches, bioinformatic pipelines, and data analysis tools have provided valuable insights into malaria epidemiology from parasite genomic data. However, translating genetic data into actionable information for decision-makers remains a challenge. Significant barriers limit the integration of these advances into a functional data analysis ecosystem that produces standardized, interpretable results for use by national malaria control programs.
Transmission reconstruction, the inference of who infects whom in disease outbreaks, offers critical insights into how pathogens spread and provides opportunities for targeted control measures. We developed JUNIPER (Joint Underlying Network Inference for Phylogenetic and Epidemiological Reconstructions), a highly scalable pathogen outbreak reconstruction tool that incorporates intrahost variation, incomplete sampling, and algorithmic parallelization. Central to JUNIPER is a statistical model for within-host variant frequencies observed by next-generation sequencing, which we validated on a dataset of over 160,000 deep-sequenced SARS-CoV-2 genomes.
Background: In the United States, Borrelia burgdorferi is the principal etiologic agent of Lyme disease. The complex structure of B. burgdorferi genomes has posed challenges for genomic studies because homology among the bacterium's many plasmids, which account for ~40% of the genome by length, has made them difficult to sequence and assemble.
Results: We used long-read sequencing to generate near-complete assemblies of 62 isolates of human-derived and collected public genomes with plasmid sequences.
Summary: In viral genomic research and surveillance, inter-sample contamination can affect variant detection, analysis of within-host evolution, outbreak reconstruction, and detection of superinfections and recombination events. While sample barcoding methods exist to track inter-sample contamination, they are not always used and can only detect contamination in the experimental pipeline from the point they are added. The underlying genomic information in a sample, however, carries information about inter-sample contamination occurring at any stage.
Pathogen genomics is a powerful tool for tracking infectious disease transmission. In malaria, identity-by-descent is used to assess the genetic relatedness between parasites and has been used to study transmission and importation. In theory, identity-by-descent can be used to distinguish genealogical relationships to reconstruct transmission history or identify parasites for QTL experiments.
Background: Drug resistance in Plasmodium falciparum is a major threat to malaria control efforts. Pathogen genomic surveillance could be invaluable for monitoring current and emerging parasite drug resistance.
Methods: Data from two decades (2000-2020) of continuous molecular surveillance of P.
Infection with Lassa virus (LASV) can cause Lassa fever, a haemorrhagic illness with an estimated fatality rate of 29.7%, but causes no or mild symptoms in many individuals. Here, to investigate whether human genetic variation underlies the heterogeneity of LASV infection, we carried out genome-wide association studies (GWAS) as well as seroprevalence surveys, human leukocyte antigen typing and high-throughput variant functional characterization assays.
The worldwide decline in malaria incidence is revealing the extensive burden of non-malarial febrile illness (NMFI), which remains poorly understood and difficult to diagnose. To characterize NMFI in Senegal, we collected venous blood and clinical metadata in a cross-sectional study of febrile patients and healthy controls in a low malaria burden area. Using 16S and untargeted sequencing, we detected viral, bacterial, or eukaryotic pathogens in 23% (38/163) of NMFI cases.
Background: The only licensed malaria vaccine, RTS,S/AS01, confers moderate protection against symptomatic disease. Because many malaria infections are asymptomatic, we conducted a large-scale longitudinal parasite genotyping study of samples from a clinical trial exploring how vaccine dosing regimen affects vaccine efficacy (VE).
Methods: 1,500 children aged 5-17 months were randomized to receive four different RTS,S/AS01 regimens or a rabies control vaccine in a phase 2b clinical trial in Ghana and Kenya.
Genetic surveillance of the parasite shows great promise for helping National Malaria Control Programs (NMCPs) assess parasite transmission. Genetic metrics such as the frequency of polygenomic (multiple strain) infections, genetic clones, and the complexity of infection (COI, number of strains per infection) are correlated with transmission intensity. However, despite these correlations, it is unclear whether genetic metrics alone are sufficient to estimate clinical incidence.
We here analyze data from the first year of an ongoing nationwide program of genetic surveillance of Plasmodium falciparum parasites in Senegal. The analysis is based on 1097 samples collected at health facilities during passive malaria case detection in 2019; it provides a baseline for analyzing parasite genetic metrics as they vary over time and geographic space. The study's goal was to identify genetic metrics that were informative about transmission intensity and other aspects of transmission dynamics, focusing on measures of genetic relatedness between parasites.
Genome sequencing can offer critical insight into pathogen spread in viral outbreaks, but existing transmission inference methods use simplistic evolutionary models and only incorporate a portion of available genetic data. Here, we develop a robust evolutionary model for transmission reconstruction that tracks the genetic composition of within-host viral populations over time and the lineages transmitted between hosts. We confirm that our model reliably describes within-host variant frequencies in a dataset of 134,682 SARS-CoV-2 deep-sequenced genomes from Massachusetts, USA.
Effective infectious disease surveillance in high-risk regions is critical for clinical care and pandemic preemption; however, few clinical diagnostics are available for the wide range of potential human pathogens. Here, we conduct unbiased metagenomic sequencing of 593 samples from febrile Nigerian patients collected in three settings: i) population-level surveillance of individuals presenting with symptoms consistent with Lassa Fever (LF); ii) real-time investigations of outbreaks with suspected infectious etiologies; and iii) undiagnosed clinically challenging cases. We identify 13 distinct viruses, including the second and third documented cases of human blood-associated dicistrovirus, and a highly divergent, unclassified dicistrovirus that we name human blood-associated dicistrovirus 2.
SARS-CoV-2 distribution and circulation dynamics are not well understood due to challenges in assessing genomic data from tissue samples. We develop experimental and computational workflows for high-depth viral sequencing and high-resolution genomic analyses from formalin-fixed, paraffin-embedded tissues and apply them to 120 specimens from six subjects with fatal COVID-19. To varying degrees, viral RNA is present in extrapulmonary tissues from all subjects.
The US experienced an early and severe respiratory syncytial virus (RSV) surge in autumn 2022. Despite the pressure this has put on hospitals and care centers, the factors promoting the surge in cases are unknown. To investigate whether viral characteristics contributed to the extent or severity of the surge, we sequenced 105 RSV-positive specimens from symptomatic patients diagnosed with RSV who presented to the Massachusetts General Hospital (MGH) and its outpatient practices in the Greater Boston Area.
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) reinfection is poorly understood, partly because few studies have systematically applied genomic analysis to distinguish reinfection from persistent RNA detection related to initial infection. We aimed to evaluate the characteristics of SARS-CoV-2 reinfection and persistent RNA detection using independent genomic, clinical, and laboratory assessments.
Methods: All individuals at a large academic medical center who underwent a SARS-CoV-2 nucleic acid amplification test (NAAT) ≥45 days after an initial positive test, with both tests between 14 March and 30 December 2020, were analyzed for potential reinfection.
Background: Universities are vulnerable to infectious disease outbreaks, making them ideal environments to study transmission dynamics and evaluate mitigation and surveillance measures. Here, we analyze multimodal COVID-19-associated data collected during the 2020-2021 academic year at Colorado Mesa University and introduce a SARS-CoV-2 surveillance and response framework.
Methods: We analyzed epidemiological and sociobehavioral data (demographics, contact tracing, and WiFi-based co-location data) alongside pathogen surveillance data (wastewater and diagnostic testing, and viral genomic sequencing of wastewater and clinical specimens) to characterize outbreak dynamics and inform policy.