Over the past three decades, molecular epidemiological studies have provided new opportunities to investigate the transmission dynamics of Mycobacterium tuberculosis. In most studies, a sizable fraction of individuals with notified tuberculosis cannot be included, either because they do not have culture-positive disease (and thus do not have specimens available for molecular typing) or because resources for conducting sequencing are limited. A recent study introduced a regression-based approach for inferring the membership of unsequenced tuberculosis cases in transmission clusters based on host demographic and epidemiological data.
Tuberculosis remains a leading cause of infection-related mortality, and efforts to reduce its incidence have been hindered by an incomplete understanding of local Mycobacterium tuberculosis transmission dynamics. Advances in pathogen sequencing and spatial analysis have created new opportunities to map M tuberculosis transmission patterns more precisely. In this scoping review, we searched for studies combining pathogen genetics and location data to analyse the spatial patterns of M tuberculosis transmission and identified 142 studies published between 1994 and 2024.
Delays in identifying and treating individuals with infectious tuberculosis (TB) contribute to poor health outcomes and allow ongoing community transmission of M. tuberculosis (Mtb). Current recommendations for screening for tuberculosis specify community characteristics (e.
Background: Mixed infection with multiple strains of the same pathogen in a single host can present clinical and analytical challenges. Whole genome sequence (WGS) data can identify signals of multiple strains in samples, though the precision of previous methods can be improved. Here, we present MixInfect2, a new tool to accurately detect mixed samples from Mycobacterium tuberculosis short-read WGS data.
Pathogen sequencing is an important tool for disease surveillance and demonstrated its high value during the COVID-19 pandemic. Viral sequencing during the pandemic allowed us to track disease spread, quickly identify new variants, and guide the development of vaccines. Tiled amplicon sequencing, in which a panel of primers is used for multiplex amplification of fragments across an entire genome, was the cornerstone of SARS-CoV-2 sequencing.
Graph structures are often used to visualize transmission networks generated using genomic epidemiological methods. However, tools to interactively visualize these graphs do not exist. A browser-based tool allowing users to load and interactively visualize transmission graphs was developed in JavaScript.
Background: Mycobacterium tuberculosis complex (MTBC) species evolve slowly, so isolates from individuals linked in transmission often have identical or nearly identical genomes, making it difficult to reconstruct transmission chains. Finding additional sources of shared MTBC variation could help overcome this problem. Previous studies have reported MTBC diversity within infected individuals; however, whether within-host variation improves transmission inferences remains unclear.
Identifying individuals with tuberculosis (TB) with a high risk of onward transmission can guide disease prevention and public health strategies. Here, we train classification models to predict the first sampled isolates in Mycobacterium tuberculosis transmission clusters from demographic and disease data. We find that supervised learning, in particular balanced random forests, can be used to develop predictive models to identify people with TB that are more likely associated with TB cluster growth, with good model performance and AUCs of ≥ 0.
Infectious disease dynamics are driven by the complex interplay of epidemiological, ecological, and evolutionary processes. Accurately modeling these interactions is crucial for understanding pathogen spread and informing public health strategies. However, existing simulators often fail to capture the dynamic interplay between these processes, resulting in oversimplified models that do not fully reflect real-world complexities in which the pathogen's genetic evolution dynamically influences disease transmission.
The projected trajectory of multidrug resistant tuberculosis (MDR-TB) epidemics depends on the reproductive fitness of circulating strains of MDR M. tuberculosis (Mtb). Previous efforts to characterize the fitness of MDR Mtb have found that Mtb strains of the Beijing sublineage (Lineage 2.
Background: Multidrug resistant tuberculosis (MDR-TB) represents a major public health concern in the Republic of Moldova, with an estimated 31% of new and 56% of previously treated TB cases having MDR disease in 2022. A recent genomic epidemiology study of incident TB occurring in 2018 and 2019 found that 92% of MDR-TB was the result of transmission. The MDR phenotype was concentrated among two M.
Background: Because Mycobacterium tuberculosis evolves slowly, transmission clusters often contain multiple individuals with identical consensus genomes, making it difficult to reconstruct transmission chains. Finding additional sources of shared variation could help overcome this problem. Previous studies have reported diversity within infected individuals; however, whether within-host variation improves transmission inferences remains unclear.
The seasonal influenza (flu) vaccine is designed to protect against those influenza viruses predicted to circulate during the upcoming flu season, but identifying which viruses are likely to circulate is challenging. We use features from phylogenetic trees reconstructed from hemagglutinin (HA) and neuraminidase (NA) sequences, together with a support vector machine, to predict future circulation. We obtain accuracies of 0.
Serial intervals - the time between symptom onset in infector and infectee - are a fundamental quantity in infectious disease control. However, their estimation requires knowledge of individuals' exposures, typically obtained through resource-intensive contact tracing efforts. We introduce an alternate framework using virus sequences to inform who infected whom and thereby estimate serial intervals.
Objectives: Clustering pathogen sequence data is a common practice in epidemiology to gain insights into the genetic diversity and evolutionary relationships among pathogens. We can find groups of cases with a shared transmission history and common origin, as well as identify transmission hotspots. Motivated by the experience of clustering SARS-CoV-2 cases using whole genome sequence data during the COVID-19 pandemic to aid with public health investigation, we investigated how differences in epidemiology and sampling can influence the composition of clusters that are identified.
Epidemiol Infect
June 2023
Genomic epidemiology is routinely used worldwide to interrogate infectious disease dynamics. Multiple computational tools exist that reconstruct transmission networks by coupling genomic data with epidemiological models. Resulting inferences can improve our understanding of pathogen transmission dynamics, and yet the performance of these tools has not been evaluated for tuberculosis (TB), a disease process with complex epidemiology including variable latency and within-host heterogeneity.
Understanding factors that contribute to the increased likelihood of pathogen transmission between two individuals is important for infection control. However, analyzing measures of pathogen relatedness to estimate these associations is complicated due to correlation arising from the presence of the same individual across multiple dyadic outcomes, potential spatial correlation caused by unmeasured transmission dynamics, and the distinctive distributional characteristics of some of the outcomes. We develop two novel hierarchical Bayesian spatial methods for analyzing dyadic pathogen genetic relatedness data, in the form of patristic distances and transmission probabilities, that simultaneously address each of these complications.
J Comput Biol
February 2023
Genome-wide association studies (GWASs) are often confounded by population stratification and structure. Linear mixed models (LMMs) are a powerful class of methods for uncovering genetic effects, while controlling for such confounding. LMMs include random effects for a genetic similarity matrix, and they assume that a true genetic similarity matrix is known.
Background: The COVID-19 pandemic remains a global public health concern. Advances in sequencing technologies have allowed for high numbers of SARS-CoV-2 whole genome sequence (WGS) data and rapid sharing of sequences through global repositories to enable almost real-time genomic analysis of the pathogen. WGS data has been used previously to group genetically similar viral pathogens to reveal evidence of transmission, including methods that identify distinct clusters on a phylogenetic tree.
Combined with epidemiological data, whole-genome sequencing (WGS) can help better resolve individual tuberculosis (TB) transmission events to a degree not possible with traditional genotyping. We combine WGS data with patient-level data to calculate the timing of secondary TB among contacts of people diagnosed with active TB in British Columbia, Canada.
Understanding host and pathogen factors that influence tuberculosis (TB) transmission can inform strategies to eliminate the spread of Mycobacterium tuberculosis. Determining transmission links between cases of TB is complicated by a long and variable latency period and undiagnosed cases, although methods are improving through the application of probabilistic modelling and whole-genome sequence analysis. Using a large dataset of 1857 whole-genome sequences and comprehensive metadata from Karonga District, Malawi, over 19 years, we reconstructed transmission networks using a two-step Bayesian approach that identified likely infector and recipient cases, whilst robustly allowing for incomplete case sampling.