Publications by authors named "Michael D Edge"

As genetic sequencing costs have plummeted, datasets with sizes previously unthinkable have begun to appear. Such datasets present opportunities to learn about evolutionary history, particularly via rare alleles that record the very recent past. However, beyond the computational challenges inherent in the analysis of many large-scale datasets, large population-genetic datasets present theoretical problems.

Advances in sequencing technology are allowing forensic scientists to access genetic information from increasingly challenging samples. A recently published computational approach, IBDGem, analyzes sequencing reads, including from low-coverage samples, in order to arrive at likelihood ratios for human identification. Here, we show that likelihood ratios produced by IBDGem are best interpreted as testing a null hypothesis different from the traditional one used in a forensic genetics context.

Case-control genome-wide association studies (GWAS) are often used to find associations between genetic variants and diseases. When case-control GWAS are conducted, researchers must make decisions regarding how many cases and how many controls to include in the study. Connections between variants and diseases are made using association statistics, including χ².
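
As an illustration of the kind of association statistic mentioned above, the following is a minimal sketch of a Pearson chi-square statistic computed on a 2x2 table of allele counts in cases and controls. The function name, the allelic (rather than genotypic) test, and the example counts are assumptions made for illustration; the abstract does not say which form of the statistic the paper analyzes.

```python
def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2
    table of allele counts: rows are cases/controls, columns are
    alternate/reference alleles. Hypothetical helper for illustration.
    Assumes every row and column total is non-zero."""
    observed = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_totals)

    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of allele and disease status.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Made-up counts: 100 case alleles and 100 control alleles.
    print(allelic_chi_square(30, 70, 20, 80))
```

With a fixed total sample size, changing the split between cases and controls changes the row totals and therefore the expected counts, which is one way the case fraction enters the statistic.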

The demographic history of a population underlies patterns of genetic variation and is encoded in the gene-genealogical trees of the sampled haplotypes. Here we propose a demographic inference framework called the genealogical likelihood (gLike). Our method uses a graph-based structure to summarize the relationships among all lineages in a gene-genealogical tree with all possible trajectories of population memberships through time and derives the full likelihood across trees under a parameterized demographic model.

Scalable methods for estimating marginal coalescent trees across the genome present new opportunities for studying evolution and have generated considerable excitement, with new methods extending scalability to thousands of samples. Benchmarking of the available methods has revealed general tradeoffs between accuracy and scalability, but performance in downstream applications has not always been easily predictable from general performance measures, suggesting that specific features of the ancestral recombination graph (ARG) may be important for specific downstream applications of estimated ARGs. To exemplify this point, we benchmark ARG estimation methods with respect to a specific set of methods for estimating the historical time course of a population-mean polygenic score (PGS) using the marginal coalescent trees encoded by the ARG.

Genetic and phenotypic variation among populations is one of the fundamental subjects of evolutionary genetics. One question that arises often in data on natural populations is whether differentiation among populations on a particular trait might be caused in part by natural selection. For the past several decades, researchers have used QST-FST approaches to compare the amount of trait differentiation among populations on one or more traits (measured by the statistic QST) with differentiation on genome-wide genetic variants (measured by FST).
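
For orientation, the two statistics named above are often written as follows. This is a common textbook form for a diploid, outcrossing population with purely additive genetic effects, given here as an assumption; the abstract does not state which definitions the paper uses. The symbols \sigma^2_B and \sigma^2_W (additive genetic variance among and within populations) and H_T and H_S (expected heterozygosity in the total population and averaged within subpopulations) are introduced only for this sketch.

```latex
Q_{ST} = \frac{\sigma^2_B}{\sigma^2_B + 2\,\sigma^2_W},
\qquad
F_{ST} = \frac{H_T - H_S}{H_T}
```

Under these assumptions, a neutrally evolving trait is expected to show Q_{ST} close to F_{ST}, so a Q_{ST} well above F_{ST} is commonly read as a sign of divergent selection.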

Polygenic scores (PGSs) are being rapidly adopted for trait prediction in the clinic and beyond. PGSs are often thought of as capturing the direct genetic effect of one's genotype on their phenotype. However, because PGSs are constructed from population-level associations, they are influenced by factors other than direct genetic effects, including stratification, assortative mating, and dynastic effects ("SAD effects").
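
To make the construction concrete, here is a minimal sketch of how a polygenic score is commonly computed: a weighted sum of an individual's allele dosages, with weights taken from estimated per-variant effect sizes. The function name and the example numbers are hypothetical, and the abstract does not specify the exact construction used in the paper.

```python
def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1, or 2 copies per variant)
    using per-variant effect-size estimates, e.g. from a GWAS.
    Hypothetical helper for illustration only."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have equal length")
    return sum(beta * g for beta, g in zip(effect_sizes, dosages))


if __name__ == "__main__":
    # Made-up dosages and effect sizes for three variants.
    print(polygenic_score([2, 1, 0], [0.5, -0.2, 0.1]))
```

Because the effect sizes are estimated from population-level associations, anything that shifts those estimates also shifts the resulting score.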

Case-control genome-wide association studies (GWAS) are often used to find associations between genetic variants and diseases. When case-control GWAS are conducted, researchers must make decisions regarding how many cases and how many controls to include in the study. Depending on differing availability and cost of controls and cases, varying case fractions are used in case-control GWAS.

Microbes of nearly every species can form biofilms, communities of cells bound together by a self-produced matrix. It is not understood how variation at the cellular level impacts putatively beneficial, colony-level behaviors, such as cell-to-cell signaling. Here we investigate this problem with an agent-based computational model of metabolically driven electrochemical signaling in biofilms.

Genetic and phenotypic variation among populations is one of the fundamental subjects of evolutionary genetics. One question that arises often in data on natural populations is whether differentiation among populations on a particular trait might be caused in part by natural selection. For the past several decades, researchers have used QST-FST approaches to compare the amount of trait differentiation among populations on one or more traits (measured by the statistic QST) with differentiation on genome-wide genetic variants (measured by FST).

Article Synopsis
  • The goal in both statistical genetics and phylogenetics is to uncover relationships between genetic factors and traits, but their statistical methods differ significantly.
  • The increasing overlap in research areas like medicine and biology necessitates a unified approach, as traditional boundaries between these two fields become less clear.
  • By introducing a general covariance model, the authors illustrate that existing methods can be harmonized, allowing for shared techniques to improve research accuracy and mitigate misleading correlations in both genetics and evolutionary studies.

As genetic sequencing costs have plummeted, datasets with sizes previously unthinkable have begun to appear. Such datasets present new opportunities to learn about evolutionary history, particularly via rare alleles that record the very recent past. However, beyond the computational challenges inherent in the analysis of many large-scale datasets, large population-genetic datasets present theoretical problems.

Scalable methods for estimating marginal coalescent trees across the genome present new opportunities for studying evolution and have generated considerable excitement, with new methods extending scalability to thousands of samples. Benchmarking of the available methods has revealed general tradeoffs between accuracy and scalability, but performance in downstream applications has not always been easily predictable from general performance measures, suggesting that specific features of the ARG may be important for specific downstream applications of estimated ARGs. To exemplify this point, we benchmark ARG estimation methods with respect to a specific set of methods for estimating the historical time course of a population-mean polygenic score (PGS) using the marginal coalescent trees encoded by the ancestral recombination graph (ARG).

In both statistical genetics and phylogenetics, a major goal is to identify correlations between genetic loci or other aspects of the phenotype or environment and a focal trait. In these two fields, there are sophisticated but disparate statistical traditions aimed at these tasks. The disconnect between their respective approaches is becoming untenable as questions in medicine, conservation biology, and evolutionary biology increasingly rely on integrating data from within and among species, and once-clear conceptual divisions are becoming increasingly blurred.

Article Synopsis
  • It’s challenging to distinguish whether phenotypic differences between groups arise from genetic or environmental factors without controlled experiments.
  • Some researchers argue this issue mainly arises in extreme cases and propose methods that relate heritable variation within groups to that among groups.
  • The authors review three approaches (between-group heritability, a specific statistic from evolutionary genetics, and ancestry variation methods) and demonstrate mathematically that within-group data cannot adequately separate the genetic and environmental causes of differences between groups. Simulation results support the argument.

Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide association studies (GWASs) are a powerful way to find genetic loci associated with phenotypes. GWASs are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history.

Without the ability to control or randomize environments (or genotypes), it is difficult to determine the degree to which observed phenotypic differences between two groups of individuals are due to genetic vs. environmental differences. However, some have suggested that these concerns may be limited to pathological cases, and methods have appeared that seem to give, directly or indirectly, some support to claims that aggregate heritable variation within groups can be related to heritable variation among groups.

The demographic history of a population drives the pattern of genetic variation and is encoded in the gene-genealogical trees of the sampled alleles. However, existing methods to infer demographic history from genetic data tend to use relatively low-dimensional summaries of the genealogy, such as allele frequency spectra. As a step toward capturing more of the information encoded in the genome-wide sequence of genealogical trees, here we propose a novel framework called the genealogical likelihood (gLike), which derives the full likelihood of a genealogical tree under any hypothesized demographic history.

The 20 short tandem repeat (STR) loci of the combined DNA index system (CODIS) are the basis of the vast majority of forensic genetics in the United States. One argument for permissive rules about the collection of CODIS genotypes is that the CODIS loci are thought to contain little information about ancestry or traits. However, in the past 20 years, a growing field has identified hundreds of thousands of genotype-trait associations.

Sex differences in complex traits are suspected to be in part due to widespread gene-by-sex interactions (GxSex), but empirical evidence has been elusive. Here, we infer the mixture of ways in which polygenic effects on physiological traits covary between males and females. We find that GxSex is pervasive but acts primarily through systematic sex differences in the magnitude of many genetic effects ("amplification") rather than in the identity of causal variants.

Understanding the genetic basis of complex phenotypes is a central pursuit of genetics. Genome-wide association studies (GWAS) are a powerful way to find genetic loci associated with phenotypes. GWAS are widely and successfully used, but they face challenges related to the fact that variants are tested for association with a phenotype independently, whereas in reality variants at different sites are correlated because of their shared evolutionary history.

The 20 short tandem repeat (STR) markers of the combined DNA index system (CODIS) are the basis of the vast majority of forensic genetics in the United States. One argument for permissive rules about the collection of CODIS genotypes is that the CODIS markers are thought to contain information relevant to identification only (such as a human fingerprint would), with little information about ancestry or traits. However, in the past 20 years, a quickly growing field has identified hundreds of thousands of genotype-trait associations.

The 1997 film Gattaca has emerged as a canonical pop culture reference used to discuss modern controversies in genetics and bioethics. It appeared in theaters a few years prior to the announcement of the "completion" of the human genome (2000), as the science of human genetics was developing a renewed sense of its social implications. The story is set in a near-future world in which parents can, with technological assistance, influence the genetic composition of their offspring on the basis of predicted life outcomes.
