Publications by authors named "David Knowles"

Leveraging functional annotations to map rare variants associated with Alzheimer disease with gruyere.

Anjali Das, Chirag Lakhani, Chloé Terwagne, Jui-Shan T Lin, Tatsuhiko Naito, David A Knowles

Am J Hum Genet

September 2025

Increased availability of whole-genome sequencing (WGS) has facilitated the study of rare variants (RVs) in complex diseases. Multiple RV association tests are available to study the relationship between genotype and phenotype, but most do not fully leverage the availability of variant-level functional annotations. We propose genome-wide rare variant enrichment evaluation (gruyere), an empirical Bayesian framework that complements existing methods by learning global, trait-specific weights for functional annotations to improve variant prioritization.

Deriving Mendelian Randomization-based Causal Networks of Brain Imaging Phenotypes and Bipolar Disorder.

Shane O'Connell, Brielin C Brown, Dara M Cannon, Pilib Ó Broin, Nadine Parker, David A Knowles

Biol Psychiatry Cogn Neurosci Neuroimaging

August 2025

Background: Neuroanatomical variation in individuals with bipolar disorder (BD) has been previously described in observational studies. However, the causal dynamics of these relationships remain unexplored.

Methods: We performed Mendelian Randomization of 297 structural and functional neuroimaging phenotypes from the UK Biobank and BD using GWAS summary statistics.
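
The abstract does not say which Mendelian Randomization estimator was used. As a minimal sketch of how a causal effect can be estimated from GWAS summary statistics, the widely used fixed-effect inverse-variance weighted (IVW) estimator is shown below. The function name and the toy inputs are hypothetical and are not taken from the paper.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW Mendelian Randomization estimate.

    Each list holds one entry per genetic instrument (variant):
    beta_exposure: variant effect on the exposure (e.g. an imaging phenotype)
    beta_outcome:  variant effect on the outcome (e.g. a disorder)
    se_outcome:    standard error of beta_outcome
    Returns (estimate, standard_error).
    """
    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / se ** 2          # inverse variance of the outcome effect
        numerator += weight * bx * by
        denominator += weight * bx * bx
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Toy data (made up): both variants imply a ratio by / bx of 0.5.
est, se = ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.01])
print(est, se)
```

The estimator is a weighted regression of outcome effects on exposure effects through the origin. It assumes all instruments are valid, which is why robust alternatives are often reported alongside it.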

Cas13d-mediated isoform-specific RNA knockdown with a unified computational and experimental toolbox.

Megan D Schertzer, Andrew Stirn, Keren Isaev, Laura Pereira, Stella H Park, David A Knowles

Nat Commun

July 2025

Pre- and post-transcriptional mechanisms, including alternative promoters, termination signals, and splicing, play essential roles in diversifying protein output by generating distinct RNA and protein isoforms. Two major challenges in characterizing the cellular function of alternative isoforms are the lack of experimental methods to specifically and efficiently modulate isoform expression and computational tools for complex experimental design and analysis. To address these gaps, we develop and methodically test an isoform-specific knockdown strategy which pairs the RNA-targeting CRISPR/Cas13d system with guide RNAs that span exon-exon junctions.
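
To illustrate the idea of guide RNAs that span exon-exon junctions, the following is a minimal sketch that enumerates candidate spacers whose target window crosses a junction. It is not the authors' toolbox: the function names, the 23-nt spacer length, and the minimum overlap of 5 nt on each side are assumptions chosen for illustration.

```python
def reverse_complement_rna(seq):
    """Reverse complement of an RNA sequence containing only A, C, G, U."""
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def junction_spanning_guides(upstream_exon, downstream_exon,
                             spacer_len=23, min_overlap=5):
    """List spacers whose target site covers the exon-exon junction.

    spacer_len and min_overlap are illustrative assumptions.
    A window qualifies if at least min_overlap target bases come
    from each of the two exons.
    """
    transcript = (upstream_exon + downstream_exon).upper().replace("T", "U")
    junction = len(upstream_exon)  # index of first base of downstream exon
    guides = []
    for start in range(len(transcript) - spacer_len + 1):
        end = start + spacer_len
        left = junction - start    # target bases from the upstream exon
        right = end - junction     # target bases from the downstream exon
        if left >= min_overlap and right >= min_overlap:
            target = transcript[start:end]
            guides.append({
                "target_start": start,
                "upstream_bases": left,
                "downstream_bases": right,
                # The spacer pairs with the target, so it is its reverse complement.
                "spacer": reverse_complement_rna(target),
            })
    return guides


# Toy exons (made up) so the junction is easy to see.
candidates = junction_spanning_guides("A" * 30, "C" * 30)
print(len(candidates), candidates[0]["spacer"])
```

Because each target window exists only in transcripts that join these two exons, a spacer from this list would not be expected to fully match isoforms that skip either exon. Ranking the candidates by predicted efficacy is a separate step not shown here.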

A computational framework for mapping isoform landscape and regulatory mechanisms from spatial transcriptomics data.

Jiayu Su, Yiming Qu, Megan Schertzer, Haochen Yang, Jiahao Jiang, David A Knowles

bioRxiv

May 2025

Transcript diversity, including splicing and alternative 3' end usage, is crucial for cellular identity and adaptation, yet its spatial coordination remains poorly understood. Here, we present SPLISOSM (SpatiaL ISOform Statistical Modeling), a computational framework for detecting isoform-resolution patterns from spatial transcriptomics data. SPLISOSM leverages multivariate testing to account for spot- and isoform-level dependencies, demonstrating robust and theoretically grounded performance on sparse data.

Perplexity as a Metric for Isoform Diversity in the Human Transcriptome.

Megan D Schertzer, Stella H Park, Jiayu Su, Gloria M Sheynkman, David A Knowles

bioRxiv

July 2025

Long-read sequencing (LRS) has revealed a far greater diversity of RNA isoforms than earlier technologies, increasing the critical need to determine which, and how many, isoforms per gene are biologically meaningful. To define the space of relevant isoforms from LRS, many existing analysis pipelines rely on arbitrary expression cutoffs, but a single threshold cannot accommodate the broad variability in isoform complexity across genes, cell types, and disease states captured by LRS. To address this, we propose using perplexity, an interpretable measure derived from entropy, that quantifies the effective number of isoforms per gene based on the full, unfiltered isoform ratio distribution.
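
As a minimal sketch of the measure named in the title, perplexity can be computed as the exponential of the Shannon entropy of a gene's isoform ratio distribution. The function name and the toy read counts below are hypothetical; the paper's exact implementation may differ.

```python
import math


def isoform_perplexity(counts):
    """Effective number of isoforms for one gene.

    counts: non-negative read counts (or abundances), one per isoform.
    Returns exp(H), where H is the Shannon entropy (natural log)
    of the isoform ratio distribution.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("gene has no reads")
    entropy = 0.0
    for count in counts:
        if count > 0:                  # 0 * log(0) is treated as 0
            ratio = count / total
            entropy -= ratio * math.log(ratio)
    return math.exp(entropy)


# Toy examples (made-up counts).
print(isoform_perplexity([10, 10, 10, 10]))  # four equally used isoforms
print(isoform_perplexity([100, 0, 0]))       # one isoform carries all reads
print(isoform_perplexity([90, 5, 5]))        # one dominant, two minor isoforms
```

Four equally used isoforms give a perplexity of 4 and a single expressed isoform gives 1, while a dominant isoform with two minor ones lands between 1 and 2. No expression cutoff is needed because every isoform contributes in proportion to its ratio.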

Ensuring the structural integrity of tokamak fusion power plants: challenges, progress and pathway.

Yiqiang Wang, Giacomo Aiello, Gerald Pintsuk, Dmitry Terentyev, Jeong-Ha You, David Knowles

Philos Trans A Math Phys Eng Sci

July 2025

Structural integrity for fusion is an integrated multi-disciplinary subject spanning the science of materials, technology, engineering, health monitoring and simulation methods and algorithms for scrutinizing the assurance of reliable fusion reactor performance from the whole plant design phase through operation to decommissioning. Structural integrity is essential for maintaining high standards of public, environmental and investment protection and maximizing economic benefits. While fusion shares many of the structural integrity challenges faced by other industries, it also presents unique complexities.

Mosaic chromosomal alterations in blood are associated with an increased risk of Alzheimer's disease.

Tatsuhiko Naito, Kosei Hirata, Beomjin Jang, Chirag M Lakhani, Alice Buonfiglioli, David A Knowles

medRxiv

June 2025

Mosaic chromosomal alterations (mCAs) in blood, a form of clonal hematopoiesis, have been linked to various diseases, but their role in Alzheimer's disease (AD) remains unclear. We analyzed blood whole-genome sequencing (WGS) data from 24,049 individuals in the Alzheimer's Disease Sequencing Project and found that autosomal mCAs were significantly associated with increased AD risk (odds ratio = 1.27; = 1.

Alternating hemiplegia of childhood associated mutations in Atp1a3 reveal diverse neurological alterations in mice.

Markus Terrey, Georgii Krivoshein, Scott I Adamson, Elena Arystarkhova, Laura Anderson, David A Knowles

Neurobiol Dis

August 2025

Pathogenic variants in the neuronal Na+/K+ ATPase transmembrane ion transporter (ATP1A3) cause a spectrum of neurological disorders including alternating hemiplegia of childhood (AHC). The most common de novo pathogenic variants in AHC are p.D801N (~40% of patients) and p.

CONVEX APPROACHES TO ISOLATE THE SHARED AND DISTINCT GENETIC STRUCTURES OF SUBPHENOTYPES IN HETEROGENEOUS COMPLEX TRAITS.

Saikat Banerjee, Shane O'Connell, Sarah M C Colbert, Niamh Mullins, David A Knowles

medRxiv

April 2025

Groups of complex diseases, such as coronary heart diseases, neuropsychiatric disorders, and cancers, often display overlapping clinical symptoms and pharmacological treatments. The shared associations of genetic variants across diseases have the potential to explain their underlying biological processes, but this remains poorly understood. To address this, we model the matrix of summary statistics of trait-associated genetic variants as the sum of a low-rank component, representing shared biological processes, and a sparse component, representing unique processes and arbitrarily corrupted or contaminated components.
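
The abstract does not give the objective function. One standard convex formulation of a low-rank plus sparse decomposition is principal component pursuit, sketched below as an assumption about the general form rather than the authors' exact model. Here X denotes the matrix of summary statistics, L the low-rank component, S the sparse component, and λ > 0 a trade-off parameter; all symbols are introduced for illustration.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad X = L + S
```

The nuclear norm \|L\|_{*} (the sum of singular values) is a convex surrogate for rank, and the entrywise \ell_1 norm \|S\|_{1} is a convex surrogate for sparsity, which is what makes the problem convex.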

Phenotypic complexities of rare heterozygous neurexin-1 deletions.

Michael B Fernando, Yu Fan, Yanchun Zhang, Alex Tokolyi, Aleta N Murphy, David A Knowles

Nature

June 2025

Given the large number of genes significantly associated with risk for neuropsychiatric disorders, a critical unanswered question is the extent to which diverse mutations, sometimes affecting the same gene, will require tailored therapeutic strategies. Here we consider this in the context of rare neuropsychiatric disorder-associated copy number variants (2p16.3) resulting in heterozygous deletions in NRXN1, which encodes a presynaptic cell-adhesion protein that serves as a critical synaptic organizer in the brain.

SingleBrain: A Meta-Analysis of Single-Nucleus eQTLs Linking Genetic Risk to Brain Disorders.

Beomjin Jang, Kailash Bp, Alex Tokolyi, Winston H Cuddleston, Ashvin Ravi, David A Knowles

medRxiv

March 2025

Most genetic risk variants for neurological diseases are located in non-coding regulatory regions, where they may often act as expression quantitative trait loci (eQTLs), modulating gene expression and influencing disease susceptibility. However, eQTL studies in bulk brain tissue or specific cell types lack the resolution to capture the brain's cellular diversity. Single-nucleus RNA sequencing (snRNA-seq) offers high-resolution mapping of eQTLs across diverse brain cell types.

Musculoskeletal Computed Tomography: How to Add Value When Reporting Adult Lower Limb Trauma.

Yacer Asran, Thomas Mutungi, Kapil Shirodkar, Ganesh Hegde, Sameer Shamshuddin, David Knowles

J Comput Assist Tomogr

March 2025

Computed tomography plays an ever-increasing role in the management of fractures and dislocations due to its capability in efficiently providing multiplanar reformats and 3-dimensional volume-rendered images. It can reveal findings that are occult on plain radiography and therefore allow for more accurate decision making with regard to fracture classification and management. Clinical radiologists play a critical role in facilitating the processing of imaging to provide adequate image reformats in the desired planes and in producing 3-dimensional images, but most crucially in identifying pertinent findings, which inform the choice between nonoperative and operative management and potentially influence surgical technique.

Deriving Mendelian Randomization-based Causal Networks of Brain Imaging Phenotypes and Bipolar Disorder.

Shane O'Connell, Brielin C Brown, Dara M Cannon, Pilib Ó Broin, David A Knowles

medRxiv

December 2024

Neuroanatomical variation in individuals with bipolar disorder (BD) has been previously described in observational studies. However, the causal dynamics of these relationships remain unexplored. We performed Mendelian Randomization of 297 structural and functional neuroimaging phenotypes from the UK Biobank and BD using genome-wide association study summary statistics.

Leveraging functional annotations to map rare variants associated with Alzheimer's disease with gruyere.

Anjali Das, Chirag Lakhani, Chloé Terwagne, Jui-Shan T Lin, Tatsuhiko Naito, David A Knowles

medRxiv

March 2025

The increasing availability of whole-genome sequencing (WGS) has begun to elucidate the contribution of rare variants (RVs), both coding and non-coding, to complex disease. Multiple RV association tests are available to study the relationship between genotype and phenotype, but most are restricted to per-gene models and do not fully leverage the availability of variant-level functional annotations. We propose Genome-wide Rare Variant EnRichment Evaluation (gruyere), a Bayesian probabilistic model that complements existing methods by learning global, trait-specific weights for functional annotations to improve variant prioritization.

Vector embeddings by sequence similarity and context for improved compression, similarity search, clustering, organization, and manipulation of cDNA libraries.

Daniel H Um, David A Knowles, Gail E Kaiser

Comput Biol Chem

February 2025

This paper demonstrates the utility of organized numerical representations of genes in research involving flat string gene formats (i.e., FASTA/FASTQ).

Disentangling Interpretable Factors with Supervised Independent Subspace Principal Component Analysis.

Jiayu Su, David A Knowles, Raul Rabadan

ArXiv

October 2024

The success of machine learning models relies heavily on effectively representing high-dimensional data. However, ensuring data representations capture human-understandable concepts remains difficult, often requiring the incorporation of prior knowledge and decomposition of data into multiple subspaces. Traditional linear methods fall short in modeling more than one space, while more expressive deep learning approaches lack interpretability.

A Bayesian framework for inferring dynamic intercellular interactions from time-series single-cell data.

Cameron Park, Shouvik Mani, Nicolas Beltran-Velez, Katie Maurer, Teddy Huang, David A Knowles

Genome Res

October 2024

Characterizing cell-cell communication and tracking its variability over time are crucial for understanding the coordination of biological processes mediating normal development, disease progression, and responses to perturbations such as therapies. Existing tools fail to capture time-dependent intercellular interactions and primarily rely on databases compiled from limited contexts. We introduce DIISCO, a Bayesian framework designed to characterize the temporal dynamics of cellular interactions using single-cell RNA-sequencing data from multiple time points.

Smoother: a unified and modular framework for incorporating structural dependency in spatial omics data.

Jiayu Su, Jean-Baptiste Reynier, Xi Fu, Guojie Zhong, Jiahao Jiang, David A Knowles

Genome Biol

December 2023

Spatial omics technologies can help identify spatially organized biological processes, but existing computational approaches often overlook structural dependencies in the data. Here, we introduce Smoother, a unified framework that integrates positional information into non-spatial models via modular priors and losses. In simulated and real datasets, Smoother enables accurate data imputation, cell-type deconvolution, and dimensionality reduction with remarkable efficiency.

DIISCO: A Bayesian framework for inferring dynamic intercellular interactions from time-series single-cell data.

Cameron Park, Shouvik Mani, Nicolas Beltran-Velez, Katie Maurer, Satyen Gohil, David A Knowles

bioRxiv

November 2023

Characterizing cell-cell communication and tracking its variability over time is essential for understanding the coordination of biological processes mediating normal development, progression of disease, or responses to perturbations such as therapies. Existing tools lack the ability to capture time-dependent intercellular interactions, such as those influenced by therapy, and primarily rely on existing databases compiled from limited contexts. We present DIISCO, a Bayesian framework for characterizing the temporal dynamics of cellular interactions using single-cell RNA-sequencing data from multiple time points.

Phenotypic complexities of rare heterozygous neurexin-1 deletions.

Michael B Fernando, Yu Fan, Yanchun Zhang, Alex Tokolyi, Aleta N Murphy, David A Knowles

bioRxiv

November 2024

Given the large number of genes significantly associated with risk for neuropsychiatric disorders, a critical unanswered question is the extent to which diverse mutations, sometimes impacting the same gene, will require tailored therapeutic strategies. Here we consider this in the context of rare neuropsychiatric disorder-associated copy number variants (2p16.3) resulting in heterozygous deletions in NRXN1, a pre-synaptic cell adhesion protein that serves as a critical synaptic organizer in the brain.

Large-scale causal discovery using interventional data sheds light on the regulatory network architecture of blood traits.

Brielin C Brown, John A Morris, Tuuli Lappalainen, David A Knowles

bioRxiv

October 2023

Inference of directed biological networks is an important but notoriously challenging problem. We introduce , an approach to learning causal networks that leverages large-scale intervention-response data. Applied to 788 genes from the genome-wide perturb-seq dataset, helps elucidate the network architecture of blood traits.

Cas13d-mediated isoform-specific RNA knockdown with a unified computational and experimental toolbox.

Megan D Schertzer, Andrew Stirn, Keren Isaev, Laura Pereira, Anjali Das, David A Knowles

bioRxiv

September 2023

Alternative splicing is an essential mechanism for diversifying proteins, in which mature RNA isoforms produce proteins with potentially distinct functions. Two major challenges in characterizing the cellular function of isoforms are the lack of experimental methods to specifically and efficiently modulate isoform expression and computational tools for complex experimental design. To address these gaps, we developed and methodically tested a strategy which pairs the RNA-targeting CRISPR/Cas13d system with guide RNAs that span exon-exon junctions in the mature RNA.

Multiset correlation and factor analysis enables exploration of multi-omics data.

Brielin C Brown, Collin Wang, Silva Kasela, François Aguet, Daniel C Nachun, David A Knowles

Cell Genom

August 2023

Article Synopsis

Multi-omics datasets are increasingly common, creating a need for integration methods that unlock their potential. A new technique, multi-set correlation and factor analysis (MCFA), addresses this need and aids in analyzing complex genomic data.
MCFA was applied to various biological data (methylation, protein, RNA, and metabolite levels) from 614 samples, revealing strong clustering by ancestry without the need for genetic data and highlighting unique technical variations in individual datasets.
The study also incorporated genetic data through a genome-wide association study (GWAS), identifying several factors linked to genetic traits and metabolic diseases, thereby setting a groundwork for future research using large multi-modal genomic datasets.

Single-cell multi-omics defines the cell-type-specific impact of splicing aberrations in human hematopoietic clonal outgrowths.

Mariela Cortés-López, Paulina Chamely, Allegra G Hawkins, Robert F Stanley, Ariel D Swett, David A Knowles

Cell Stem Cell

September 2023

Article Synopsis

RNA splicing factors are often mutated in blood disorders like myelodysplastic syndrome (MDS), affecting how blood cells develop, but the role of these mutations in blood formation is still not fully understood.
Researchers used a new method, GoT-Splice, which combines gene profiling and advanced single-cell analysis to study how mutations in a specific splicing factor (SF3B1) influence blood progenitor cells.
Their findings showed that SF3B1 mutations lead to abnormal splicing patterns and an increase in specific blood cell types before MDS is clinically evident, highlighting the importance of understanding these mutations in early disease progression.

Prediction of on-target and off-target activity of CRISPR-Cas13d guide RNAs using deep learning.

Hans-Hermann Wessels, Andrew Stirn, Alejandro Méndez-Mancilla, Eric J Kim, Sydney K Hart, David A Knowles

Nat Biotechnol

April 2024

Transcriptome engineering applications in living cells with RNA-targeting CRISPR effectors depend on accurate prediction of on-target activity and off-target avoidance. Here we design and test ~200,000 RfxCas13d guide RNAs targeting essential genes in human cells with systematically designed mismatches and insertions and deletions (indels). We find that mismatches and indels have a position- and context-dependent impact on Cas13d activity, and mismatches that result in G-U wobble pairings are better tolerated than other single-base mismatches.
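
To make the G-U wobble distinction concrete, the sketch below labels each position of a guide-target RNA duplex as a Watson-Crick match, a G-U wobble, or another mismatch. It only illustrates the base-pairing rule and is not the paper's deep learning model; the function names and toy sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(guide_base, target_base):
    """Classify one guide-target base pair in an RNA-RNA duplex."""
    pair = (guide_base.upper().replace("T", "U"),
            target_base.upper().replace("T", "U"))
    if pair in WATSON_CRICK:
        return "match"
    if pair in WOBBLE:
        return "wobble"
    return "mismatch"


def classify_duplex(spacer, target):
    """Label every position of a spacer paired with its target site.

    Both sequences are written 5' to 3'. Pairing is antiparallel,
    so the spacer is compared with the reversed target.
    """
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have equal length")
    return [classify_pair(g, t) for g, t in zip(spacer, reversed(target))]


# Toy example (made up): the first spacer base G faces a U in the target.
print(classify_duplex("GAC", "GUU"))
```

A feature like this lets wobble pairs be counted separately from other single-base mismatches when comparing guide activity.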
