Am J Hum Genet
September 2025
Increased availability of whole-genome sequencing (WGS) has facilitated the study of rare variants (RVs) in complex diseases. Multiple RV association tests are available to study the relationship between genotype and phenotype, but most do not fully leverage the availability of variant-level functional annotations. We propose genome-wide rare variant enrichment evaluation (gruyere), an empirical Bayesian framework that complements existing methods by learning global, trait-specific weights for functional annotations to improve variant prioritization.
Biol Psychiatry Cogn Neurosci Neuroimaging
August 2025
Background: Neuroanatomical variation in individuals with bipolar disorder (BD) has been previously described in observational studies. However, the causal dynamics of these relationships remain unexplored.
Methods: We performed Mendelian Randomization of 297 structural and functional neuroimaging phenotypes from the UK Biobank and BD using GWAS summary statistics.
Pre- and post-transcriptional mechanisms, including alternative promoters, termination signals, and splicing, play essential roles in diversifying protein output by generating distinct RNA and protein isoforms. Two major challenges in characterizing the cellular function of alternative isoforms are the lack of experimental methods to specifically and efficiently modulate isoform expression and computational tools for complex experimental design and analysis. To address these gaps, we develop and methodically test an isoform-specific knockdown strategy which pairs the RNA-targeting CRISPR/Cas13d system with guide RNAs that span exon-exon junctions.
Transcript diversity including splicing and alternative 3'end usage is crucial for cellular identity and adaptation, yet its spatial coordination remains poorly understood. Here, we present SPLISOSM (SpatiaL ISOform Statistical Modeling), a computational framework for detecting isoform-resolution patterns from spatial transcriptomics data. SPLISOSM leverages multivariate testing to account for spot- and isoform-level dependencies, demonstrating robust and theoretically grounded performance on sparse data.
Long-read sequencing (LRS) has revealed a far greater diversity of RNA isoforms than earlier technologies, increasing the critical need to determine which, and how many, isoforms per gene are biologically meaningful. To define the space of relevant isoforms from LRS, many existing analysis pipelines rely on arbitrary expression cutoffs, but a single threshold cannot accommodate the broad variability in isoform complexity across genes, cell types, and disease states captured by LRS. To address this, we propose using an interpretable measure derived from entropy that quantifies the effective number of isoforms per gene based on the full, unfiltered isoform ratio distribution.
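As a rough illustration of an entropy-derived effective number of isoforms, the sketch below takes the exponential of the Shannon entropy of a gene's isoform ratios. This is one standard way to turn entropy into an "effective count"; whether it matches the measure proposed in the abstract exactly is an assumption, and the function name `effective_isoform_count` and the read counts are hypothetical.

```python
import math


def effective_isoform_count(counts):
    """Exponential of the Shannon entropy of the isoform ratio distribution.

    `counts` holds per-isoform read counts for one gene (hypothetical input
    format). No expression cutoff is applied; isoforms with zero reads
    simply contribute nothing to the entropy.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("gene has no reads")
    ratios = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in ratios)
    return math.exp(entropy)


# Four equally used isoforms versus one dominant isoform with three minor ones.
even = effective_isoform_count([25, 25, 25, 25])
skewed = effective_isoform_count([97, 1, 1, 1])
print(f"even usage: {even:.2f}, skewed usage: {skewed:.2f}")
```

Under this definition a gene with four equally used isoforms scores four, while a gene dominated by a single isoform scores close to one even though four isoforms are detected, which is the behavior a fixed expression threshold cannot reproduce across genes.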
Philos Trans A Math Phys Eng Sci
July 2025
Structural integrity for fusion is an integrated multi-disciplinary subject spanning the science of materials, technology, engineering, health monitoring and simulation methods and algorithms for scrutinizing the assurance of reliable fusion reactor performance from the whole plant design phase through operation to decommissioning. Structural integrity is essential for maintaining high standards of public, environmental and investment protection and maximizing economic benefits. While fusion shares many of the structural integrity challenges faced by other industries, it also presents unique complexities.
Mosaic chromosomal alterations (mCAs) in blood, a form of clonal hematopoiesis, have been linked to various diseases, but their role in Alzheimer's disease (AD) remains unclear. We analyzed blood whole-genome sequencing (WGS) data from 24,049 individuals in the Alzheimer's Disease Sequencing Project and found that autosomal mCAs were significantly associated with increased AD risk (odds ratio = 1.27; = 1.
Pathogenic variants in the neuronal Na/K ATPase transmembrane ion transporter (ATP1A3) cause a spectrum of neurological disorders including alternating hemiplegia of childhood (AHC). The most common de novo pathogenic variants in AHC are p.D801N (∼40 % of patients) and p.
Groups of complex diseases, such as coronary heart diseases, neuropsychiatric disorders, and cancers, often display overlapping clinical symptoms and pharmacological treatments. The shared associations of genetic variants across diseases have the potential to explain their underlying biological processes, but this remains poorly understood. To address this, we model the matrix of summary statistics of trait-associated genetic variants as the sum of a low-rank component, representing shared biological processes, and a sparse component, representing unique processes and arbitrarily corrupted or contaminated components.
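The low-rank plus sparse idea above can be sketched with a generic alternating heuristic: fit a truncated SVD to the matrix minus the current sparse part, then soft-threshold the residual. This is not the authors' algorithm, only a minimal illustration under stated assumptions; the function name `low_rank_plus_sparse`, the penalty `lam`, and the toy matrix are all hypothetical.

```python
import numpy as np


def low_rank_plus_sparse(M, rank=1, lam=1.0, n_iter=50):
    """Split M into L (rank <= `rank`) plus S (sparse) by alternating updates.

    Each pass minimizes 0.5 * ||M - L - S||_F^2 + lam * ||S||_1 over one
    block at a time: a truncated SVD for L, entrywise soft-thresholding for S.
    """
    M = np.asarray(M, dtype=float)
    S = np.zeros_like(M)
    for _ in range(n_iter):
        # Best rank-`rank` approximation of the part not explained by S.
        U, s, Vt = np.linalg.svd(M - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Soft-threshold the residual so only large deviations remain in S.
        R = M - L
        S = np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)
    return L, S


# Toy "variants x traits" matrix: one shared rank-1 pattern plus one outlier.
shared = np.outer(np.arange(1.0, 7.0), [1.0, -1.0, 2.0, 0.5])
M = shared.copy()
M[2, 3] += 10.0  # a single trait-specific effect

L, S = low_rank_plus_sparse(M, rank=1, lam=1.0)
spike = tuple(int(i) for i in np.unravel_index(np.argmax(np.abs(S)), S.shape))
print("largest sparse entry at", spike)
```

In this toy setting the shared pattern lands in `L` and the injected trait-specific effect dominates `S`. With real summary statistics, the rank and the sparsity penalty would have to be chosen with care.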
Given the large number of genes significantly associated with risk for neuropsychiatric disorders, a critical unanswered question is the extent to which diverse mutations, sometimes affecting the same gene, will require tailored therapeutic strategies. Here we consider this in the context of rare neuropsychiatric disorder-associated copy number variants (2p16.3) resulting in heterozygous deletions in NRXN1, which encodes a presynaptic cell-adhesion protein that serves as a critical synaptic organizer in the brain.
Most genetic risk variants for neurological diseases are located in non-coding regulatory regions, where they may often act as expression quantitative trait loci (eQTLs), modulating gene expression and influencing disease susceptibility. However, eQTL studies in bulk brain tissue or specific cell types lack the resolution to capture the brain's cellular diversity. Single-nucleus RNA sequencing (snRNA-seq) offers high-resolution mapping of eQTLs across diverse brain cell types.
J Comput Assist Tomogr
March 2025
Computed tomography plays an ever-increasing role in the management of fractures and dislocations due to its capability in efficiently providing multiplanar reformats and 3-dimensional volume rendered images. It can reveal findings that are occult on plain radiography and therefore allow for more accurate decision making with regard to fracture classification and management. Clinical radiologists play a critical role in facilitating the processing of imaging to provide adequate image reformats in the desired planes, producing 3-dimensional images and, most crucially, identifying pertinent findings that inform the choice between nonoperative and operative management and potentially influence surgical technique.
Neuroanatomical variation in individuals with bipolar disorder (BD) has been previously described in observational studies. However, the causal dynamics of these relationships remain unexplored. We performed Mendelian Randomization of 297 structural and functional neuroimaging phenotypes from the UK BioBank and BD using genome-wide association study summary statistics.
The increasing availability of whole-genome sequencing (WGS) has begun to elucidate the contribution of rare variants (RVs), both coding and non-coding, to complex disease. Multiple RV association tests are available to study the relationship between genotype and phenotype, but most are restricted to per-gene models and do not fully leverage the availability of variant-level functional annotations. We propose Genome-wide Rare Variant EnRichment Evaluation (gruyere), a Bayesian probabilistic model that complements existing methods by learning global, trait-specific weights for functional annotations to improve variant prioritization.
This paper demonstrates the utility of organized numerical representations of genes in research involving flat string gene formats (i.e., FASTA/FASTQ).
The success of machine learning models relies heavily on effectively representing high-dimensional data. However, ensuring data representations capture human-understandable concepts remains difficult, often requiring the incorporation of prior knowledge and decomposition of data into multiple subspaces. Traditional linear methods fall short in modeling more than one space, while more expressive deep learning approaches lack interpretability.
Characterizing cell-cell communication and tracking its variability over time are crucial for understanding the coordination of biological processes mediating normal development, disease progression, and responses to perturbations such as therapies. Existing tools fail to capture time-dependent intercellular interactions and primarily rely on databases compiled from limited contexts. We introduce DIISCO, a Bayesian framework designed to characterize the temporal dynamics of cellular interactions using single-cell RNA-sequencing data from multiple time points.
Spatial omics technologies can help identify spatially organized biological processes, but existing computational approaches often overlook structural dependencies in the data. Here, we introduce Smoother, a unified framework that integrates positional information into non-spatial models via modular priors and losses. In simulated and real datasets, Smoother enables accurate data imputation, cell-type deconvolution, and dimensionality reduction with remarkable efficiency.
Characterizing cell-cell communication and tracking its variability over time is essential for understanding the coordination of biological processes mediating normal development, progression of disease, or responses to perturbations such as therapies. Existing tools lack the ability to capture time-dependent intercellular interactions, such as those influenced by therapy, and primarily rely on existing databases compiled from limited contexts. We present DIISCO, a Bayesian framework for characterizing the temporal dynamics of cellular interactions using single-cell RNA-sequencing data from multiple time points.
Given the large number of genes significantly associated with risk for neuropsychiatric disorders, a critical unanswered question is the extent to which diverse mutations, sometimes impacting the same gene, will require tailored therapeutic strategies. Here we consider this in the context of rare neuropsychiatric disorder-associated copy number variants (2p16.3) resulting in heterozygous deletions in , a pre-synaptic cell adhesion protein that serves as a critical synaptic organizer in the brain.
Inference of directed biological networks is an important but notoriously challenging problem. We introduce , an approach to learning causal networks that leverages large-scale intervention-response data. Applied to 788 genes from the genome-wide perturb-seq dataset, helps elucidate the network architecture of blood traits.
Alternative splicing is an essential mechanism for diversifying proteins, in which mature RNA isoforms produce proteins with potentially distinct functions. Two major challenges in characterizing the cellular function of isoforms are the lack of experimental methods to specifically and efficiently modulate isoform expression and computational tools for complex experimental design. To address these gaps, we developed and methodically tested a strategy which pairs the RNA-targeting CRISPR/Cas13d system with guide RNAs that span exon-exon junctions in the mature RNA.
Transcriptome engineering applications in living cells with RNA-targeting CRISPR effectors depend on accurate prediction of on-target activity and off-target avoidance. Here we design and test ~200,000 RfxCas13d guide RNAs targeting essential genes in human cells with systematically designed mismatches and insertions and deletions (indels). We find that mismatches and indels have a position- and context-dependent impact on Cas13d activity, and mismatches that result in G-U wobble pairings are better tolerated than other single-base mismatches.