A semiparametric kernel independence test with application to mutational signatures.

J Am Stat Assoc

Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.

Published: February 2021


Citations

20

Article Abstract

Cancers arise owing to somatic mutations, and the characteristic combinations of somatic mutations form mutational signatures. Despite many mutational signatures being identified, mutational processes underlying a number of mutational signatures remain unknown, which hinders the identification of interventions that may reduce somatic mutation burdens and prevent the development of cancer. We demonstrate that the unknown cause of a mutational signature can be inferred from associated signatures with known etiology. However, existing association tests are not statistically powerful due to excess zeros in mutational signatures data. To address this limitation, we propose a semiparametric kernel independence test (SKIT). The SKIT statistic is defined as the integrated squared distance between mixed probability distributions and is decomposed into four disjoint components to pinpoint the source of dependency. We derive the asymptotic null distribution and prove the asymptotic convergence of power. Due to slow convergence to the asymptotic null distribution, a bootstrap method is employed to compute p-values. Simulation studies demonstrate that when zeros are prevalent, SKIT is more resilient to power loss than existing tests and robust to random errors. We applied SKIT to The Cancer Genome Atlas (TCGA) mutational signatures data for over 9,000 tumors across 32 cancer types, and identified a novel association between signature 17 curated in the Catalogue Of Somatic Mutations In Cancer (COSMIC) and apolipoprotein B mRNA editing enzyme (APOBEC) signatures in gastrointestinal cancers. This finding indicates that APOBEC activity is likely associated with the unknown cause of signature 17.
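To make the resampling step concrete, the sketch below computes a bootstrap p-value for an independence test on zero-inflated data. It is not the SKIT statistic, whose kernel form and four-component decomposition are not given in the abstract. As a stand-in it uses a simple Cramér-von Mises-type statistic: the mean squared gap between the joint empirical CDF and the product of the marginal empirical CDFs. The function names, the exponential model for nonzero values, and the null bootstrap that resamples each margin independently are all assumptions made for illustration.

```python
import random


def cvm_independence_stat(x, y):
    """Mean squared gap between the joint ECDF and the product of marginal ECDFs.

    Stand-in statistic (assumption); this is NOT the SKIT statistic.
    """
    n = len(x)
    total = 0.0
    for xi, yi in zip(x, y):
        fx = sum(a <= xi for a in x) / n
        fy = sum(b <= yi for b in y) / n
        fxy = sum(a <= xi and b <= yi for a, b in zip(x, y)) / n
        total += (fxy - fx * fy) ** 2
    return total / n


def bootstrap_pvalue(x, y, n_boot=200, seed=0):
    """Bootstrap p-value under the null of independence."""
    rng = random.Random(seed)
    n = len(x)
    observed = cvm_independence_stat(x, y)
    exceed = 0
    for _ in range(n_boot):
        # Resample each margin independently, so independence holds by construction.
        xb = [x[rng.randrange(n)] for _ in range(n)]
        yb = [y[rng.randrange(n)] for _ in range(n)]
        if cvm_independence_stat(xb, yb) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


def zero_inflated_pair(n, p_zero, dependent, seed=1):
    """Simulate hypothetical zero-inflated exposures (point mass at 0, exponential tail)."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        xi = 0.0 if rng.random() < p_zero else rng.expovariate(1.0)
        if dependent:
            # y is zero exactly when x is zero, and tracks x otherwise.
            yi = 0.0 if xi == 0.0 else xi + 0.1 * rng.expovariate(1.0)
        else:
            yi = 0.0 if rng.random() < p_zero else rng.expovariate(1.0)
        x.append(xi)
        y.append(yi)
    return x, y


if __name__ == "__main__":
    for dependent in (True, False):
        x, y = zero_inflated_pair(50, 0.5, dependent)
        print("dependent" if dependent else "independent", bootstrap_pvalue(x, y))
```

In this toy setup the dependence comes both from shared zeros and from correlated nonzero values, which loosely mirrors why a decomposition into components could help locate the source of dependency. The decomposition itself is not attempted here.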

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9070557
DOI: http://dx.doi.org/10.1080/01621459.2020.1871357

Similar Publications

Distinct codon usage signatures reflecting evolutionary and pathogenic adaptation in the Acinetobacter baumannii complex.

Eur J Clin Microbiol Infect Dis

September 2025

School of Bioengineering and Biosciences, Department of Biochemistry, Lovely Professional University, Punjab, 144411, India.

Purpose: This study investigates codon usage and amino acid usage bias in the genus Acinetobacter to uncover the evolutionary forces shaping these patterns and their implications for pathogenicity and biotechnology.

Methods: Codon usage patterns were examined in representative genomes of the genus Acinetobacter using standard codon bias indices, including GC content, relative synonymous codon usage (RSCU), effective number of codons (ENC), and codon adaptation index (CAI). Neutrality and parity plots were employed to evaluate the relative influence of mutational pressure and natural selection on codon preferences.

The microglial surface protein Triggering Receptor Expressed on Myeloid Cells 2 (TREM2) plays a critical role in mediating brain homeostasis and inflammatory responses in Alzheimer's disease (AD). The soluble form of TREM2 (sTREM2) exhibits neuroprotective effects in AD, though the underlying mechanisms remain elusive. Moreover, differences in ligand binding between TREM2 and sTREM2, which have major implications for their roles in AD pathology, remain unexplained.

Given the limited diagnostic technologies and treatment options available for lung adenocarcinoma (LUAD) patients with liver metastases, it is crucial to identify potential genomic signatures associated with liver metastasis, which could significantly contribute to the development of improved diagnostic tools and treatment strategies for LUAD patients with liver metastases. In this study, we identified specific genetic alterations in tumor samples with liver metastases by targeted capture sequencing. The results showed that the significantly higher mutation frequencies of , and in LUAD patients with liver metastases and and mutations found in both tumor tissues and plasma samples from patients with liver metastases.

Signatures of selective sweeps in continuous-space populations.

Genetics

September 2025

Department of Computational Biology, Cornell University, Ithaca, NY 14853, USA.

Selective sweeps describe the process by which an adaptive mutation arises and rapidly fixes in the population, thereby removing genetic variation in its genomic vicinity. The expected signatures of selective sweeps are relatively well understood in panmictic population models, yet natural populations often extend across larger geographic ranges where individuals are more likely to mate with those born nearby. To investigate how such spatial population structure can affect sweep dynamics and signatures, we simulated selective sweeps in populations inhabiting a two-dimensional continuous landscape.

The European Health Data Space (EHDS) will help researchers use health data across EU Member States (MS). Currently, cross-border research faces heterogeneous data access processes. Using a real-world use case, this paper analyses challenges and opportunities brought by the upcoming implementation of the EHDS, assessing the situation before and after the regulation comes into force.
