Identifying genomic regions shaped by natural selection is a central goal in evolutionary genomics. Existing machine learning methods for this task are typically trained using simulated genomic data labeled according to specific evolutionary scenarios. While effective in controlled settings, these models are limited by their reliance on explicit class labels. They can only detect the specific processes they were trained to recognize, making it difficult to interpret predictions for regions influenced by other evolutionary forces. This limitation is especially problematic when analyzing empirical genomes shaped by a mixture of adaptive, demographic, and ecological factors. One-vs.-rest strategies offer a potential alternative, but suffer from the inherent complexity of modeling all other evolutionary and demographic processes as a catch-all "rest" class. Here, we explore positive-unlabeled learning as a more flexible framework for detecting adaptive events. Positive-unlabeled learning is a semi-supervised approach that identifies samples of a target class using only positive labels and an unlabeled background, without requiring explicit modeling of negative samples. To assess the utility of this approach, we focus on a binary classification setting for detecting selective sweeps (positive samples) arising from positive natural selection against a mixed background composed of both unlabeled sweeps and neutrally evolving regions. To accomplish this goal, we introduce a method that employs only a set of labeled sweep observations for training while treating all remaining data as unlabeled. By avoiding assumptions about the composition of the background, the method enables robust sweep discovery in realistic genomic landscapes.
We systematically evaluate its performance across a range of demographic, adaptive, and confounding contexts, including domain shift arising from misspecified demographic models, and find that the method delivers high performance and generalizability. To demonstrate a practical application, we analyzed European and Bengali in Bangladesh human genomes, treating the empirical genomes as unlabeled sets, and recapitulated several previously identified sweep candidates. Our results show that the method provides a powerful and versatile alternative for detecting adaptive regions, with the potential to generalize across a range of genomic landscapes.
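To make the positive-unlabeled setting described above concrete, the following is a minimal sketch of one well-known way to learn from positive and unlabeled data, the Elkan-Noto correction. It is not the method introduced in this abstract, whose details are not given here. The sketch assumes labeled sweeps are a random subset of all sweeps (the "selected completely at random" assumption), and everything in it is hypothetical: the one-dimensional toy summary statistic, the function names (`simulate`, `fit_logistic`, `run_demo`), and all parameter values.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ss, lr=0.1, epochs=300):
    # Batch gradient descent on the log-loss for a single feature.
    # The target s is "labeled vs. unlabeled", not "sweep vs. neutral".
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, s in zip(xs, ss):
            err = sigmoid(w * x + b) - s
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


def simulate(n, sweep_frac, label_frac, rng):
    # Hypothetical toy data: one summary statistic per genomic window,
    # sweeps ~ N(3, 1), neutral ~ N(-3, 1). y is the hidden truth;
    # s = 1 only for the subset of sweeps that received a positive label.
    xs, ys, ss = [], [], []
    for _ in range(n):
        y = 1 if rng.random() < sweep_frac else 0
        x = rng.gauss(3.0 if y else -3.0, 1.0)
        s = 1 if (y == 1 and rng.random() < label_frac) else 0
        xs.append(x)
        ys.append(y)
        ss.append(s)
    return xs, ys, ss


def run_demo(seed=0):
    rng = random.Random(seed)
    xs, ys, ss = simulate(1000, sweep_frac=0.5, label_frac=0.5, rng=rng)
    cut = 700  # first 700 windows for training, the rest held out
    w, b = fit_logistic(xs[:cut], ss[:cut])
    held = list(zip(xs[cut:], ys[cut:], ss[cut:]))

    # Estimate c = P(labeled | sweep) as the mean score on held-out
    # labeled positives, then rescale: P(sweep | x) ~ g(x) / c.
    g_pos = [sigmoid(w * x + b) for x, _, s in held if s == 1]
    c_hat = sum(g_pos) / len(g_pos)
    probs = [min(1.0, sigmoid(w * x + b) / c_hat) for x, _, _ in held]

    # Accuracy against the hidden truth, available only in simulation.
    correct = sum((p > 0.5) == (y == 1) for p, (_, y, _) in zip(probs, held))
    return c_hat, correct / len(held), probs


if __name__ == "__main__":
    c_hat, acc, _ = run_demo()
    print(f"estimated label frequency c: {c_hat:.3f}")
    print(f"held-out accuracy vs. hidden truth: {acc:.3f}")
```

The classifier never sees a labeled neutral region: it only separates labeled sweeps from everything else, and the rescaling by the estimated label frequency converts that score into an estimate of the sweep probability. With empirical genomes the hidden truth `ys` would not exist, so the accuracy check is possible only in simulation.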
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12393379 | PMC |
| http://dx.doi.org/10.1101/2025.08.15.670602 | DOI Listing |
J Craniofac Surg
September 2025
Division of Plastic and Reconstructive Surgery Medical Center, Los Angeles, CA.
Auricular reconstruction is essential for restoring facial symmetry and achieving a well-contoured, natural-appearing ear. Traditional methods using autologous costal cartilage often delay reconstruction until around age 10, when sufficient rib cartilage is available, which can pose physical and psychological challenges for pediatric patients. Porous high-density polyethylene (PHDPE) implants offer significant advantages, including the ability to perform reconstruction earlier, reduced morbidity, improved ear definition, and the possibility of a single-stage outpatient procedure.
Annu Rev Microbiol
September 2025
Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, France.
Cyanobacteria played a pivotal role in shaping Earth's early history and today are key players in many ecosystems. As versatile and ubiquitous phototrophs, they are used as models for oxygenic photosynthesis, nitrogen fixation, circadian rhythms, symbiosis, and adaptations to harsh environments. Cyanobacterial genomes and metagenomes exhibit high levels of genomic diversity partly driven by gene flow within and across species.
Sci Adv
September 2025
Department of Cell & Molecular Biology, St. Jude Children's Research Hospital, Memphis, TN, USA.
Somatic mitochondrial DNA (mtDNA) mutations are frequently observed in tumors, yet their role in pediatric cancers remains poorly understood. The heteroplasmic nature of mtDNA, in which mutant and wild-type mtDNA coexist, complicates efforts to define its contribution to disease progression. In this study, bulk whole-genome sequencing of 637 matched tumor-normal samples from the Pediatric Cancer Genome Project revealed an enrichment of functionally impactful mtDNA variants in specific pediatric leukemia subtypes.
PLoS One
September 2025
HUN-REN Centre for Ecological Research, Institute of Evolution, Budapest, Hungary.
We develop a model that integrates evolutionary matrix game theory with Mendelian genetics. Within this framework, we define the genotype dynamics that describes how the frequencies of genotypes change in sexual diploid populations. We show that our formal definition of evolutionary stability for genotype distributions implies the stability of the corresponding interior equilibrium point in the genotype dynamics.
Bioinformatics
September 2025
Department of Biostatistics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States.
Summary: Causal mediation analysis investigates the role of mediators in the relationship between exposure and outcome. In the analysis of omics or imaging data, mediators are often high-dimensional, presenting challenges such as multicollinearity and interpretability. Existing methods either compromise interpretability or fail to effectively prioritize mediators.