Typical genotyping workflows map reads to a reference genome before identifying genetic variants. Generating such alignments introduces reference biases and carries a substantial computational burden. Furthermore, short read lengths limit the ability to characterize repetitive genomic regions, which are particularly challenging for fast k-mer-based genotypers. In the present study, we propose a new algorithm, PanGenie, that leverages a haplotype-resolved pangenome reference together with k-mer counts from short-read sequencing data to genotype a wide spectrum of genetic variation, a process we refer to as genome inference. Compared with mapping-based approaches, PanGenie is more than 4 times faster at 30-fold coverage and achieves better genotype concordances for almost all variant types and coverages tested. Improvements are especially pronounced for large insertions (≥50 bp) and variants in repetitive regions, enabling the inclusion of these classes of variants in genome-wide association studies. PanGenie efficiently leverages the increasing number of haplotype-resolved assemblies to unravel the functional impact of previously inaccessible variants while being faster than alignment-based workflows.
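As a rough illustration of what "k-mer counts from short-read sequencing data" can mean in practice, the sketch below counts k-mers in a set of reads and sums the counts of k-mers unique to each of two candidate alleles. It is a toy example only and is not PanGenie's algorithm, which additionally uses the haplotype information in the pangenome reference; the function names (`count_kmers`, `allele_support`) and the tiny sequences are assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer occurring in a list of read strings."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_set(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def allele_support(read_counts, ref_allele, alt_allele, k):
    """Sum read counts of k-mers unique to each allele (hypothetical helper).

    K-mers shared by both alleles carry no information about which
    allele is present, so they are excluded from both sums.
    """
    ref_kmers = kmer_set(ref_allele, k)
    alt_kmers = kmer_set(alt_allele, k)
    ref_support = sum(read_counts[km] for km in ref_kmers - alt_kmers)
    alt_support = sum(read_counts[km] for km in alt_kmers - ref_kmers)
    return ref_support, alt_support


# Toy data: both reads carry the reference allele sequence.
reads = ["ACGTAC", "ACGTAC"]
counts = count_kmers(reads, k=3)
print(allele_support(counts, "ACGTAC", "ACGGAC", k=3))
```

In this toy setting, reads that all carry the reference sequence yield support only for the reference allele; a real genotyper would also need to model sequencing errors, coverage, and k-mers that occur elsewhere in the genome.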
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9005351 | PMC |
| http://dx.doi.org/10.1038/s41588-022-01043-w | DOI Listing |
J Affect Disord
September 2025
Department of Neurology, The First Affiliated Hospital of Guangxi University of Chinese Medicine, Guangxi University of Chinese Medicine, Nanning, 530023, PR China.
Objective: Major depressive disorder (MDD) is among the most prevalent and debilitating mental health conditions worldwide. This study aims to investigate the bidirectional causal relationship between immune cells and MDD using Mendelian randomization (MR) analysis and determine whether metabolites mediate this relationship.
Methods: We compiled and analyzed whole-genome data for 731 immune cell traits, 1091 blood metabolites, 309 metabolic ratios, and disease data from 170,756 individuals with MDD and 329,443 controls.
Cell Syst
September 2025
Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
Spatial transcriptomics allows for the measurement of gene expression within the native tissue context. However, despite technological advancements, computational methods to link cell states with their microenvironment and compare these relationships across samples and conditions remain limited. To address this, we introduce Tissue Motif-Based Spatial Inference across Conditions (TissueMosaic), a self-supervised convolutional neural network designed to discover and represent tissue architectural motifs from multi-sample spatial transcriptomic datasets.
Comput Biol Chem
September 2025
Department of Biotechnology, Deenbandhu Chhotu Ram University of Science & Technology, Murthal, Haryana 131039, India.
Lentinula edodes (shiitake mushroom) is a widely cultivated edible and medicinal fungus, valued for its bioactive compounds. While East Asian strains have been well studied, Indian populations remain under-characterized. This study explores the genetic and functional diversity of five Indian-origin L.
Mol Biol Rep
September 2025
ICAR-Central Institute of Fisheries Education, Versova, Mumbai, 400061, India.
Background: Labeo fimbriatus (Bloch, 1795) is a medium-sized South Asian minor carp with ecological significance and emerging aquaculture potential, particularly in polyculture systems with Indian major carps. Despite its wide distribution, it remains underrepresented in phylogenetic studies, and limited genomic resources are available. Here, we report the complete mitochondrial genome sequence of L.
Syst Biol
September 2025
Department of Ecology, Evolution, and Environmental Biology, Columbia University, New York, NY 10027, USA.
Genomes are composed of a mosaic of segments inherited from different ancestors, each separated by past recombination events. Consequently, genealogical relationships among multiple genomes vary spatially across different genomic regions. Genealogical variation among unlinked (uncorrelated) genomic regions is well described for either a single population (coalescent) or multiple structured populations (multispecies coalescent).