Matching sequences by their k-mers is increasingly common in bioinformatic applications, including metagenomic sequence classification. The accuracy of these downstream applications relies on the density of the reference databases, which, luckily, are growing rapidly. While the increased density provides hope for dramatic improvements in accuracy, scalability is a concern. Reference k-mers are kept in memory at query time, and storing all k-mers of these ever-expanding databases is fast becoming impractical. Several strategies for subsampling have been proposed, including minimizers and finding taxon-specific k-mers. However, we contend that these strategies are inadequate, especially when reference sets are taxonomically imbalanced, as most microbial libraries are. In this paper, we explore approaches for selecting a fixed-size subset of the k-mers present in an ultra-large dataset to include in a library such that the classification of reads suffers the least. Our experiments demonstrate the limitations of existing approaches, especially for novel and poorly sampled groups. We propose a library construction algorithm called KRANK (K-mer RANKer) that combines several components, including a hierarchical selection strategy with adaptive size restrictions and an equitable coverage strategy. We implement KRANK in highly optimized code and combine it with the locality-sensitive-hashing classifier CONSULT-II to build a taxonomic classification and profiling method. On several benchmarks, KRANK k-mer selection dramatically reduces memory consumption with minimal loss in classification accuracy. We show in extensive analyses based on CAMI benchmarks that KRANK outperforms k-mer-based alternatives in terms of taxonomic profiling and comes close to the best marker-based methods in terms of accuracy.
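For readers unfamiliar with the minimizer subsampling that the abstract names as an existing strategy, the following is a minimal sketch of the general idea, not KRANK's selection algorithm and not the code of any particular tool. The function names, the lexicographic ordering, and the toy sequence are assumptions made for illustration; practical implementations typically order k-mers by a hash and work with canonical k-mers.

```python
def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def minimizers(seq, k, w):
    """Return the set of (position, k-mer) minimizers of seq.

    For each window of w consecutive k-mers, keep the smallest one.
    Assumption: plain lexicographic order, leftmost k-mer wins ties.
    """
    all_kmers = list(kmers(seq, k))
    selected = set()
    for start in range(len(all_kmers) - w + 1):
        window = all_kmers[start:start + w]
        # min() returns the first index among equally small k-mers.
        offset = min(range(w), key=lambda j: window[j])
        selected.add((start + offset, window[offset]))
    return selected


# Hypothetical toy input: 8 overlapping 3-mers, windows of 4 k-mers.
toy_sequence = "ACGTACGTGA"
all_toy_kmers = list(kmers(toy_sequence, 3))
kept = minimizers(toy_sequence, 3, 4)
print(len(all_toy_kmers), "k-mers in total;", len(kept), "kept as minimizers")
```

Because neighboring windows usually share their smallest k-mer, only a fraction of the k-mers is retained. The selection depends only on the local sequence, which is why such a scheme does not account for how evenly taxa are represented in the reference set.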
| Source | Link |
|---|---|
| PMC | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11257464 |
| DOI | http://dx.doi.org/10.1101/2024.02.12.580015 |
J Phys Chem A
September 2025
Department of Chemistry, Graduate School of Science, Tohoku University, Sendai 980-8578, Japan.
The gas-phase structures of dibenzo-24-crown-8 (DB24C8) and dinaphtho-24-crown-8 (DN24C8) complexes with divalent metal ions (Mg, Ca, Sr, Ba, Fe, Ni, and Zn) were investigated by cryogenic ion mobility-mass spectrometry (IM-MS) in combination with density functional theory calculations. Several complexes, particularly those of DN24C8, exhibited multiple coexisting conformers. DFT-optimized structures were classified based on the relative orientation of the two aromatic rings in the crown ether.
ACS Omega
August 2025
Institute for Materials Chemistry and Engineering, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan.
Survivin, a protein overexpressed in various fetal and malignant tumor tissues, induces tumor progression and resistance to cancer therapy. Cell surface vimentin has N-acetylglucosamine (GlcNAc)-binding activity in several cell types, including tumor cells. Furthermore, GlcNAc-bearing polymers downregulate the expression of the survivin-encoding baculoviral inhibitor of apoptosis protein repeat-containing protein 5 ().
Elife
August 2025
Howard Hughes Medical Institute, Seattle, United States.
Somatic hypermutation (SHM) is the diversity-generating process in antibody affinity maturation. Probabilistic models of SHM are needed for analyzing rare mutations, understanding the selective forces guiding affinity maturation, and understanding the underlying biochemical process. High-throughput data offers the potential to develop and fit models of SHM on relevant data sets.
Antimicrob Agents Chemother
August 2025
Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA.
BMS-818251, a fostemsavir analog, is a next-generation HIV-1 attachment inhibitor with enhanced potency and a similar resistance profile. By using viral outgrowth assays with HIV+ donor samples, we demonstrate here that BMS-818251 exhibits superior viral suppression compared to temsavir, the active form of fostemsavir. To map potential resistance pathways, we employed deep mutational scanning and pseudotyped virus neutralization assays to identify escape mutations within the HIV-1 envelope glycoprotein (Env).
Org Lett
September 2025
Faculty of Medicine, Dentistry and Pharmaceutical Sciences, Okayama University, 1-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Japan.
Non-natural base pair formation provides insight into new functions of nucleic acids. Therefore, various artificial base pairs have been developed in both DNA and RNA. In this work, we successfully synthesized pseudocytidine from commercially available pseudouridine to form base pairs with isoguanine, also known as 2-OH-adenine, in RNA.