Background: Molecular structures can be represented as strings of special characters using SMILES. Since each molecule is represented as a string, the similarity between compounds can be computed using SMILES-based string similarity functions. Most previous studies on drug-target interaction prediction use 2D-based compound similarity kernels such as SIMCOMP. To the best of our knowledge, using SMILES-based similarity functions, which are computationally more efficient than the 2D-based kernels, has not been investigated for this task before.
Results: In this study, we adapt and evaluate various SMILES-based similarity methods for drug-target interaction prediction. In addition, inspired by the vector space model of Information Retrieval, we propose cosine similarity based SMILES kernels that use the Term Frequency (TF) and Term Frequency-Inverse Document Frequency (TF-IDF) weighting approaches. We also investigate composite kernels that combine our best SMILES-based similarity functions with the SIMCOMP kernel. In total, we compare 13 ligand similarity functions, each of which operates on the SMILES string representation of a molecule.
Conclusion: The more efficient SMILES-based similarity functions performed similarly to the more complex 2D-based SIMCOMP kernel in terms of AUC-ROC scores. The TF-IDF based cosine similarity obtained a better AUC-PR score than the SIMCOMP kernel on the GPCR benchmark data set. The composite kernel of TF-IDF based cosine similarity and SIMCOMP achieved the best AUC-PR scores for all data sets.
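As an illustration of the TF-IDF based cosine similarity idea described above, the following is a minimal sketch rather than the authors' implementation. The abstract does not say how a SMILES string is split into terms, so the overlapping length-`q` substrings used here are an assumption, as are the smoothed IDF formula and all function names (`smiles_terms`, `tfidf_vectors`, `cosine`, `similarity_matrix`).

```python
import math
from collections import Counter


def smiles_terms(smiles, q=4):
    """Split a SMILES string into overlapping substrings of length q.

    Assumption: this tokenization is chosen for illustration only;
    the abstract does not specify how terms are defined.
    """
    if len(smiles) < q:
        return [smiles]
    return [smiles[i:i + q] for i in range(len(smiles) - q + 1)]


def tfidf_vectors(smiles_list, q=4):
    """Return one sparse TF-IDF vector (term -> weight) per molecule."""
    term_counts = [Counter(smiles_terms(s, q)) for s in smiles_list]
    n_docs = len(smiles_list)

    # Document frequency: in how many molecules each term occurs.
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())

    # Smoothed IDF (an assumed variant) so that terms present in every
    # molecule still receive a nonzero weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}

    return [{t: tf * idf[t] for t, tf in counts.items()}
            for counts in term_counts]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(smiles_list, q=4):
    """Pairwise TF-IDF cosine similarities for a list of SMILES strings."""
    vectors = tfidf_vectors(smiles_list, q)
    return [[cosine(a, b) for b in vectors] for a in vectors]


if __name__ == "__main__":
    # Ethanol, propanol, benzene, phenol; q=2 because the strings are short.
    molecules = ["CCO", "CCCO", "c1ccccc1", "c1ccccc1O"]
    for row in similarity_matrix(molecules, q=2):
        print(["%.3f" % value for value in row])
```

Because the vectors are built from plain substring counts, the whole similarity matrix can be computed without parsing any molecular graph, which is consistent with the efficiency argument made for SMILES-based functions in the Background.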
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4797122 | PMC |
http://dx.doi.org/10.1186/s12859-016-0977-x | DOI Listing |
Anal Chem
September 2025
Department of Chemistry, The University of Akron, Akron, Ohio 44325, United States.
Tires are complex polymeric materials composed of rubber elastomers (both natural and synthetic), fillers, steel wire, textiles, and a range of antioxidant and curing systems. These constituents are distributed differently among the various tire parts, which are classified based on their function and proximity to the rim. This study presents a rapid and sensitive approach for the characterization of tire components using mild thermal desorption/pyrolysis (TDPy) coupled to direct analysis in real-time mass spectrometry (DART-MS).
Inorg Chem
September 2025
Pacific Northwest National Laboratory, Richland, Washington 99352, United States.
The solvation structure of an Np ion in an aqueous, noncomplexing and nonoxidizing environment of trifluoromethanesulfonic (triflic) acid was investigated with X-ray absorption spectroscopy (XAS) combined with ab initio molecular dynamics (AIMD) and time-dependent density functional theory (TDDFT) calculations. Np L-edge X-ray absorption near-edge structure (XANES) and extended X-ray absorption fine structure (EXAFS) data were collected for Np in 1, 3, and 7 M triflic acid using a laboratory-scale spectrometer and separately at a synchrotron facility, producing data sets in excellent agreement. TDDFT calculations revealed a weak pre-edge feature not previously reported for Np L-edge XANES.
Plant J
September 2025
State Key Laboratory of Plant Diversity and Specialty Crops, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, Hubei, 430074, China.
Trapa L. is a non-cereal aquatic crop with significant economic and ecological value. However, debates over its classification have caused uncertainties in species differentiation and the mechanisms of polyploid speciation.
Microb Genom
September 2025
Department of Infectious Diseases and Public Health, Jockey Club College of Veterinary Medicine and Life Sciences, City University of Hong Kong, Hong Kong, PR China.
African swine fever virus (ASFV) is highly transmissible and can cause up to 100% mortality in pigs. The virus has spread across most regions of Asia and Europe, resulting in the deaths of millions of pigs. A deep understanding of the genetic diversity and evolutionary dynamics of ASFV is necessary to effectively manage outbreaks.
Microb Genom
September 2025
International Centre of Excellence for Aquatic Animal Health, The Centre for Environment, Fisheries and Aquaculture Science, Weymouth, DT4 8UB, UK.
High rates of mortality of the common cockle, , have occurred in the Wash Estuary, UK, since 2008. A previous study linked the mortalities to a novel genotype of , with a strong correlation between cockle moribundity and the presence of . Here, we characterize a novel iridovirus, identified by chance during metagenomic sequencing of a gradient purification of cells, with the presence also correlated to cockle moribundity.