Article Abstract

Infrared (IR) spectroscopy, a type of vibrational spectroscopy, provides extensive molecular structure details and is a highly effective technique for chemists to determine molecular structures. However, analyzing experimental spectra has always been challenging due to the specialized knowledge required and the variability of spectra under different experimental conditions. Here, we propose a transformer-based model with a patch-based self-attention spectrum embedding layer, designed to prevent the loss of spectral information while maintaining simplicity and effectiveness. To further enhance the model's understanding of IR spectra, we introduce a data augmentation approach, which selectively introduces vertical noise only at absorption peaks. Our approach not only achieves state-of-the-art performance on simulated data sets but also attains a top-1 accuracy of 55% on real experimental spectra, surpassing the previous state-of-the-art by approximately 10%. Additionally, our model demonstrates proficiency in analyzing intricate and variable fingerprint regions, effectively extracting critical structural information.
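The two ideas named in the abstract, cutting a spectrum into patches and adding vertical noise only at absorption peaks, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the patch size, the local-maximum peak detector, the threshold, and the multiplicative Gaussian noise are all assumptions chosen for illustration, and the self-attention embedding itself is not shown.

```python
import numpy as np


def split_into_patches(spectrum, patch_size):
    """Split a 1-D spectrum into non-overlapping patches.

    The tail is zero-padded so the length divides evenly (an assumption;
    the paper's actual patching scheme may differ).
    """
    s = np.asarray(spectrum, dtype=float)
    pad = (-len(s)) % patch_size
    return np.pad(s, (0, pad)).reshape(-1, patch_size)


def find_peaks_simple(spectrum, threshold):
    """Return indices of interior local maxima above `threshold`.

    A deliberately simple stand-in for a real peak detector.
    """
    s = np.asarray(spectrum, dtype=float)
    is_peak = (s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:]) & (s[1:-1] > threshold)
    return np.flatnonzero(is_peak) + 1


def add_peak_noise(spectrum, threshold=0.1, scale=0.05, rng=None):
    """Perturb intensities only at detected peaks; all other points stay unchanged.

    The noise model (multiplicative Gaussian) is hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    s = np.asarray(spectrum, dtype=float).copy()
    peaks = find_peaks_simple(s, threshold)
    s[peaks] *= 1.0 + rng.normal(0.0, scale, size=peaks.size)
    return np.clip(s, 0.0, None)


# Toy absorbance-like spectrum with two peaks (indices 2 and 6).
toy = np.array([0.0, 0.2, 1.0, 0.2, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0])
patches = split_into_patches(toy, patch_size=4)
augmented = add_peak_noise(toy, rng=np.random.default_rng(0))
```

In a transformer pipeline, each row of `patches` would typically be projected to an embedding vector before attention is applied; here the rows are left as raw intensities.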

Source
http://dx.doi.org/10.1021/acs.jpca.4c05665

Publication Analysis

Top Keywords

molecular structures: 8
patch-based self-attention: 8
experimental spectra: 8
spectra: 5
transformer-based models: 4
models predicting: 4
predicting molecular: 4
structures infrared: 4
infrared spectra: 4
spectra patch-based: 4

Similar Publications

Population genetics plays a critical role in creating policies for fisheries management, conservation, and aquaculture development. The golden snapper, Lutjanus johnii (Bloch, 1792), is a snapper species of high commercial and aquaculture importance. This study used the mitochondrial markers D-loop (151 specimens) and Cytochrome b (Cyt-b, 120 specimens) from 10 populations, including populations from the east South China Sea, the west South China Sea and the Strait of Malacca, to investigate the genetic diversity, population connectivity, and historical demography of L.

The first complete plastid genome of the critically endangered, endemic species Valeriana trinervis (= Centranthus trinervis) was sequenced, assembled, and compared with all published Valeriana plastomes. We found differences in the inverted repeat boundaries and in the type and abundance of repeats, but similarities in codon usage and microsatellite numbers.

Traditional drug discovery methods like high-throughput screening and molecular docking are slow and costly. This study introduces a machine learning framework to predict bioactivity (pIC₅₀) and identify key molecular properties and structural features for targeting Trypanothione reductase (TR), Protein kinase C theta (PKC-θ), and Cannabinoid receptor 1 (CB1) using data from the ChEMBL database. Molecular fingerprints, generated via PaDEL-Descriptor and RDKit, encoded structural features as binary vectors.

Cyclin-dependent kinase 20 (CDK20), also known as cell cycle-related kinase (CCRK), plays a pivotal role in hepatocellular carcinoma (HCC) progression by regulating β-catenin signaling and promoting uncontrolled proliferation. Despite its emerging significance, selective small-molecule inhibitors of CDK20 remain unexplored. In this study, a known CDK20 inhibitor, ISM042-2-048, was employed as a reference to retrieve structurally similar compounds from the PubChem database using an 85% similarity threshold.

Background: Hearing loss (HL) is one of the most common congenital anomalies and is a complex, etiologically diverse condition. Molecular genetic characterization of HL remains challenging owing to its high genetic heterogeneity. This study aimed to screen for potential disease-causing genetic variations in a cohort of Indian patients with congenital bilateral severe-to-profound sensorineural HL.
