Visualization of a Machine Learning Framework toward Highly Sensitive Qualitative Analysis by SERS.

Anal Chem

State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China.

Published: July 2022


Citations: 20

Article Abstract

Surface-enhanced Raman spectroscopy (SERS), which provides fingerprint information at near-single-molecule sensitivity, is a powerful tool for the trace analysis of a target in a complicated matrix, and its use has been greatly facilitated by the development of modern machine learning algorithms. However, both the high demand for mass data and the low interpretability of black-box operation significantly limit the application of well-trained models to real systems. To address these two issues, we constructed a novel machine learning framework (Vis-CAD) that integrates a visual random forest, a characteristic amplifier, and data augmentation. The introduction of data augmentation significantly reduced the requirement for mass data, and the visualization of the random forest clearly presented the captured features, from which one could judge the reliability of the algorithm. Taking the trace analysis of individual polycyclic aromatic hydrocarbons in a mixture as an example, an accuracy of no less than 99% was achieved under the optimized conditions. The visualization of the framework demonstrated that the captured features were well correlated with the characteristic Raman peaks of each individual component. Furthermore, the sensitivity toward trace components could be improved by at least 1 order of magnitude compared with inspection by the naked eye. The proposed algorithm, distinguished by its lower demand for mass data and the visualization of its operation process, offers a new way toward the robust application of machine learning algorithms, which could bring push-to-the-limit sensitivity to the qualitative and quantitative analysis of trace targets, not only in the field of SERS but also in the wider spectroscopy world. Vis-CAD is implemented in the Python programming language and is open-source at https://github.com/3331822w/Vis-CAD.
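
To make the data augmentation idea concrete, here is a minimal sketch in Python using only the standard library. It is not taken from the Vis-CAD repository. The abstract does not say which augmentation operations the authors use, so the three perturbations below (a small shift along the spectral axis, intensity scaling, additive Gaussian noise) are assumptions chosen for illustration. The function names `gaussian_peak`, `augment`, and `augment_dataset`, and all parameter values, are hypothetical.

```python
import math
import random


def gaussian_peak(n_points, center, width, height=1.0):
    """Synthetic spectrum with one Gaussian band (a stand-in for a Raman peak)."""
    return [height * math.exp(-0.5 * ((i - center) / width) ** 2)
            for i in range(n_points)]


def augment(spectrum, rng, max_shift=2, noise_sd=0.01, scale_range=(0.8, 1.2)):
    """Return one perturbed copy of a spectrum.

    Assumed perturbations (not confirmed by the source):
    an integer shift along the axis, a global intensity scale,
    and additive Gaussian noise on every point.
    """
    n = len(spectrum)
    shift = rng.randint(-max_shift, max_shift)
    scale = rng.uniform(*scale_range)
    out = []
    for i in range(n):
        j = min(max(i - shift, 0), n - 1)  # clamp at the edges instead of wrapping
        out.append(scale * spectrum[j] + rng.gauss(0.0, noise_sd))
    return out


def augment_dataset(spectra, labels, copies, seed=0):
    """Keep each measured spectrum and append `copies` perturbed versions of it."""
    rng = random.Random(seed)  # seeded so the augmented set is reproducible
    aug_x, aug_y = [], []
    for spectrum, label in zip(spectra, labels):
        aug_x.append(list(spectrum))
        aug_y.append(label)
        for _ in range(copies):
            aug_x.append(augment(spectrum, rng))
            aug_y.append(label)
    return aug_x, aug_y


# Two hypothetical analytes, each with a single characteristic band.
spectra = [gaussian_peak(100, 30, 3.0), gaussian_peak(100, 70, 3.0)]
labels = ["analyte_A", "analyte_B"]
train_x, train_y = augment_dataset(spectra, labels, copies=5, seed=1)
```

With `copies=5`, two measured spectra become twelve labeled training spectra. A classifier such as a random forest could then be trained on `train_x` and `train_y`, and its per-feature importances compared against the known band positions, which is the kind of check the abstract describes for judging whether the model has captured real Raman features.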

Source
http://dx.doi.org/10.1021/acs.analchem.2c01450

Publication Analysis

Top Keywords

machine learning: 16
mass data: 12
trace analysis: 8
learning algorithms: 8
demand mass: 8
random forest: 8
data augmentation: 8
data visualization: 8
data: 5
visualization: 4

Similar Publications

Aim: The purpose of this study was to assess the accuracy of a customized deep learning model based on CNN and U-Net for detecting and segmenting the second mesiobuccal canal (MB2) of maxillary first molar teeth on cone beam computed tomography (CBCT) scans.

Methodology: CBCT scans of 37 patients were imported into 3D Slicer software to crop and segment the canals of the mesiobuccal (MB) root of the maxillary first molar. The annotated data were divided into two groups: 80% for training and validation and 20% for testing.

Obsessive-compulsive disorder (OCD) is a chronic and disabling condition affecting approximately 3.5% of the global population, with diagnosis on average delayed by 7.1 years or often confounded with other psychiatric disorders.

Early prediction of orthodontic gingival enlargement using S100A4: a biomarker-based risk stratification model.

Odontology

September 2025

Department of Periodontics, Saveetha Dental College and Hospital, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India.

Orthodontic-induced gingival enlargement (OIGE) affects approximately 15-30% of patients undergoing orthodontic treatment and remains largely unpredictable, often relying on subjective clinical assessments made after irreversible tissue changes have occurred. S100A4 is a well-characterized marker of activated fibroblasts involved in pathological tissue remodeling. This was a cross-sectional precision biomarker study that analyzed gingival tissue samples from three groups: healthy controls (n = 60), orthodontic patients without gingival enlargement (n = 31), and patients with clinically diagnosed OIGE (n = 61).

Purpose: The study aims to compare the treatment recommendations generated by four leading large language models (LLMs) with those from 21 sarcoma centers' multidisciplinary tumor boards (MTBs) of the sarcoma ring trial in managing complex soft tissue sarcoma (STS) cases.

Methods: We simulated STS-MTBs using four LLMs: Llama 3.2-vision:90b, Claude 3.
