Kernel methods are powerful machine learning techniques that use generic non-linear functions to solve complex tasks. They have a solid mathematical foundation and exhibit excellent performance in practice. However, kernel machines are still considered black-box models because the kernel feature mapping cannot be accessed directly, which makes the kernels difficult to interpret. The aim of this work is to show that it is indeed possible to interpret the functions learned by various kernel methods, as they can be intuitive despite their complexity. Specifically, we show that derivatives of these functions have a simple mathematical formulation, are easy to compute, and can be applied to various problems. The derivative of the model function in a kernel machine is proportional to the kernel function derivative, and we provide the explicit analytic form of the first and second derivatives of the most common kernel functions with respect to the inputs, as well as generic formulas to compute higher-order derivatives. We use them to analyze the most widely used supervised and unsupervised kernel learning methods: Gaussian Processes for regression, Support Vector Machines for classification, Kernel Entropy Component Analysis for density estimation, and the Hilbert-Schmidt Independence Criterion for estimating the dependency between random variables. In all cases, we express the derivative of the learned function as a linear combination of kernel function derivatives. Moreover, we provide intuitive explanations through illustrative toy examples and show how these same kernel methods can be applied in the context of spatio-temporal Earth system data cubes. This work reflects on the observation that function derivatives may play a crucial role in the analysis and understanding of kernel methods.
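The statement that the derivative of a learned function is a linear combination of kernel derivatives can be illustrated with a minimal sketch. It is not the authors' code. It assumes a model of the form f(x) = Σᵢ αᵢ k(x, xᵢ) with an RBF kernel k(x, xᵢ) = exp(−‖x − xᵢ‖² / (2σ²)), whose gradient with respect to x is −(x − xᵢ)/σ² · k(x, xᵢ). The names `rbf_kernel`, `predict`, `predict_gradient`, `alpha`, and `sigma` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def rbf_kernel(x, X, sigma):
    """RBF kernel values k(x, x_i) between one point x (d,) and the rows of X (n, d)."""
    sq_dist = np.sum((X - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def predict(x, X, alpha, sigma):
    """Kernel machine output f(x) = sum_i alpha_i * k(x, x_i)."""
    return rbf_kernel(x, X, sigma) @ alpha

def predict_gradient(x, X, alpha, sigma):
    """Gradient of f at x: sum_i alpha_i * dk(x, x_i)/dx, a vector of shape (d,)."""
    k = rbf_kernel(x, X, sigma)                    # kernel values, shape (n,)
    dk_dx = -(x - X) / sigma ** 2 * k[:, None]     # per-sample kernel gradients, shape (n, d)
    return dk_dx.T @ alpha                         # weighted sum over training points

# Hypothetical data: 5 training points in 3 dimensions with arbitrary weights.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(5, 3))
weights = rng.normal(size=5)
x_query = rng.normal(size=3)
grad = predict_gradient(x_query, X_train, weights, sigma=1.3)
```

In a fitted model, the weights would come from the learning method (for example, the dual coefficients of a Gaussian Process or Support Vector Machine); here they are random placeholders. The analytic gradient can be checked against central finite differences of `predict`.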
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7595302 | PMC |
| http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0235885 | PLOS |
IEEE Trans Pattern Anal Mach Intell
September 2025
Stochastic Kriging (SK) is a generalized variant of Gaussian process regression, and it is developed for dealing with non-i.i.d.
J Chem Phys
September 2025
National Synchrotron Radiation Laboratory, State Key Laboratory of Advanced Glass Materials, Anhui Provincial Engineering Research Center for Advanced Functional Polymer Films, University of Science and Technology of China, Hefei, Anhui 230029, China.
Polymer density is a critical factor influencing material performance and industrial applications, and it can be tailored by modifying the chemical structure of repeating units. Traditional polymer density characterization methods rely heavily on domain expertise; however, the vast chemical space comprising over one million potential polymer structures makes conventional experimental screening inefficient and costly. In this study, we proposed a machine learning framework for polymer density prediction, rigorously evaluating four models: neural networks (NNs), random forest (RF), XGBoost, and graph convolutional neural networks (GCNNs).
J Chem Phys
September 2025
Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany.
Coarse-grained (CG) molecular dynamics simulations extend the length and time scales of atomistic simulations by replacing groups of correlated atoms with CG beads. Machine-learned coarse-graining (MLCG) has recently emerged as a promising approach to construct highly accurate force fields for CG molecular dynamics. However, the calibration of MLCG force fields typically hinges on force matching, which demands extensive reference atomistic trajectories with corresponding force labels.
Scientifica (Cairo)
August 2025
Department of Biology, School of Bioscience and Technology, College of Natural Sciences, Wollo University, Dessie, Ethiopia.
The gelada, Ethiopia's only endemic primate and the last surviving graminivorous cercopithecid, was studied in Susgen Natural Forest, South Wollo, to examine seasonal variations in activity budgets and ranging ecology. From February to August 2023, encompassing both dry and wet seasons, 3519 behavioral scans were collected from 1680 group observations using instantaneous scan sampling at 15-min intervals (07:00-17:00 h). Data were analyzed with descriptive statistics and nonparametric tests (Kruskal-Wallis and Mann-Whitney), while home ranges were mapped via minimum convex polygon (MCP) and kernel density estimation (KDE).
J Affect Disord
September 2025
The Radiology Department of Shanxi Provincial People's Hospital Affiliated to Shanxi Medical University, Taiyuan, 030001, China.
Objective: The aim of this study was to develop a diagnostic model for bipolar disorder (BD) using Genetic Algorithm-Optimized Kernel Partial Least Squares (GA-KPLS) and to identify key genes associated with the disorder.
Methods: Gene expression data from 448 BD patients were analyzed to identify differentially expressed genes (DEGs). The GA-KPLS model was constructed and compared with six traditional models: Random Forest, LASSO, Ridge Regression, Support Vector Machine, Neural Network, and Logistic Regression.