Overfitting describes the phenomenon in which a model that is highly predictive on the training data generalizes poorly to future observations. It is a common concern when applying machine learning techniques to contemporary medical applications, such as predicting vaccination response and disease status in infectious disease or cancer studies. This review examines the causes of overfitting and offers strategies to counteract it, focusing on model complexity reduction, reliable model evaluation, and harnessing data diversity. Through discussion of the underlying mathematical models and illustrative examples using both synthetic data and published real datasets, our objective is to equip analysts and bioinformaticians with the knowledge and tools necessary to detect and mitigate overfitting in their research.
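The gap between training and held-out performance described above can be reproduced with a small simulation. The following is a minimal sketch on synthetic data, not the review's own example: the sample sizes, the number of predictors, the ridge penalty, and the helper names (`r_squared`, `fit_ridge`, `simulate`) are all illustrative assumptions. It fits a linear model with many predictors relative to the number of training cases, then repeats the fit with a ridge penalty as one simple form of complexity reduction.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination; can be negative on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def fit_ridge(X, y, penalty):
    """Linear fit without intercept; penalty=0 gives ordinary least squares."""
    if penalty == 0:
        return np.linalg.lstsq(X, y, rcond=None)[0]
    n_features = X.shape[1]
    # Closed-form ridge solution: (X'X + penalty * I)^-1 X'y
    return np.linalg.solve(X.T @ X + penalty * np.eye(n_features), X.T @ y)


def simulate(n_train=40, n_test=1000, n_features=30, n_informative=3,
             penalty=0.0, seed=0):
    """Return (training R^2, held-out R^2) for one synthetic dataset.

    All settings are hypothetical: only the first n_informative predictors
    carry signal, the rest are pure noise.
    """
    rng = np.random.default_rng(seed)
    X_train = rng.normal(size=(n_train, n_features))
    X_test = rng.normal(size=(n_test, n_features))
    beta = np.zeros(n_features)
    beta[:n_informative] = 1.0
    y_train = X_train @ beta + rng.normal(size=n_train)
    y_test = X_test @ beta + rng.normal(size=n_test)
    coef = fit_ridge(X_train, y_train, penalty)
    return (r_squared(y_train, X_train @ coef),
            r_squared(y_test, X_test @ coef))


if __name__ == "__main__":
    for penalty in (0.0, 10.0):
        train_r2, test_r2 = simulate(penalty=penalty)
        print(f"penalty={penalty}: train R^2={train_r2:.2f}, "
              f"held-out R^2={test_r2:.2f}")
```

Comparing the two printed rows shows how much of the apparent training performance survives on fresh data. The held-out set here plays the role of "future observations"; in practice cross-validation or an external cohort would serve the same purpose.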
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10498807 | PMC |
http://dx.doi.org/10.1080/21645515.2023.2251830 | DOI Listing |
Acta Psychiatr Scand
September 2025
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.
Introduction: Machine learning studies sometimes include a high number of predictors relative to the number of training cases. This increases the risk of overfitting and poor generalizability. A recent study hypothesized that between-trial heterogeneity precluded generalizable outcome prediction in schizophrenia.
Neural Netw
August 2025
School of Mathematics and Statistics, The University of Melbourne, Melbourne, Parkville, VIC 3052, Australia.
In multivariate time series forecasting (MTSF), accurately modeling the intricate dependencies among multiple variables remains a significant challenge due to the inherent limitations of traditional approaches. Most existing models adopt either channel-independent (CI) or channel-dependent (CD) strategies, each presenting distinct drawbacks. CI methods fail to leverage the potential insights from inter-channel interactions, resulting in models that may not fully exploit the underlying statistical dependencies present in the data.
Front Plant Sci
August 2025
Key Laboratory of Tobacco Chemistry, Zhengzhou Tobacco Research Institute of China National Tobacco Corporation (CNTC), Zhengzhou, China.
Introduction: Image and near-infrared (NIR) spectroscopic data are widely used for constructing analytical models in precision agriculture. While model interpretation can provide valuable insights for quality control and improvement, the inherent ambiguity of individual image pixels or spectral data points often hinders practical interpretability when using raw data directly. Furthermore, the presence of imbalanced datasets can lead to model overfitting and consequently, poor robustness.
Front Endocrinol (Lausanne)
September 2025
School of Public Health, Lanzhou University, Lanzhou, China.
Background: Type 2 diabetes mellitus (T2DM) is a common comorbidity of chronic obstructive pulmonary disease (COPD), which significantly increases the risk of rehospitalization and mortality in patients with COPD. Therefore, the purpose of this study was to identify the influencing factors of COPD complicated by T2DM and to construct a visualized disease prediction model.
Method: We included the medical records of 1,773 patients with COPD treated at Quzhou People's Hospital from 2020 to 2023.
bioRxiv
August 2025
Center for Molecular and Behavioral Neuroscience, Rutgers University, Newark, NJ, 07102.
Functional connectivity (FC) has been invaluable for understanding the brain's communication network, with strong potential for enhanced FC approaches to yield additional insights. Unlike pairwise correlation, the field-standard method in fMRI, partial correlation can in theory estimate FC without confounded and indirect connections. However, partial correlation FC can also display low repeat reliability, impairing the accuracy of individual estimates.
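The contrast between pairwise and partial correlation can be illustrated on simulated signals. This is a minimal sketch and not the authors' pipeline: the three-signal chain, the noise level, and the function name `partial_correlation` are assumptions made for illustration. It relies on the standard identity that partial correlations are the negated, rescaled off-diagonal entries of the inverse covariance (precision) matrix.

```python
import numpy as np


def partial_correlation(data):
    """Partial correlation matrix from the inverse covariance matrix.

    data has shape (n_samples, n_signals). Each off-diagonal entry is the
    correlation between two signals after conditioning on all the others.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


# Hypothetical chain a -> b -> c: a and c are linked only through b.
rng = np.random.default_rng(0)
n_samples = 5000
a = rng.normal(size=n_samples)
b = a + 0.5 * rng.normal(size=n_samples)
c = b + 0.5 * rng.normal(size=n_samples)
data = np.column_stack([a, b, c])

pairwise = np.corrcoef(data, rowvar=False)
partial = partial_correlation(data)

if __name__ == "__main__":
    print("pairwise corr(a, c):", round(pairwise[0, 2], 3))
    print("partial corr(a, c | b):", round(partial[0, 2], 3))
```

In this setup the pairwise estimate reports a strong a-c link that exists only indirectly, while the partial estimate is close to zero. The sketch uses many samples and few signals; with short recordings and many regions the matrix inversion becomes unstable, which is one plausible source of the reliability problem the abstract mentions.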