Data collection, curation, and cleaning constitute a crucial phase in machine learning (ML) projects. In biomedical ML, it is often desirable to leverage multiple datasets to increase sample size and diversity, but doing so poses unique challenges that arise from heterogeneity in study design, data descriptors, file system organization, and metadata. In this study, we present an approach to integrating multiple brain MRI datasets, with a focus on homogenizing their organization and preprocessing for ML. We use our own fusion example (approximately 84,000 images from 54,000 subjects, 12 studies, and 88 individual scanners) to illustrate and discuss the issues faced by study fusion efforts, and we examine key decisions necessary during dataset homogenization, presenting in detail a database structure flexible enough to accommodate multiple observational MRI datasets. We believe our approach can provide a basis for future similarly minded biomedical ML projects.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11026619 | PMC |
| http://dx.doi.org/10.3389/fradi.2024.1283392 | DOI Listing |
JAMA Netw Open
September 2025
Division of Gastroenterology, Department of Medicine, University of California San Diego, La Jolla.
Importance: Janus kinase (JAK) inhibitors are highly effective medications for several immune-mediated inflammatory diseases (IMIDs). However, safety concerns have led to regulatory restrictions.
Objective: To compare the risk of adverse events with JAK inhibitors vs tumor necrosis factor (TNF) antagonists in patients with IMIDs in head-to-head comparative effectiveness studies.
Acta Psychiatr Scand
September 2025
Department of Clinical Medicine, Aarhus University, Aarhus, Denmark.
Introduction: Machine learning studies sometimes include a high number of predictors relative to the number of training cases. This increases the risk of overfitting and poor generalizability. A recent study hypothesized that between-trial heterogeneity precluded generalizable outcome prediction in schizophrenia from being achieved.
Cochrane Database Syst Rev
September 2025
Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.
Background: Radiotherapy is the mainstay of treatment for head and neck cancer (HNC) but may induce various side effects on surrounding normal tissues. To reach an optimal balance between tumour control and toxicity prevention, normal tissue complication probability (NTCP) models have been reported to predict the risk of radiation-induced side effects in patients with HNC. However, the quality of study design, conduct, and analysis (i.
Stroke
September 2025
Brain Language Laboratory, Freie Universität Berlin, Germany (A.-T.P.J., M.R.O., A.S., F.P.).
Background: Intensive language-action therapy treats language deficits and depressive symptoms in chronic poststroke aphasia, yet the underlying neural mechanisms remain underexplored. Long-range temporal correlations (LRTCs) in blood oxygenation level-dependent signals indicate persistence in brain activity patterns and may relate to learning and levels of depression. This observational study investigates blood oxygenation level-dependent LRTC changes alongside therapy-induced language and mood improvements in perisylvian and domain-general brain areas.
Circ Cardiovasc Interv
September 2025
Division of Cardiology, Department of Medicine, Loyola University Medical Center and Loyola Stritch School of Medicine, Maywood, IL.