Background: New machine learning methods and techniques are frequently introduced in radiomics, but they are often tested on a single dataset, which makes it challenging to assess their true benefit. Currently, there is a lack of a larger, publicly accessible dataset collection on which such assessments could be performed. In this study, a collection of radiomics datasets with binary outcomes in tabular form was curated to allow benchmarking of machine learning methods and techniques.
Methods: A variety of journals and online sources were searched to identify tabular radiomics data with binary outcomes, which were then compiled into a homogeneous data collection that is easily accessible via Python. To illustrate the utility of the dataset collection, it was applied to investigate whether feature decorrelation prior to feature selection could improve predictive performance in a radiomics pipeline.
Results: A total of 50 radiomic datasets were collected, with sample sizes ranging from 51 to 969 and feature counts ranging from 101 to 11165. Using this data, it was observed that decorrelating features did not yield any significant improvement on average.
Conclusions: A large collection of datasets, easily accessible via Python and suitable for benchmarking and evaluating new machine learning techniques and methods, was curated. Its utility was exemplified by demonstrating that feature decorrelation prior to feature selection does not, on average, lead to significant performance gains and could be omitted, thereby increasing the robustness and reliability of the radiomics pipeline.
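To make the evaluated step concrete, the following is a minimal sketch of one common way to decorrelate features before feature selection: a greedy filter that drops any feature whose absolute Pearson correlation with an already kept feature exceeds a threshold. The abstract does not state which decorrelation procedure or threshold the study used, so the function name `drop_correlated_features`, the 0.95 threshold, and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np


def drop_correlated_features(X, threshold=0.95):
    """Return indices of features kept by a greedy correlation filter.

    Hypothetical helper, not the study's implementation: a feature is kept
    only if its absolute Pearson correlation with every previously kept
    feature is at most `threshold`.
    """
    X = np.asarray(X, dtype=float)
    # Absolute pairwise Pearson correlations between columns (features).
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.abs(np.corrcoef(X, rowvar=False))
    # Constant columns produce NaN correlations; treat them as uncorrelated.
    corr = np.nan_to_num(corr, nan=0.0)

    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept


# Synthetic stand-in for a tabular radiomics dataset (assumed, not real data):
# three independent features plus a near-duplicate of the first one.
rng = np.random.default_rng(0)
base = rng.normal(size=(60, 3))
near_duplicate = base[:, 0] + 1e-3 * rng.normal(size=60)
X = np.column_stack([base, near_duplicate])

kept = drop_correlated_features(X, threshold=0.95)
X_decorrelated = X[:, kept]
print(kept, X_decorrelated.shape)
```

In a benchmarking setup, a filter like this would be fitted on the training folds only and then applied to the held-out fold, so that the comparison between pipelines with and without decorrelation is not biased by information from the test data.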
DOI: http://dx.doi.org/10.1016/j.compbiomed.2024.109140
NEJM AI
September 2025
Department of Bioengineering, Stanford University, Stanford, CA.
Background: Assessing human movement is essential for diagnosing and monitoring movement-related conditions like neuromuscular disorders. Timed function tests (TFTs) are among the most widespread types of assessments due to their speed and simplicity, but they cannot capture disease-specific movement patterns. Conversely, biomechanical analysis can produce sensitive disease-specific biomarkers, but it is traditionally confined to laboratory settings.
MethodsX
December 2025
Department of Earth and Environmental Science, University of Waterloo, Waterloo, ON, Canada.
Human factors are central to aviation safety, with pilot cognitive states such as workload, stress, and situation awareness playing important roles in flight performance and safety. Although flight simulators are widely used for training and scientific research, they often lack the ecological validity needed to replicate pilot cognitive states from real flights. To address these limitations, a new in-flight data collection methodology for general aviation using a Cessna 172 aircraft, which is one of the most widely used aircraft for pilot training, is presented.
Comput Med Imaging Graph
August 2025
Beijing Huilongguan Hospital, Peking University Huilongguan Clinical Medical School, Beijing 100096, China.
Bipolar disorder (BD) is a debilitating mental illness characterized by significant mood swings, posing a substantial challenge for accurate diagnosis due to its clinical complexity. This paper presents CS2former, a novel approach leveraging a dual channel-spatial feature extraction module within a Transformer model to diagnose BD from resting-state functional MRI (Rs-fMRI) and T1-weighted MRI (T1w-MRI) data. CS2former employs a Channel-2D Spatial Feature Aggregation Module to decouple channel and spatial information from Rs-fMRI, while a Channel-3D Spatial Attention Module with Synchronized Attention Module (SAM) concurrently computes attention for T1w-MRI feature maps.
Genome Biol
September 2025
Department of Clinical Pharmacy, Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA, 90089, USA.
Background: Recent advances in high-throughput sequencing technologies have enabled the collection and sharing of a massive amount of omics data, along with its associated metadata: descriptive information that contextualizes the data, including phenotypic traits and experimental design. Enhancing metadata availability is critical to ensure data reusability and reproducibility and to facilitate novel biomedical discoveries through effective data reuse. Yet, incomplete metadata accompanying public omics data may hinder reproducibility and reusability and limit secondary analyses.
JMIR Res Protoc
September 2025
Academy for Health Innovation Uganda, Infectious Diseases Institute, Makerere University, Kampala, Uganda.
Background: Sexually transmitted infections are a significant public health concern, particularly in sub-Saharan Africa, where their prevalence remains high. Promoting awareness and reducing stigma are essential strategies for addressing this challenge, but those affected often have limited access to accurate and culturally appropriate health information. Therefore, innovative solutions are essential to enhance sexual health literacy and encourage informed health-seeking behaviors.