Motivation: Graph representation learning is a family of related approaches that learn low-dimensional vector representations, called embeddings, of nodes and other graph elements. Embeddings approximate characteristics of the graph and can be used for a variety of machine-learning tasks such as novel edge prediction. For many biomedical applications, partial knowledge exists about positive edges that represent relationships between pairs of entities, but little to no knowledge is available about negative edges that represent the explicit lack of a relationship between two nodes. For this reason, classification procedures are forced to assume that the vast majority of unlabeled edges are negative. Existing approaches to sampling negative edges for training and evaluating classifiers do so by uniformly sampling pairs of nodes.
Results: We show here that this sampling strategy typically leads to sets of positive and negative examples with imbalanced node degree distributions. Using a representative heterogeneous biomedical knowledge graph and random walk-based graph machine learning, we show that this strategy substantially impacts classification performance. If users of graph machine-learning models apply the models to prioritize examples that are drawn from approximately the same distribution as the positive examples, then the performance of the models as estimated in the validation phase may be artificially inflated. We present a degree-aware node sampling approach that mitigates this effect and is simple to implement.
Availability and implementation: Our code and data are publicly available at https://github.com/monarch-initiative/negativeExampleSelection.
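The contrast between uniform and degree-aware negative sampling described in the abstract can be illustrated with a minimal sketch. This is not the code from the repository above and not necessarily the authors' scheme: it assumes undirected edges given as node pairs, considers only nodes that appear in at least one positive edge, and uses "draw each endpoint with probability proportional to its degree" as one plausible reading of degree-aware sampling. The function names `node_degrees`, `sample_negatives`, and `mean_endpoint_degree` are hypothetical.

```python
import random
from collections import Counter


def node_degrees(positive_edges):
    """Count how many positive edges touch each node."""
    degrees = Counter()
    for u, v in positive_edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def sample_negatives(positive_edges, k, degree_aware, seed=0, max_attempts=100_000):
    """Sample k distinct node pairs that are not known positive edges.

    degree_aware=False: both endpoints are drawn uniformly from the nodes.
    degree_aware=True: each endpoint is drawn with probability proportional
    to its degree in the positive edges (an illustrative assumption, not
    necessarily the scheme used in the paper).
    """
    rng = random.Random(seed)
    degrees = node_degrees(positive_edges)
    nodes = sorted(degrees)
    weights = [degrees[n] for n in nodes] if degree_aware else None
    known = {frozenset(edge) for edge in positive_edges}
    negatives = set()
    for _ in range(max_attempts):
        if len(negatives) == k:
            break
        u, v = rng.choices(nodes, weights=weights, k=2)
        pair = frozenset((u, v))
        # Reject self-pairs and known positives; the set drops duplicates.
        if u != v and pair not in known:
            negatives.add(pair)
    if len(negatives) < k:
        raise RuntimeError("could not find enough negative pairs")
    return sorted(tuple(sorted(pair)) for pair in negatives)


def mean_endpoint_degree(edges, degrees):
    """Average degree of the endpoints over a list of node pairs."""
    return sum(degrees[u] + degrees[v] for u, v in edges) / (2 * len(edges))
```

Comparing `mean_endpoint_degree` for the positive edges, the uniform negatives, and the degree-aware negatives on a graph with a few high-degree hubs gives a quick check of the imbalance: uniform negatives tend to involve low-degree nodes, while degree-weighted negatives sit closer to the positives.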
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10994718 | PMC |
http://dx.doi.org/10.1093/bioadv/vbae036 | DOI Listing |
BMC Musculoskelet Disord
September 2025
Department of Clinical Sciences at Danderyds Hospital, Department of Orthopedic Surgery, Karolinska Institutet, Stockholm, 182 88, Sweden.
Background: This study evaluates the accuracy of an Artificial Intelligence (AI) system, specifically a convolutional neural network (CNN), in classifying elbow fractures using the detailed 2018 AO/OTA fracture classification system.
Methods: A retrospective analysis of 5,367 radiograph exams visualizing the elbow from adult patients (2002-2016) was conducted using a deep neural network. Radiographs were manually categorized according to the 2018 AO/OTA system by orthopedic surgeons.
Environ Monit Assess
September 2025
Institute of Earth Sciences, Southern Federal University, Rostov-On-Don, Russia.
Sustainable urban development requires actionable insights into the thermal consequences of land transformation. This study examines the impact of land use and land cover (LULC) changes on land surface temperature (LST) in Ho Chi Minh City, Vietnam, between 1998 and 2024. Using Google Earth Engine (GEE), three machine learning algorithms, namely random forest (RF), support vector machine (SVM), and classification and regression tree (CART), were applied for LULC classification.
Med Eng Phys
October 2025
University of Missouri, Department of Physical Therapy, Columbia, MO, USA.
Measurable neuromotor control deficits during functional task performance could provide objective criteria to aid in concussion diagnosis. However, many tools that measure these constructs are unidimensional and not clinically feasible. The purpose of this study was to assess the classification accuracy of a machine learning model using features measured by a clinically feasible movement-based assessment system, the Mizzou Point-of-care Assessment System (MPASS), between athletes with and without concussion.
J Med Internet Res
September 2025
Artificial Intelligence and Mathematical Modeling Lab, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
Background: The H5N1 avian influenza A virus represents a serious threat to both animal and human health, with the potential to escalate into a global pandemic. Effective monitoring of social media during H5N1 avian influenza outbreaks could potentially offer critical insights to guide public health strategies. Social media platforms like Reddit, with their diverse and region-specific communities, provide a rich source of data that can reveal collective attitudes, concerns, and behavioral trends in real time.
Biomed Phys Eng Express
September 2025
Electrical Engineering Department, Indian Institute of Technology Roorkee, Research Wing, Electrical Department, Roorkee, Uttarakhand, 247664, India.
Imagined speech classification involves decoding brain signals to recognize verbalized thoughts or intentions without actual speech production. This technology has significant implications for individuals with speech impairments, offering a means to communicate through neural signals. The prime objective of this work is to propose an innovative machine learning (ML) based classification methodology that combines electroencephalogram (EEG) data augmentation using a sliding window technique with statistical feature extraction from the amplitude and phase spectrum of frequency domain EEG segments.