A foundational transformer leveraging full night, multichannel sleep study data accurately classifies sleep stages.

Benjamin Fox, Joy Jiang, Sajila Wickramaratne, Patricia Kovatch, Mayte Suarez-Farinas, Neomi A Shah, Ankit Parekh, Girish N Nadkarni

medRxiv

The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Published: August 2024

Study Objectives: To investigate whether a foundational transformer model using 8-hour, multichannel data from polysomnograms can outperform existing artificial intelligence (AI) methods for sleep stage classification.

Methods: We used visits 1 and 2 of the Sleep Heart Health Study (SHHS) for training and validation and the Multi-Ethnic Study of Atherosclerosis (MESA) for testing. We trained a self-supervised foundational transformer (called PFTSleep) that encodes 8-hour-long sleep studies sampled at 125 Hz with 7 signals, including brain, movement, cardiac, oxygen, and respiratory channels. These encodings are used as input to train an additional model that classifies sleep stages, without adjusting the weights of the foundational transformer. We compared our results to existing AI methods that did not use 8-hour data or the full set of signals but did report evaluation metrics for the SHHS dataset.
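As a rough illustration of the setup described in Methods, the sketch below works out the input size implied by 8 hours at 125 Hz across 7 signals, and then shows the frozen-encoder pattern: embeddings come from an encoder whose weights are never updated, and only a separate classifier is fit on them. `FrozenEncoder` (a fixed random projection) and `CentroidProbe` (a nearest-centroid classifier) are hypothetical stand-ins, not PFTSleep or the authors' classifier, and the 30-second epoch length is an assumption based on common scoring practice rather than something the abstract states.

```python
import random

# Figures stated in the abstract.
HOURS, SAMPLE_RATE_HZ, N_SIGNALS = 8, 125, 7
# Assumption: 30-second scoring epochs (not stated in the abstract).
EPOCH_SECONDS = 30

samples_per_signal = HOURS * 3600 * SAMPLE_RATE_HZ  # 3,600,000 samples
values_per_study = samples_per_signal * N_SIGNALS   # 25,200,000 values
epochs_per_study = HOURS * 3600 // EPOCH_SECONDS    # 960 epochs


class FrozenEncoder:
    """Hypothetical stand-in for a pretrained encoder: a fixed random projection."""

    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        # Tuples make the weights immutable, mirroring a frozen model.
        self.weights = tuple(
            tuple(rng.gauss(0, 1) for _ in range(in_dim)) for _ in range(out_dim)
        )

    def encode(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weights]


class CentroidProbe:
    """Toy downstream classifier that is trained only on encoder outputs."""

    def fit(self, embeddings, labels):
        sums, counts = {}, {}
        for emb, lab in zip(embeddings, labels):
            acc = sums.setdefault(lab, [0.0] * len(emb))
            for i, v in enumerate(emb):
                acc[i] += v
            counts[lab] = counts.get(lab, 0) + 1
        self.centroids = {
            lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()
        }
        return self

    def predict(self, emb):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(emb, centroid))

        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


# Synthetic two-class data (illustrative only, not sleep recordings).
rng = random.Random(1)
prototypes = {"wake": [1.0, 0.0, 0.0, 0.0], "REM": [0.0, 1.0, 0.0, 0.0]}
inputs, labels = [], []
for label, proto in prototypes.items():
    for _ in range(20):
        inputs.append([v + rng.gauss(0, 0.05) for v in proto])
        labels.append(label)

encoder = FrozenEncoder(in_dim=4, out_dim=3)
weights_before = encoder.weights

# Only the probe is fit; the encoder is used purely for inference.
embeddings = [encoder.encode(x) for x in inputs]
probe = CentroidProbe().fit(embeddings, labels)

predicted = probe.predict(encoder.encode(prototypes["wake"]))
encoder_unchanged = encoder.weights == weights_before
```

In a deep learning framework the same idea would usually be expressed by disabling gradient updates on the encoder parameters and optimizing only the classifier; the nearest-centroid probe is used here purely to keep the example dependency-free.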

Results: We trained and validated a model on 8,444 sleep studies, each with 7 signals including brain, movement, cardiac, oxygen, and respiratory channels, and tested it on an additional 2,055 studies. In total, we trained and tested on 587,944 hours of sleep study signal data. On the SHHS validation set, area under the precision-recall curve (AUPRC) scores were 0.82, 0.40, 0.53, 0.75, and 0.82 and area under the receiver operating characteristic curve (AUROC) scores were 0.99, 0.95, 0.96, 0.98, and 0.99 for wake, N1, N2, N3, and REM, respectively. For MESA, the AUPRC scores were 0.56, 0.16, 0.40, 0.45, and 0.65 and the AUROC scores were 0.94, 0.77, 0.87, 0.91, and 0.96, respectively. Compared with the state-of-the-art model with the longest context window, our model showed increases in macro evaluation scores, notably sensitivity (3.7% increase), and in multi-class F1 scores for REM (3.39% increase) and wake (0.97% increase).
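The per-stage scores above are one-vs-rest metrics: each stage in turn is treated as the positive class against all others. The sketch below, on made-up scores, shows one common way to compute them, using the rank-based (Mann-Whitney) form of AUROC and average precision as an estimator of AUPRC; the abstract does not say which estimator or library the authors used. It also checks that the reported 587,944 hours equals (8,444 + 2,055) studies × 8 hours × 7 signals, which suggests the total counts hours per signal; that reading is an inference, not a statement from the paper.

```python
# Study counts, duration, and signal count as reported in the abstract.
TRAIN_VAL_STUDIES, TEST_STUDIES = 8_444, 2_055
HOURS_PER_STUDY, N_SIGNALS = 8, 7
total_signal_hours = (TRAIN_VAL_STUDIES + TEST_STUDIES) * HOURS_PER_STUDY * N_SIGNALS


def one_vs_rest(true_stages, stage):
    """Binary labels: 1 where the epoch is `stage`, 0 otherwise."""
    return [1 if s == stage else 0 for s in true_stages]


def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5).

    Requires at least one positive and one negative label.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean precision at the rank of each positive; assumes no tied scores."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Made-up example: five epochs with a model's "wake" probability for each.
true_stages = ["wake", "N2", "wake", "N2", "REM"]
wake_scores = [0.9, 0.8, 0.7, 0.3, 0.2]

wake_labels = one_vs_rest(true_stages, "wake")
wake_auroc = auroc(wake_labels, wake_scores)
wake_auprc = average_precision(wake_labels, wake_scores)
```

The pairwise AUROC loop is quadratic in the number of epochs, which is fine for illustration but too slow for hundreds of thousands of epochs; a sort-based implementation would be the practical choice at that scale.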

Conclusions: Full-night, multichannel PSG data encodings derived from a foundational transformer improve sleep stage classification over existing methods.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11326349
DOI: http://dx.doi.org/10.1101/2024.08.02.24311417

Similar Publications

Temporal Modeling With Frozen Vision-Language Foundation Models for Parameter-Efficient Text-Video Retrieval.

IEEE Trans Neural Netw Learn Syst

September 2025

Leqi Shen, Tianxiang Hao, Tao He, Yifeng Zhang, Pengzhang Liu

Temporal modeling plays an important role in the effective adaption of the powerful pretrained text-image foundation model into text-video retrieval. However, existing methods often rely on additional heavy trainable modules, such as transformer or BiLSTM, which are inefficient. In contrast, we avoid introducing such heavy components by leveraging frozen foundation models.

FmH2ST: foundation model-based spatial transcriptomics generation from histological images.

Nucleic Acids Res

September 2025

School of Software, Shandong University, Jinan 250101, Shandong, China.

Yuequn Wang, Jun Wang, Yanyu Xu, Ning Liu, Bin Liu

Spatial transcriptomics (ST) reveals gene expression distributions within tissues. Yet, predicting spatial gene expression from histological images still faces the challenges of limited ST data that lack prior knowledge, and insufficient capturing of inter-slice heterogeneity and intra-slice complexity. To tackle these challenges, we introduce FmH2ST, a foundation model-based method for spatial gene expression prediction.

Multimodal self-supervised retinal vessel segmentation.

Neural Netw

September 2025

Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China. Electronic address:

Pengshuai Yin, Jingqi Zhang, Huichou Huang, Ruirui Liu, Yanxia Liu

Automatic segmentation of retinal vessels from retinography images is crucial for timely clinical diagnosis. However, the high cost and specialized expertise required for annotating medical images often result in limited labeled datasets, which constrains the full potential of deep learning methods. Recent advances in self-supervised pretraining using unlabeled data have shown significant benefits for downstream tasks.

A foundation model for learning genetic associations from brain imaging phenotypes.

Bioinform Adv

August 2025

IBM Research, Yorktown Heights, NY, 10598, United States.

Diego Machado Reyes, Myson Burch, Laxmi Parida, Aritra Bose

Motivation: Due to the intricate etiology of neurological disorders, finding interpretable associations between multiomics features can be challenging using standard approaches.

Results: We propose COMICAL, a contrastive learning approach using multiomics data to generate associations between genetic markers and brain imaging-derived phenotypes. COMICAL jointly learns omics representations utilizing transformer-based encoders with custom tokenizers.

DeepPhosPPI: a deep learning framework with attention-CNN and transformer for predicting phosphorylation effects on protein-protein interactions.

Brief Bioinform

September 2025

College of Computing and Data Science, Nanyang Technological University, 639798, Singapore.

Yinyin Gong, Rui Li, Yan Liu, Jilong Wang, Danny Z Chen

Protein phosphorylation regulates protein function and cellular signaling pathways, and is strongly associated with diseases, including neurodegenerative disorders and cancer. Phosphorylation plays a critical role in regulating protein activity and cellular signaling by modulating protein-protein interactions (PPIs). It alters binding affinities and interaction networks, thereby influencing biological processes and maintaining cellular homeostasis.
