How to Design a Relevant Corpus for Sleepiness Detection Through Voice?

Vincent P Martin, Jean-Luc Rouas, Jean-Arthur Micoulaud-Franchi, Pierre Philip, Jarek Krajewski

Front Digit Health

Engineering Psychology, Rhenish University of Applied Science, Cologne, Germany.

Published: September 2021

This article presents research on the detection of pathologies affecting speech through automatic analysis. Voice processing has been used to evaluate several conditions, such as Parkinson's disease, Alzheimer's disease, and depression. While some studies report results that seem sufficient for clinical applications, this is not the case for the detection of sleepiness. Even two international challenges and the recent advent of deep learning techniques have not changed this situation. This article explores the hypothesis that the mediocre performance observed for automatic processing is caused by the design of the corpora. To this aim, we first discuss and refine the concept of sleepiness as it relates to the ground-truth labels. Second, we present an in-depth study of four corpora, bringing to light the methodological choices that have been made and the underlying biases they may have induced. Finally, in light of this information, we propose guidelines for the design of new corpora.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8521834
DOI: http://dx.doi.org/10.3389/fdgth.2021.686068
