Citations

20

Article Abstract

Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and they carry important intelligibility information for human speech communication. This study aimed to investigate whether a deep learning-based model using temporal envelope features could synthesize intelligible speech, and to study how reducing the number of temporal envelopes (from 8 to 2 in this work) affects the intelligibility of the synthesized speech. The objective evaluation metric, short-time objective intelligibility (STOI), showed that, on average, speech synthesized by the proposed approach achieved high STOI scores (i.e., 0.8) in each test condition, and the human listening test showed that the average word correct rate of eight listeners was higher than 97.5%. These findings indicate that the proposed deep learning-based system is a potential approach for synthesizing highly intelligible speech from limited envelope information in the future.
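To make the idea of band-wise temporal envelopes concrete, the following is a minimal sketch of how envelopes might be extracted from a signal. It is not the study's method: the abstract does not specify the filters, band edges, or envelope cutoff, so the biquad band-pass design, the center frequencies (300 Hz and 1000 Hz), the Q value, the 50 Hz smoothing cutoff, and all function names here are assumptions chosen for illustration only.

```python
import math

def bandpass_biquad(signal, fs, f0, q=2.0):
    """Second-order band-pass filter (assumed design: RBJ cookbook, 0 dB peak gain)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter type
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

def envelope(signal, fs, cutoff_hz=50.0):
    """Full-wave rectify, then smooth with a one-pole low-pass filter."""
    coeff = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    env, state = [], 0.0
    for x in signal:
        state = (1.0 - coeff) * abs(x) + coeff * state
        env.append(state)
    return env

def band_envelopes(signal, fs, center_freqs):
    """Return one temporal envelope per band (hypothetical helper)."""
    return [envelope(bandpass_biquad(signal, fs, f0), fs) for f0 in center_freqs]

# Synthetic test signal: a 1 kHz tone with a slow 4 Hz amplitude modulation.
fs = 8000
tone = [
    math.sin(2 * math.pi * 1000 * n / fs) * (0.5 + 0.5 * math.sin(2 * math.pi * 4 * n / fs))
    for n in range(fs)
]
envs = band_envelopes(tone, fs, [300.0, 1000.0])  # two bands, as an illustration
```

In this sketch, the envelope of the band centered on 1000 Hz follows the slow 4 Hz modulation of the tone, while the 300 Hz band carries much less energy. A synthesis model of the kind described in the abstract would take such envelopes as input features; how the study actually computes and uses them is not stated here.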

Source
http://dx.doi.org/10.1109/EMBC48229.2022.9871247

Publication Analysis

Top Keywords

intelligible speech: 12
temporal envelope: 12
approach synthesize: 8
synthesize intelligible: 8
speech limited: 8
envelope waveforms: 8
deep learning-based: 8
synthesized speech: 8
speech: 7
envelope: 6

Similar Publications

[Speech markers as objective indicators of apathy: New insights from a case study].

Encephale

September 2025

Speech and Language Pathology Department of Nice, Faculty of Medicine, Campus Pasteur, université Côte d'Azur, 28, avenue de Valombrose, 06107 Nice, France; Cognition Behaviour Technology Laboratory (CoBTeK), institut Claude-Pompidou, université Côte d'Azur, 10, rue Molière, 06000 Nice, France.

Introduction: Apathy, commonly observed in neurocognitive disorders, is characterized by a reduction in goal-directed behavior, with reduced initiative, interests, and emotions. This article presents the case of Mrs. B.

[Cough frequency monitoring: current technologies and clinical research applications].

Zhonghua Jie He He Hu Xi Za Zhi

September 2025

Department of Respiratory and Critical Care Medicine, the First Affiliated Hospital of Guangzhou Medical University, National Center for Respiratory Medicine, National Clinical Research Center for Respiratory Disease, State Key Laboratory of Respiratory Disease, Guangzhou Institute of Respiratory He

Cough is a common symptom of many respiratory diseases, and parameters such as frequency, intensity, type and duration play important roles in disease screening, diagnosis and prognosis. Among these, cough frequency is the most widely applied metric. In current clinical practice, cough severity is primarily assessed based on patients' subjective symptom descriptions in combination with semi-structured questionnaires.

Prior research on global-local processing has focused on hierarchical objects in the visual modality, whereas the real world involves multisensory interactions. The present study investigated whether the simultaneous presentation of auditory stimuli influences the recognition of visual hierarchical objects. We added four types of auditory stimuli to the traditional visual hierarchical letters paradigm: no sound (visual-only), a pure tone, a spoken letter that was congruent with the required response (response-congruent), or a spoken letter that was incongruent with it (response-incongruent).

Background: The integration of digital health care technologies into speech-language pathology and audiology is rapidly transforming service delivery. In South Africa and other low- and middle-income countries (LMICs), digital tools offer significant opportunities to address access challenges and enhance patient outcomes. However, the adoption of these technologies requires careful consideration of contextual factors.

Background: Effective communication and collaboration among clinical and nonclinical staff are critical to staff health and safety, optimal team performance, and safe patient care. While respiratory protective equipment is a routine key strategy for protecting healthcare workers from exposure to select respiratory pathogens, it has been shown to disrupt speech intelligibility. The COVID-19 pandemic escalated the need for and utilization of respiratory protection in all healthcare settings.
