Purpose: The present study investigated the intelligibility of digitized and synthesized speech output in background noise for children 3-5 years old. It aimed to determine whether intelligibility (the ability to repeat a message) differed among 3 types of speech output (digitized, DECTalk synthesized, and MacinTalk synthesized) for single words and sentences, presented with and without context.
Method: The dependent variable was speech intelligibility (number of individual words repeated correctly). The study used a mixed-model design. Ninety typically developing children (3-5 years old) were assigned to each of 3 speech type conditions. Participants were asked to repeat 20 words and 10 short sentences. Half of the stimuli were preceded by contextual information (topic cue), and half were presented without any context.
Results: Young children have difficulty accurately repeating some digitized and synthesized messages in background noise. Overall, the older children (4- and 5-year-olds) performed better than the 3-year-old children. Increasing information through context or longer messages (i.e., sentences) did facilitate intelligibility overall, although there was a statistically significant Message Length x Context x Speech Type interaction.
Conclusions: For 3-5-year-olds, the intelligibility of single words is very low (55%-77%). The intelligibility of sentences is higher, but the sole use of sentences for communication is problematic. Contextual information facilitates intelligibility and is a promising approach for ensuring effective communication. Future research is needed to improve the intelligibility of speech output at the single word level in order to maximize the benefits of speech output.
DOI: http://dx.doi.org/10.1044/1058-0360(2006/015)
IEEE Trans Pattern Anal Mach Intell
September 2025
In this paper, we propose a novel framework, Combo, for harmonious co-speech holistic 3D human motion generation and efficient customizable adaption. In particular, we identify one fundamental challenge as the multiple-input-multiple-output (MIMO) nature of the generative model of interest. More concretely, on the input end, the model typically consumes both speech signals and character guidance (e.
Disabil Rehabil Assist Technol
September 2025
Department of Health Sciences, European University Cyprus, Nicosia, Cyprus.
We examined the concurrent change in developmental language phase (DLP) and linguistic status of children with Autism Spectrum Disorder (ASD)/autism, identified as Nonverbal/Minimally-Verbal (NV/MV), utilizing Augmentative/Alternative Communication (AAC) systems. We compared the linguistic output of NV/MV autistic children concurrently, with and without use of AAC systems. Additionally, we compared the linguistic level, characteristics, and early developmental milestones for AAC users and non-users.
Psychophysiology
September 2025
Social, Economic and Organisational Psychology, Leiden University, Leiden, the Netherlands.
People may feel stressed when engaging with contentious topics, such as migration. However, when individuals learn that their opinion-based ingroup is growing or shrinking, they may experience this stress in different ways, namely as a threat or a challenge. In a preregistered study (N = 203 Dutch university students), we examined among host society members how progressive and conservative changes (vs.
IEEE Trans Audio Speech Lang Process (2025)
April 2025
Department of Electronic Engineering and with the Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile.
This study presents a novel application of a Probabilistic Bayesian Neural Network (PBNN) for estimating vocal function variables and enhancing non-invasive ambulatory voice monitoring by addressing aleatoric and epistemic uncertainties in regression tasks. The proposed PBNN allows for estimating key physiological parameters including subglottal pressure, vocal fold contact pressure, thyroarytenoid, and cricothyroid muscle activations, from seven aerodynamic and acoustic features. The PBNN is trained on the Triangular Body-Cover Model (TBCM) of the vocal folds to produce a non-linear inverse mapping between its inputs and outputs.
Sensors (Basel)
August 2025
School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea.
Most speech separation techniques require knowing the number of talkers mixed in an input, which is not always available in real situations. To address this problem, we present a novel speech separation method that automatically finds the number of talkers in input mixture recordings. The proposed method extracts the voices of individual talkers one by one in a deflationary manner and stops the extraction sequence when a predefined termination criterion is satisfied.