Accuracy of repetition of digitized and synthesized speech for young children in background noise.

Kathryn D R Drager , Elizabeth A Clark-Serpentine , Kate E Johnson , Jennifer L Roeser

Am J Speech Lang Pathol

Department of Communication Sciences and Disorders, Penn State University, 110 Moore Building, University Park, PA 16802, USA.

Published: May 2006

Purpose: The present study investigated the intelligibility of digitized and synthesized speech output in background noise for children 3-5 years old. The purpose of the study was to determine whether there was a difference in the intelligibility (ability to repeat) of 3 types of speech output (digitized, DECTalk synthesized, and MacinTalk synthesized) in single words and sentences, presented within and out of context.

Method: The dependent variable was speech intelligibility (number of individual words repeated correctly). The study used a mixed-model design. Ninety typically developing children (3-5 years old) were assigned to each of 3 speech type conditions. Participants were asked to repeat 20 words and 10 short sentences. Half of the stimuli were preceded by contextual information (topic cue), and half were presented without any context.
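As an illustration of the dependent variable described above (number of individual words repeated correctly), the scoring could be sketched as follows. This is a minimal sketch and not the authors' procedure: the function names, the case-insensitive matching, and the rule that word order is ignored are all assumptions made for the example.

```python
from collections import Counter


def words_repeated_correctly(target: str, response: str) -> int:
    """Count target words that also occur in the response.

    Assumption: matching is case-insensitive and ignores word order.
    """
    target_counts = Counter(target.lower().split())
    response_counts = Counter(response.lower().split())
    # Counter intersection keeps the minimum count of each shared word,
    # so a word repeated once cannot be credited twice.
    return sum((target_counts & response_counts).values())


def intelligibility_percent(pairs) -> float:
    """Percent of target words repeated correctly over (target, response) pairs."""
    total = sum(len(target.split()) for target, _ in pairs)
    correct = sum(words_repeated_correctly(t, r) for t, r in pairs)
    return 100.0 * correct / total if total else 0.0


# Hypothetical stimuli and responses, invented for illustration only.
example_pairs = [
    ("the dog runs fast", "the dog runs"),
    ("ball", "wall"),
]
score = intelligibility_percent(example_pairs)
```

Here three of the five target words are matched, so `score` is 60.0. A stricter rule, such as requiring words in the original order, would change only `words_repeated_correctly`.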

Results: Young children have difficulty accurately repeating some digitized and synthesized messages in background noise. Overall, the older children (4- and 5-year-olds) performed better than the 3-year-old children. Increasing information through context or longer messages (i.e., sentences) did facilitate intelligibility overall, although there was a statistically significant Message Length × Context × Speech Type interaction.

Conclusions: For 3-5-year-olds, the intelligibility of single words is very low (55%-77%). The intelligibility of sentences is higher, but the sole use of sentences for communication is problematic. Contextual information facilitates intelligibility and is a promising approach for ensuring effective communication. Future research is needed to improve the intelligibility of speech output at the single word level in order to maximize the benefits of speech output.

DOI: http://dx.doi.org/10.1044/1058-0360(2006/015)

Similar Publications

Combo: Co-speech holistic 3D human motion generation and efficient customizable adaptation in harmony.

IEEE Trans Pattern Anal Mach Intell

September 2025

Chao Xu , Mingze Sun , Zhi-Qi Cheng , Fei Wang , Yang Liu

In this paper, we propose a novel framework, Combo, for harmonious co-speech holistic 3D human motion generation and efficient customizable adaptation. In particular, we identify one fundamental challenge as the multiple-input-multiple-output (MIMO) nature of the generative model of interest. More concretely, on the input end, the model typically consumes both speech signals and character guidance (e.

Concurrent developmental language level change for children with autism spectrum disorder using alternative and augmentative communication systems: a cross-sectional study in Cyprus.

Disabil Rehabil Assist Technol

September 2025

Department of Health Sciences, European University Cyprus, Nicosia, Cyprus.

Margarita Kilili-Lesta , Louiza Voniati

We examined the concurrent change in developmental language phase (DLP) and linguistic status of children with Autism Spectrum Disorder (ASD)/autism, identified as Nonverbal/Minimally-Verbal (NV/MV), utilizing Augmentative/Alternative Communication (AAC) systems. We compared the linguistic output of NV/MV autistic children concurrently, with and without use of AAC systems. Additionally, we compared the linguistic level, characteristics, and early developmental milestones for AAC users and non-users.

Leftists and Rightists Differ in Their Cardiovascular Responses to Changing Public Opinion on Migration.

Psychophysiology

September 2025

Social, Economic and Organisational Psychology, Leiden University, Leiden, the Netherlands.

Feiteng Long , Ruthie Pliskin , Daan Scheepers

People may feel stressed when engaging with contentious topics, such as migration. However, when individuals learn that their opinion-based ingroup is growing or shrinking, they may experience this stress in different ways, namely as a threat or a challenge. In a preregistered study (N = 203 Dutch university students), we examined among host society members how progressive and conservative changes (vs.

Estimation of Physiological Vocal Features from Neck Surface Acceleration Signals Using Probabilistic Bayesian Neural Networks.

IEEE Trans Audio Speech Lang Process (2025)

April 2025

Department of Electronic Engineering and with the Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile.

Joaquín Sepúlveda , Jesús A Parra , Emiro J Ibarra , Mauricio Araya , Patricio De La Cuadra

This study presents a novel application of a Probabilistic Bayesian Neural Network (PBNN) for estimating vocal function variables and enhancing non-invasive ambulatory voice monitoring by addressing aleatoric and epistemic uncertainties in regression tasks. The proposed PBNN allows for estimating key physiological parameters including subglottal pressure, vocal fold contact pressure, thyroarytenoid, and cricothyroid muscle activations, from seven aerodynamic and acoustic features. The PBNN is trained on the Triangular Body-Cover Model (TBCM) of the vocal folds to produce a non-linear inverse mapping between its inputs and outputs.

Deflationary Extraction Transformer for Speech Separation with Unknown Number of Talkers.

Sensors (Basel)

August 2025

School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea.

Sangwon Lee , Han-Gyu Kim , Gil-Jin Jang

Most speech separation techniques require knowing the number of talkers mixed in an input, which is not always available in real situations. To address this problem, we present a novel speech separation method that automatically finds the number of talkers in input mixture recordings. The proposed method extracts the voices of individual talkers one by one in a deflationary manner and stops the extraction sequence when a predefined termination criterion is satisfied.
