Talker Differences in Perceived Emotion in Clear and Conversational Speech.

J Speech Lang Hear Res

Department of Communication Sciences and Disorders, The University of Utah, Salt Lake City.

Published: March 2025


Article Abstract

Purpose: Previous work has shown that judgments of emotion differ between clear and conversational speech, particularly for perceived anger. The current study examines talker differences in perceived emotion for a database of talkers producing clear and conversational speech.

Method: A database of 41 talkers was used to assess talker differences in six emotion categories ("Anger," "Fear," "Disgust," "Happiness," "Sadness," and "Neutral"). Twenty-six healthy young adult listeners rated perceived emotion in 14 emotionally neutral sentences produced in clear and conversational styles by all talkers in the database. Generalized linear mixed-effects models were used to examine talker differences in all six emotion categories.
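
An analysis of this shape can be sketched on simulated data. The sketch below is hypothetical: it fits a linear mixed-effects model with statsmodels rather than the study's actual generalized model, and the column names (rating, style, talker) and effect sizes are illustrative, not taken from the study.

```python
# Hypothetical sketch of a talker-differences analysis: a linear
# mixed-effects model fit to simulated ratings. The study fit
# generalized linear mixed-effects models to real listener data;
# everything below is simulated for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_talkers, n_listeners = 41, 26  # matches the study's sample sizes

rows = []
for talker in range(n_talkers):
    baseline = rng.normal(0.0, 0.5)      # per-talker baseline rating
    clear_boost = rng.normal(0.8, 0.3)   # per-talker clear-speech effect
    for style in ("clear", "conversational"):
        for listener in range(n_listeners):
            mu = 2.0 + baseline + (clear_boost if style == "clear" else 0.0)
            rows.append({"talker": talker, "style": style,
                         "rating": mu + rng.normal(0.0, 0.4)})
df = pd.DataFrame(rows)

# A random intercept and a random style slope by talker capture the
# kind of Talker x Style variability the abstract reports.
model = smf.mixedlm("rating ~ style", df, groups=df["talker"],
                    re_formula="~style")
result = model.fit()
print(result.fe_params)
```

With "clear" as the reference level, a negative coefficient on style[T.conversational] corresponds to higher ratings in clear speech; the random-effects covariance estimates quantify how much the style effect varies across talkers.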

Results: There was a significant effect of speaking style for all emotion categories, and substantial talker differences existed after controlling for speaking style in all categories. Additionally, many emotion categories, including anger, had significant Talker × Style interactions. Perceived anger was significantly higher in clear speech compared to conversational speech for 85% of the talkers.

Conclusions: While there is a large speaking style effect for perceived anger, the magnitude of the effect varies between talkers. The perception of negatively valenced emotions in clear speech, including anger, may result in unintended interpersonal consequences for those using clear speech as a communication facilitator. Further research is needed to examine potential acoustic sources of perceived anger in clear speech.

Supplemental Material: https://doi.org/10.23641/asha.28304384.


Source
http://dx.doi.org/10.1044/2024_JSLHR-24-00325


Similar Publications

In the McGurk effect, incongruent auditory and visual syllables are perceived as a third, illusory syllable. The prevailing explanation for the effect is that the illusory syllable is a consensus percept intermediate between otherwise incompatible auditory and visual representations. To test this idea, we turned to a deep neural network known as AVHuBERT that transcribes audiovisual speech with high accuracy.


Simulation of unilateral and bilateral cochlear implants on spatial speech-in-noise tasks.

J Acoust Soc Am

September 2025

Audiology Department, College of Health and Life Sciences, Aston University, Birmingham, B4 7ET, United Kingdom.

The current study simulated bilateral and unilateral cochlear implant (CI) processing using a channel vocoder with dense tonal carriers ("SPIRAL") in 13 normal-hearing listeners. Spatial speech-in-noise recognition was measured under three masker locations (0°, +90°, and -90°; target at 0°) and three masker types (steady-state noise, speech-modulated noise, and a single-talker interferer), which contained different levels of energetic and informational masking. The stimuli were spatialized using head-related impulse responses recorded from behind-the-ear microphones of hearing aids.


While speech perception amidst competing talkers is well-studied, the perception of polyphonic music remains less explored. Pitch differences aid in source segregation, yet reductions in harmonicity have relatively little effect on speech intelligibility in such conditions. We hypothesized that source identification and segregation in music would rely more on harmonicity, given the central role of pitch in music and fewer alternative segregation cues, such as temporal incoherence.


Speech perception typically takes place against a background of other speech or noise. The present study investigates the effectiveness of segregating speech streams within a competing speech signal, examining whether cues such as pitch, which typically denote a difference in talker, behave in the same way as cues such as speaking rate, which typically do not denote the presence of a new talker. Native English speakers listened to English target speech within English two-talker babble of a similar or different pitch and/or a similar or different speaking rate to identify whether mismatched properties between target speech and masker babble improve speech segregation.


Talker-specificity beyond the lexicon: Recognition memory for spoken sentences.

Psychon Bull Rev

August 2025

Department of Linguistics, Stanford University, Building 460, Margaret Jacks Hall 450 Jane Stanford Way, Stanford, CA, 94305, USA.

Over the past 35 years, it has been established that mental representations of language include fine-grained acoustic details stored in episodic memory. The empirical foundations of this fact were established through a series of word recognition experiments showing that participants were better at remembering words repeated by the same talker than words repeated by a different talker (talker-specificity effect). This effect has been widely replicated, but exclusively with isolated, generally monosyllabic, words as the object of study.
