Effects of Spectral Degradation on Attentional Modulation of Cortical Auditory Responses to Continuous Speech.

J Assoc Res Otolaryngol

College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Zhejiang, China.

Published: December 2015


Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

This study investigates the effect of spectral degradation on cortical speech encoding in complex auditory scenes. Young normal-hearing listeners were simultaneously presented with two speech streams and were instructed to attend to only one of them. The speech mixtures were subjected to noise-channel vocoding to preserve the temporal envelope and degrade the spectral information of speech. Each subject was tested with five spectral resolution conditions (unprocessed speech, 64-, 32-, 16-, and 8-channel vocoder conditions) and two target-to-masker ratio (TMR) conditions (3 and 0 dB). Ongoing electroencephalographic (EEG) responses and speech comprehension were measured in each spectral and TMR condition for each subject. Neural tracking of each speech stream was characterized by cross-correlating the EEG responses with the envelope of each of the simultaneous speech streams at different time lags. Results showed that spectral degradation and TMR both significantly influenced how top-down attention modulated the EEG responses to the attended and unattended speech. That is, the EEG responses to the attended and unattended speech streams differed more for the higher (unprocessed, 64 ch, and 32 ch) than the lower (16 and 8 ch) spectral resolution conditions, as well as for the higher (3 dB) than the lower (0 dB) TMR condition. The magnitude of differential neural modulation responses to the attended and unattended speech streams significantly correlated with speech comprehension scores. These results suggest that severe spectral degradation and low TMR hinder speech stream segregation, making it difficult to employ top-down attention to differentially process different speech streams.
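The noise-channel vocoding step lends itself to a compact illustration. Below is a minimal Python sketch of a generic noise vocoder, not the authors' processing chain: the log-spaced channel edges, fourth-order Butterworth analysis bands, and Hilbert envelopes are all assumptions, since the abstract does not specify filter parameters.

```python
# A minimal noise-channel vocoder sketch (generic design; all constants are
# assumptions, not the study's parameters). Requires numpy and scipy.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(speech, fs, n_channels=8, f_lo=80.0, f_hi=8000.0):
    """Replace spectral detail with band-limited noise, keeping per-band envelopes.

    Assumes fs is well above 2 * f_hi so the band edges are valid.
    """
    # Log-spaced channel edges between f_lo and f_hi (a common convention).
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_channels + 1)
    noise = np.random.randn(len(speech))          # white-noise carrier
    out = np.zeros_like(speech, dtype=float)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)           # analysis band of the speech
        env = np.abs(hilbert(band))               # temporal envelope (Hilbert magnitude)
        carrier = sosfiltfilt(sos, noise)         # noise limited to the same band
        out += env * carrier                      # envelope-modulated noise band
    return out / (np.max(np.abs(out)) + 1e-12)    # normalize to avoid clipping
```

With n_channels=8 only coarse spectral structure survives; raising n_channels toward 64 approaches the spectral resolution of the unprocessed condition while the temporal envelope is preserved throughout.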
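Likewise, the neural-tracking measure, cross-correlating the EEG with each speech envelope over a range of time lags, can be sketched directly. The array names and the ±500 ms lag window below are hypothetical:

```python
# A minimal sketch of lag-resolved cross-correlation between one EEG channel
# and a speech envelope, both assumed sampled at the same rate fs.
# Positive lags mean the EEG follows (lags behind) the stimulus.
import numpy as np

def lagged_xcorr(eeg, envelope, fs, max_lag_s=0.5):
    eeg = (eeg - eeg.mean()) / eeg.std()          # z-score both signals
    env = (envelope - envelope.mean()) / envelope.std()
    max_lag = int(max_lag_s * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    r = np.empty(len(lags))
    n = len(eeg)
    for i, lag in enumerate(lags):
        if lag >= 0:
            a, b = eeg[lag:], env[:n - lag]       # EEG shifted later in time
        else:
            a, b = eeg[:n + lag], env[-lag:]      # EEG shifted earlier in time
        r[i] = np.dot(a, b) / len(a)              # correlation at this lag
    return lags / fs, r
```

Computing this once against the attended envelope and once against the unattended envelope, per condition, gives the differential modulation measure the abstract relates to comprehension scores.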


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4636590
DOI Listing: http://dx.doi.org/10.1007/s10162-015-0540-x

Publication Analysis

Top Keywords

speech streams: 20
spectral degradation: 16
eeg responses: 16
speech: 15
responses attended: 12
attended unattended: 12
unattended speech: 12
spectral resolution: 8
resolution conditions: 8
speech comprehension: 8

Similar Publications

Objectives: In recent years, there has been a profound increase in the use of remote online communication as a supplement to, and in many cases a replacement for, in-person interactions. While online communication tools hold potential to improve accessibility, previous studies have suggested that increased reliance on remote communication poses additional challenges for people with hearing loss, including those with a cochlear implant (CI). This study aimed to investigate the preferences and speech-reception performance of adults with a CI during online communication.


Speech perception typically takes place against a background of other speech or noise. The present study investigates the effectiveness of segregating speech streams within a competing speech signal, examining whether cues such as pitch, which typically denote a difference in talker, behave in the same way as cues such as speaking rate, which typically do not denote the presence of a new talker. Native English speakers listened to English target speech within English two-talker babble of a similar or different pitch and/or a similar or different speaking rate, to identify whether mismatched properties between target speech and masker babble improve speech segregation.
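Masker manipulations of this kind can be illustrated with librosa's pitch-shift and time-stretch utilities. This is a hypothetical sketch of stimulus preparation, not the study's materials; the file name, the 4-semitone shift, and the 20% rate change are assumptions:

```python
# Hypothetical masker manipulation: alter the babble's pitch and speaking
# rate independently of the target speech. Requires librosa.
import librosa

babble, sr = librosa.load("two_talker_babble.wav", sr=None)  # hypothetical file
# Pitch mismatch: shift the masker up 4 semitones without changing its duration.
babble_pitch = librosa.effects.pitch_shift(babble, sr=sr, n_steps=4)
# Rate mismatch: speed the masker up 20% without changing its pitch.
babble_rate = librosa.effects.time_stretch(babble, rate=1.2)
```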


Bangla Speech Emotion Recognition Using Deep Learning-Based Ensemble Learning and Feature Fusion.

J Imaging

August 2025

Centre for Image and Vision Computing (CIVC), COE for Artificial Intelligence, Faculty of Artificial Intelligence and Engineering (FAIE), Multimedia University, Cyberjaya 63100, Selangor, Malaysia.

Emotion recognition in speech is essential for enhancing human-computer interaction (HCI) systems. Despite progress in Bangla speech emotion recognition, challenges remain, including low accuracy, speaker dependency, and poor generalization across emotional expressions. Previous approaches often rely on traditional machine learning or basic deep learning models, struggling with robustness and accuracy in noisy or varied data.
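Feature fusion of the kind named in the title is commonly implemented by concatenating complementary acoustic feature vectors before classification. A minimal illustrative sketch, assuming librosa and generic MFCC/chroma features rather than the paper's actual pipeline:

```python
# Generic feature-fusion sketch (assumed features, not the paper's pipeline).
import numpy as np
import librosa

y, sr = librosa.load("bangla_clip.wav", sr=16000)    # hypothetical file
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40).mean(axis=1)   # (40,)
chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)     # (12,)
fused = np.concatenate([mfcc, chroma])               # fused feature vector (52,)
# The fused vector would then feed the classifier(s) in an ensemble.
```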


A lightweight ECA-based DCNN approach for speech command recognition.

Comput Biol Med

August 2025

Dept. of ECE, Mepco Schlenk Engineering College, Sivakasi, Tamil Nadu, India.

Background: Speech recognition transcribes spoken audio streams and is used by professionals across a range of industries that require accurate transcriptions. In the context of authentication, speech recognition can serve as a biometric factor to verify a user's identity, and it can be especially helpful for individuals with disabilities, particularly those with speech impairments.
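The ECA (Efficient Channel Attention) block named in the title is, in its generic form, a global average pool followed by a cheap 1-D convolution across channels. A minimal PyTorch sketch of that standard ECA-Net design, not necessarily the authors' exact network:

```python
# Generic Efficient Channel Attention (ECA) block (standard design sketch).
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                 # global average pool
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                              padding=k_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                   # x: (B, C, H, W)
        y = self.pool(x)                                    # (B, C, 1, 1)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))      # 1-D conv across channels
        y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1)) # per-channel weights
        return x * y.expand_as(x)                           # reweight feature maps

# Usage: attach after a convolutional stage of a DCNN.
feats = torch.randn(2, 64, 32, 32)
out = ECA()(feats)                                          # same shape as input
```

The appeal for lightweight speech-command models is that the block adds only a k-sized 1-D convolution, avoiding the fully connected layers of heavier attention designs.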


Concurrent vowel perception experiments have revealed the importance of fundamental frequency (f0) differences in speech stream segregation. Understanding neural processes that support speech streaming using f0 differences remains an active area of perceptual and neurocomputational modeling research. This study simultaneously measured subcortical neural encoding [frequency following responses (FFRs)] and cued vowel identification accuracy of 12 concurrent vowel mixtures with large f0 differences (>8 semitones) to assess whether f0-based neural channel selection predicted perception.
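The ">8 semitones" criterion follows the standard conversion between a frequency ratio and semitones, 12 * log2(f1/f0). A quick check in Python (illustrative, not study code):

```python
# Semitone distance between two fundamental frequencies (standard formula).
import math

def semitones(f0_a, f0_b):
    return 12.0 * math.log2(f0_b / f0_a)

# e.g., 100 Hz vs 160 Hz -> ~8.1 semitones, just past the >8-semitone criterion
print(semitones(100.0, 160.0))
```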
