Multitalker speech perception with ideal time-frequency segregation: effects of voice characteristics and number of talkers.

J Acoust Soc Am

Air Force Research Laboratory, Human Effectiveness Directorate, Wright-Patterson AFB, Ohio 45433, USA.

Published: June 2009


Citations

20

Article Abstract

When a target voice is masked by an increasingly similar masker voice, increases in energetic masking are likely to occur due to increased spectro-temporal overlap in the competing speech waveforms. However, the impact of this increase may be obscured by informational masking effects related to the increased confusability of the target and masking utterances. In this study, the effects of target-masker similarity and the number of competing talkers on the energetic component of speech-on-speech masking were measured with an ideal time-frequency segregation (ITFS) technique that retained all the target-dominated time-frequency regions of a multitalker mixture but eliminated all the time-frequency regions dominated by the maskers. The results show that target-masker similarity has a small but systematic impact on energetic masking, with roughly a 1 dB release from masking for same-sex maskers versus same-talker maskers and roughly an additional 1 dB release from masking for different-sex masking voices. The results of a second experiment measuring ITFS performance with up to 18 interfering talkers indicate that energetic masking increased systematically with the number of competing talkers. These results suggest that energetic masking differences related to target-masker similarity have a much smaller impact on multitalker listening performance than energetic masking effects related to the number of competing talkers in the stimulus and non-energetic masking effects related to the confusability of the target and masking voices.
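The ITFS processing described above can be illustrated with a minimal sketch. It assumes the target and the maskers are already available as separate time-frequency representations of the same shape; the abstract does not state which filterbank or local signal-to-noise criterion the study used, so the function names, the `criterion_db` parameter, and the 0 dB default are hypothetical choices made for illustration only.

```python
import numpy as np

def ideal_binary_mask(target_tf, masker_tf, criterion_db=0.0):
    """Return True for each time-frequency unit where the target dominates.

    target_tf, masker_tf: arrays of equal shape (frequency x time) holding
    the target and the summed maskers before mixing.
    criterion_db: assumed local SNR threshold (hypothetical parameter).
    """
    eps = 1e-12  # avoids division by zero and log of zero in silent units
    target_pow = np.abs(target_tf) ** 2
    masker_pow = np.abs(masker_tf) ** 2
    local_snr_db = 10.0 * np.log10((target_pow + eps) / (masker_pow + eps))
    return local_snr_db > criterion_db

def apply_itfs(mixture_tf, mask):
    """Keep target-dominated units of the mixture and zero out the rest."""
    return np.where(mask, mixture_tf, 0.0)

# Toy 2 x 2 example (made-up values, not data from the study).
target = np.array([[1.0, 0.1], [0.5, 2.0]])
masker = np.array([[0.2, 1.0], [0.5, 0.1]])
mixture = target + masker

mask = ideal_binary_mask(target, masker)
segregated = apply_itfs(mixture, mask)
```

In this toy case the unit where target and masker are equal falls exactly at 0 dB and is discarded, because the mask keeps only units strictly above the criterion. Whatever intelligibility loss remains after this kind of processing reflects the target energy lost to masker-dominated regions, which is why the technique isolates the energetic component of masking.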

Source
http://dx.doi.org/10.1121/1.3117686

Similar Publications

Simulation of unilateral and bilateral cochlear implants on spatial speech-in-noise tasks.

J Acoust Soc Am

September 2025

Audiology Department, College of Health and Life Sciences, Aston University, Birmingham, B4 7ET, United Kingdom.

The current study simulated bilateral and unilateral cochlear implant (CI) processing using a channel vocoder with dense tonal carriers ("SPIRAL") in 13 normal-hearing listeners. Their recognition of spatialized speech in noise was measured for three masker locations (0°, +90°, and -90°; target at 0°) and three types of maskers (steady-state noise, speech-modulated noise, and a single-talker interferer), which differed in their levels of energetic and informational masking. The stimuli were spatialized using head-related impulse responses recorded from the behind-the-ear microphones of hearing aids.


In multi-source environments, rhythmic regularities in both to-be-attended signals (targets), as well as to-be-ignored signals (backgrounds) have been found to influence selective listening across a variety of stimuli and listening conditions. Specifically, regular rhythmic structures facilitate recognition of target signals, and background signals with regular rhythmic structures are more effective maskers than irregular backgrounds. The current study focused on the background rhythm effect and assessed to what degree it depends on the perceptual similarity between the target and background signals, and its dependence on listener age.


Four-dimensional (4D) flow MRI has shown promise for the assessment of aortic hemodynamics. However, data analysis traditionally requires manual and time-consuming human input at several stages. This limits reproducibility and affects analysis workflows, such that large-cohort 4D flow studies are lacking.


Speech recognition under masking: Age, hearing, and machine learning classification.

Acta Psychol (Amst)

September 2025

Disability Research Division, Department of Behavioural Sciences and Learning, Linköping University, Linköping, Sweden.

In this study, we seek to empirically evaluate whether maskers can be categorically grouped into energetic and informational using machine learning classification techniques. The study further aimed to examine how age and hearing ability affect speech reception thresholds (SRTs) using different speech materials and masker types (energetic vs. informational).


Filling the blanks of checkerboard speech with noise: Evidence for phonemic restoration and masking.

J Acoust Soc Am

August 2025

Department of Acoustic Design, Faculty of Design/Research Center for Applied Perceptual Science, Kyushu University, 4-9-1 Shiobaru, Minami-ku, Fukuoka 815-8540, Japan.

Sixteen-band checkerboard speech (interrupted in time and frequency) is perfectly intelligible. Whereas two- and four-band checkerboard speech is usually less intelligible than speech interrupted only in time [Ueda et al., J.
