The normalized multichannel frequency-domain least-mean-square (NMCFLMS) algorithm is a prominent method for blind identification of multichannel acoustic systems. However, the NMCFLMS algorithm relies on a constant, determined by a block of microphone signals, to define the regularization parameter. This setup makes the algorithm sensitive to variations in speech segments and noise conditions. In this paper, we propose a variable regularization parameter that incorporates key factors, such as signal-to-noise ratio, output signal power, and filter length, to enhance the robustness of the algorithm against additive noise and the non-stationary nature of speech. Additionally, we introduce a mechanism to update the regularization parameter based on the mean-squared error of the adaptive filter, improving the ability of the algorithm to track time-varying systems. The proposed variable-regularization NMCFLMS algorithm is then applied to speech dereverberation using the multichannel input-output inverse theorem method. Simulation results, using room impulse responses measured in real acoustic environments, demonstrate the effectiveness of the approach in both multichannel blind identification and speech dereverberation.
DOI: http://dx.doi.org/10.1121/10.0037195
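The abstract above hinges on replacing NMCFLMS's fixed regularization constant with one that tracks the noise conditions. As a rough illustration only, here is a minimal single-channel, frequency-domain NLMS-style sketch in which the regularization term grows with an assumed noise-power estimate and the filter length; the actual NMCFLMS update is multichannel and built on cross-relations between microphone signals, and the scaling used here is a guess at the idea, not the paper's formula.

```python
# Hedged sketch, NOT the paper's algorithm: a single-channel frequency-domain
# NLMS-style block update in which the regularization term varies with an
# externally supplied noise-power estimate instead of being a fixed constant.
# Overlap-save constraints are omitted for brevity, so this uses circular
# convolution; `noise_power` would come from a hypothetical noise tracker.
import numpy as np

def fd_nlms_step(W, X, d, noise_power, mu=0.5, L=None):
    """One block update of a frequency-domain adaptive filter.

    W           : filter weights in the frequency domain (complex, length N)
    X           : FFT of the current input block (complex, length N)
    d           : desired time-domain block (length N)
    noise_power : running estimate of the additive-noise power
    """
    N = len(X)
    L = L or N
    y = np.fft.ifft(W * X).real        # filter output for this block
    e = d - y                          # block error signal
    E = np.fft.fft(e)
    # Variable regularization: grows with noise power and filter length,
    # damping the update when the block SNR is poor.
    delta = L * noise_power
    P = np.abs(X) ** 2                 # per-bin input power
    W = W + mu * np.conj(X) * E / (P + delta)
    return W, e

# Toy call with random data
rng = np.random.default_rng(0)
N = 256
W = np.zeros(N, dtype=complex)
x, d = rng.standard_normal(N), rng.standard_normal(N)
W, e = fd_nlms_step(W, np.fft.fft(x), d, noise_power=1e-2)
```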
Comput Speech Lang
January 2025
The Ohio State University, 281 W Lane Ave, Columbus, OH 43210, United States.
Deep learning has led to dramatic performance improvements for the task of speech enhancement, where deep neural networks (DNNs) are trained to recover clean speech from noisy and reverberant mixtures. Most existing DNN-based algorithms operate in the frequency domain, as time-domain approaches are believed to be less effective for speech dereverberation. In this study, we employ two DNNs, ARN (attentive recurrent network) and DC-CRN (densely-connected convolutional recurrent network), and systematically investigate the effects of design choices such as window size, loss function, and feature representation on enhancement performance.
Neural Netw
September 2025
National Engineering Research Center of Speech and Language Information Processing, University of Science and Technology of China, Hefei, China. Electronic address:
Phase information has a significant impact on speech perceptual quality and intelligibility. However, existing speech enhancement methods encounter limitations in explicit phase estimation due to the non-structural nature and wrapping characteristics of the phase, leading to a bottleneck in enhanced speech quality. To overcome this issue, we propose MP-SENet, a novel Speech Enhancement Network that explicitly enhances Magnitude and Phase spectra in parallel.
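The wrapping problem this abstract mentions is easy to see in a few lines. The snippet below is a toy illustration, not the MP-SENet model: it wraps phase onto its principal interval and rebuilds a complex spectrum from separately estimated magnitude and phase, showing why a naive regression loss on raw phase is misleading.

```python
# Toy illustration of phase wrapping; this is not the MP-SENet model.
import numpy as np

def wrap_phase(phi):
    """Map any phase value onto the principal interval (-pi, pi]."""
    return np.angle(np.exp(1j * phi))

def recombine(mag, phase):
    """Build a complex spectrum from separately estimated magnitude and phase."""
    return mag * np.exp(1j * phase)

# Two phases 2*pi apart describe the same angle, yet differ by 2*pi as raw
# regression targets; a naive L2 loss on unwrapped phase would penalize them.
phi_true, phi_pred = 3.0, 3.0 + 2 * np.pi
print(wrap_phase(phi_pred - phi_true))   # ~0.0: no actual phase error
print(recombine(0.8, 2.5))               # one complex STFT bin from the pair
```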
Sensors (Basel)
January 2025
Andrew and Erna Viterbi Faculty of Electrical & Computer Engineering, Technion-Israel Institute of Technology, Haifa 3200003, Israel.
Deep learning has revolutionized speech enhancement, enabling high-quality noise reduction and dereverberation. However, state-of-the-art methods often demand substantial computational resources, hindering their deployment on edge devices and in real-time applications. Computationally efficient approaches such as deep filtering and DeepFilterNet offer an attractive alternative by predicting linear filters instead of directly estimating the clean speech.
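As a rough sketch of the deep-filtering idea named in this abstract: instead of one multiplicative gain per time-frequency bin, a small complex filter is applied across neighboring STFT frames. The filters below are placeholder arrays that a DNN would predict in practice; this illustrates the operation, not the DeepFilterNet implementation.

```python
# Sketch of the deep-filtering operation; the filters H are placeholders that
# a DNN would predict per time-frequency bin in a real system.
import numpy as np

def deep_filter(X, H):
    """Apply per-bin complex filters over the current and past STFT frames.

    X : noisy STFT, shape (F, T)
    H : filters, shape (F, T, N) -- taps over frames t, t-1, ..., t-N+1
    """
    F, T = X.shape
    N = H.shape[-1]
    Y = np.zeros((F, T), dtype=complex)
    for tau in range(N):
        pad = np.zeros((F, tau), dtype=complex)
        X_shift = np.concatenate([pad, X[:, :T - tau]], axis=1)  # delay by tau
        Y += H[..., tau] * X_shift
    return Y

# A plain mask is the special case N == 1 with real, non-negative H.
F, T, N = 4, 10, 2
X = np.random.randn(F, T) + 1j * np.random.randn(F, T)
H = np.zeros((F, T, N), dtype=complex)
H[..., 0] = 1.0                        # identity filter, so Y == X
assert np.allclose(deep_filter(X, H), X)
```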
PLoS One
July 2024
Department of Electrical Engineering, MCS, NUST, Islamabad, Pakistan.
Speech enhancement is crucial for both human and machine listening applications. Over the last decade, the use of deep learning for speech enhancement has resulted in tremendous improvement over classical signal processing and machine learning methods. However, training a deep neural network is not only time-consuming but also requires extensive computational resources and a large training dataset.