Echo-aware room impulse response generation.

J Acoust Soc Am

School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, South Korea.

Published: July 2024


Article Abstract

Real-time applications such as interactive virtual reality environments need low-complexity simulation of room impulse responses in highly complex virtual scenes, which remains a challenging problem. In particular, simulating late reverberation with physically based acoustic modeling requires considerable computational effort, in contrast to early reflections, which can be modeled by simpler techniques such as the image source method. To address this computational complexity, we propose a neural network-based hybrid artificial reverberation framework (Echo2Reverb) that generates late reverberation from given early reflections. The proposed model controls both the temporal texture (echo density) and the frequency-dependent energy decay (spectral energy distribution) of the generated reverberation: it extracts spectral and echo-related features and uses the estimated features to filter sampled sparse sequences and Gaussian noise. To support end-to-end training with controlled echo density, a differentiable approximation of the normalized echo density profile is proposed. We train and test the model not only on nearly diffuse late reverberation but also on late reverberation with prominent distinct echoes, such as flutter echoes in narrow corridors. Evaluation results demonstrate that the proposed model can accurately reproduce the frequency-dependent energy decay and temporal texture of a room impulse response using only early reflections.
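
For readers unfamiliar with the normalized echo density profile mentioned in the abstract, the following is a minimal sketch of its commonly used windowed definition: the weighted fraction of samples whose magnitude exceeds the local standard deviation, divided by the value expected for Gaussian noise, erfc(1/sqrt(2)). The abstract does not describe the paper's differentiable approximation, so the sigmoid-smoothed variant here (the `sharpness` parameter) is only an assumed illustration of how the hard threshold might be relaxed. The function names, window length, and hop size are likewise hypothetical choices, not taken from the paper.

```python
import math
import random

# Expected fraction of Gaussian samples lying beyond one standard deviation (about 0.3173).
ERFC_NORM = math.erfc(1.0 / math.sqrt(2.0))


def hann_weights(length):
    """Strictly positive Hann-shaped weights that sum to 1."""
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * (n + 1) / (length + 1))
         for n in range(length)]
    total = sum(w)
    return [v / total for v in w]


def echo_density(h, win_len=255, hop=64, sharpness=None):
    """Normalized echo density profile of the sequence h.

    sharpness=None counts samples with a hard threshold |h| > sigma.
    A positive sharpness replaces the threshold with a sigmoid, which is
    one possible smooth relaxation (an assumption, not the paper's method).
    """
    w = hann_weights(win_len)
    profile = []
    for start in range(0, len(h) - win_len + 1, hop):
        frame = h[start:start + win_len]
        # Weighted local standard deviation of the frame.
        sigma = math.sqrt(sum(wi * x * x for wi, x in zip(w, frame)))
        if sigma == 0.0:
            profile.append(0.0)  # silent frame: no echoes to count
            continue
        acc = 0.0
        for wi, x in zip(w, frame):
            if sharpness is None:
                acc += wi if abs(x) > sigma else 0.0
            else:
                # Clamp the exponent so math.exp cannot overflow.
                z = max(-60.0, min(60.0, sharpness * (abs(x) / sigma - 1.0)))
                acc += wi / (1.0 + math.exp(-z))
        profile.append(acc / ERFC_NORM)
    return profile


if __name__ == "__main__":
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
    sparse = [1.0 if n % 500 == 0 else 0.0 for n in range(4096)]
    for name, sig in (("gaussian noise", noise), ("sparse echoes", sparse)):
        hard = echo_density(sig)
        soft = echo_density(sig, sharpness=50.0)
        print(name, sum(hard) / len(hard), sum(soft) / len(soft))
```

Under this definition, dense Gaussian-like reverberation gives values near 1, while a sparse train of distinct echoes gives values near 0. In a training setting the smoothed variant would be written in an automatic-differentiation framework rather than plain Python.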

Source
http://dx.doi.org/10.1121/10.0027931

Similar Publications

The presence of unavoidable background noise limits the signal-to-noise ratio in measured room impulse responses (RIRs). A common solution is to crop the RIR to the time interval where the signal dominates the background noise, but finding the correct onset and truncation points is challenging. It usually requires estimating the sound decay rate and noise floor, which is burdened with uncertainty.

In the presence of room reverberation and background noise, the performance of frame-level speaker localization is severely limited. To address this challenge, this study performs multi-channel speech enhancement based on complex spectral mapping (CSM), followed by direction-of-arrival (DOA) estimation using weighted generalized cross-correlation with phase transform (GCC-PHAT). The proposed approach differs from prevailing deep learning methods that operate on multi-channel inputs directly for speaker localization.

Robust regularized blind system identification with application to adaptive speech dereverberation.

J Acoust Soc Am

July 2025

School of Information and Control Engineering and Robot Technology Used for Special Environment Key Laboratory of Sichuan province, Southwest University of Science and Technology, Mianyang 621010, China.

The normalized multichannel frequency-domain least-mean square (NMCFLMS) algorithm is a prominent method for blind identification of multichannel acoustic systems. However, the NMCFLMS algorithm relies on a constant, determined by a block of microphone signals, to define the regularization parameter. This setup makes the algorithm sensitive to variations in speech segments and noise conditions.
