Echo-aware room impulse response generation.

Seongrae Kim, Jae-Hyoun Yoo, Jung-Woo Choi

J Acoust Soc Am

School of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon, 34141, South Korea.

Published: July 2024

Real-time applications such as interactive virtual reality need low-complexity simulation of room impulse responses in highly complex virtual scenes, which remains challenging. In particular, simulating late reverberation with physically based acoustic modeling is computationally expensive, whereas early reflections can be modeled by simpler techniques such as the image source method. To address this computational cost, we propose a neural network-based hybrid artificial reverberation framework (Echo2Reverb) that generates late reverberation from given early reflections. The proposed model controls both the temporal texture and the frequency-dependent energy decay of the generated reverberation, i.e., its echo density and spectral energy distribution. It does so by extracting spectral and echo-related features and using the estimated features to filter sampled sparse sequences and Gaussian noise. To support end-to-end training with controlled echo density, a differentiable approximation of the normalized echo density profile is proposed. We train and test the model not only on nearly diffuse late reverberation but also on late reverberation with prominent distinct echoes, such as flutter echoes in narrow corridors. Evaluation results show that the proposed model can accurately reproduce the frequency-dependent energy decay and temporal texture of a room impulse response using only early reflections.
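
The abstract does not give the form of the differentiable approximation, so the following is only an illustrative sketch and not the authors' method. It computes the normalized echo density profile in its usual form (the windowed fraction of samples whose magnitude exceeds the local standard deviation, normalized by the value expected for Gaussian noise) and, as an assumption, relaxes the hard threshold with a sigmoid so that the profile becomes smooth in the samples. The function name `echo_density_profile`, the Hann window, the window length, and the `sharpness` parameter are all hypothetical choices made for this example.

```python
import math
import numpy as np

# Fraction of Gaussian samples lying more than one standard deviation
# from the mean (about 0.3173); used to normalize the profile.
ERFC_NORM = math.erfc(1.0 / math.sqrt(2.0))


def echo_density_profile(h, win_len=1025, sharpness=None):
    """Normalized echo density profile of an impulse response h.

    sharpness=None uses the hard threshold |h| > sigma.
    A positive sharpness replaces the threshold with a sigmoid
    (an assumed relaxation, not taken from the paper).
    Assumes len(h) >= win_len.
    """
    h = np.asarray(h, dtype=float)
    w = np.hanning(win_len)
    w /= w.sum()  # unit-sum window

    # Local (windowed) standard deviation around each sample.
    sigma = np.sqrt(np.convolve(h ** 2, w, mode="same"))
    sigma = np.maximum(sigma, 1e-12)  # avoid division by zero

    if sharpness is None:
        outlier = (np.abs(h) > sigma).astype(float)
    else:
        z = sharpness * (np.abs(h) / sigma - 1.0)
        outlier = 1.0 / (1.0 + np.exp(-z))  # soft indicator

    # Windowed fraction of outliers, normalized to ~1 for Gaussian noise.
    return np.convolve(outlier, w, mode="same") / ERFC_NORM


rng = np.random.default_rng(0)
diffuse = rng.standard_normal(20000)   # dense, noise-like tail
sparse = np.zeros(20000)
sparse[::500] = 1.0                    # isolated echoes

mid = slice(2000, 18000)               # ignore window edge effects
print(echo_density_profile(diffuse)[mid].mean())
print(echo_density_profile(diffuse, sharpness=50.0)[mid].mean())
print(echo_density_profile(sparse)[mid].mean())
```

In this sketch a noise-like (diffuse) signal yields values near 1, while a sequence of isolated echoes yields values near 0. The sigmoid version tracks the hard-threshold version closely when `sharpness` is large, which is the property a training loss on echo density would rely on.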

DOI: http://dx.doi.org/10.1121/10.0027931

Publication Analysis

Top Keywords

room impulse

early reflections

echo density

impulse response

late reverberation

proposed model

temporal texture

frequency-dependent energy

energy decay

echo-aware room

Similar Publications

Episodic memory and inhibitory control in people who inject substances results of cohort COSINUS study.

BMC Psychiatry

September 2025

Centre d'étude Des Mouvements Sociaux (Inserm U1276, UMR CNRS 8044, EHESS/Paris), Paris, France.

Laurence Lalanne, Sébastien Kirchherr, Martin Audran, Sebastien de Dinechin, Naomi Hamelin

Background: Cognitive disorders associated with addictive disorders are well established in the literature for numerous substances and behaviours. Very few studies have examined the effect of polydrug use on cognitive functioning. These studies have focused on the cognitive effect of one substance among others in very small samples.

Vascular regenerative deficiencies in people with elevated lipoprotein(a): the Lp(a)-VRCE CardioLink-16 translational study.

Cardiovasc Res

August 2025

Division of Cardiac Surgery, St Michael's Hospital of Unity Health Toronto, 30 Bond Street, Toronto, ON, Canada M5B 1W5.

Michael Moroney, Jack H Casey, Hwee Teoh, Aishwarya Krishnaraj, Yi Pan

Aims: Lipoprotein(a) [Lp(a)] is a causal risk factor for atherosclerotic cardiovascular disease (ASCVD); however, the relationship between Lp(a) and the capacity for vascular repair remains unclear. Depletion of vascular regenerative (VR) progenitor cells has been shown to be a novel indicator of compromised vascular repair in people living with cardiometabolic disorders. The purpose of this study was to determine if elevated levels of Lp(a) modify VR cell content properties.

Cropping room impulse responses using unimodal regression of their covariance.

JASA Express Lett

August 2025

Multimedia Communications and Signal Processing, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen,

Karolina Prawda, Nils Meyer-Kahlen, Sebastian J Schlecht

The presence of unavoidable background noise limits the signal-to-noise ratio in measured room impulse responses (RIRs). A common solution is to crop the RIR to the time interval where the signal dominates the background noise, but finding the correct onset and truncation points is challenging. It usually requires estimating the sound decay rate and noise floor, which is burdened with uncertainty.

Robust frame-level speaker localization guided by multi-channel speech enhancement and inter-channel phase-difference losses.

J Acoust Soc Am

July 2025

Department of Computer Science and Engineering, The Ohio State University, Columbus, Ohio 43210, USA.

Shanmukha Srinivas Battula, Hassan Taherian, Ashutosh Pandey, Daniel Wong, Buye Xu

In the presence of room reverberation and background noise, the performance of frame-level speaker localization is severely limited. To address this challenge, this study performs multi-channel speech enhancement based on complex spectral mapping (CSM), followed by direction-of-arrival (DOA) estimation using weighted generalized cross-correlation with phase transform (GCC-PHAT). The proposed approach differs from prevailing deep learning methods that operate on multi-channel inputs directly for speaker localization.

Robust regularized blind system identification with application to adaptive speech dereverberation.

J Acoust Soc Am

July 2025

School of Information and Control Engineering and Robot Technology Used for Special Environment Key Laboratory of Sichuan province, Southwest University of Science and Technology, Mianyang 621010, China.

Zhimin Qiu, Hongsen He, Jingdong Chen, Jacob Benesty, Yi Yu

The normalized multichannel frequency-domain least-mean square (NMCFLMS) algorithm is a prominent method for blind identification of multichannel acoustic systems. However, the NMCFLMS algorithm relies on a constant, determined by a block of microphone signals, to define the regularization parameter. This setup makes the algorithm sensitive to variations in speech segments and noise conditions.
