In challenging conditions such as low signal-to-noise ratios and distant speech, the accuracy of microphone-based automatic speech recognition (ASR) degrades. To remedy this, laser Doppler vibrometer (LDV) technology is integrated into the ASR system, and a data augmentation approach is employed to generate training data containing LDV attributes. Measured by word error rate, the ASR system trained with the data augmentation approach outperformed the baseline ASR system trained solely on real LDV data.
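Word error rate, the metric named above, is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch under that conventional definition; the function name `word_error_rate` and the whitespace tokenization are illustrative assumptions, not details taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For instance, comparing the reference `"a b c d"` with the hypothesis `"a x c"` counts one substitution and one deletion over four reference words. Because insertions are counted, the value can exceed 1 when the hypothesis is much longer than the reference.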
Annu Int Conf IEEE Eng Med Biol Soc
July 2024
Cochlear implants (CI) play a crucial role in restoring hearing for individuals with severe-to-profound hearing loss. However, challenges persist, particularly at low signal-to-noise ratios and in distant-talk scenarios. This study introduces a solution that integrates a laser Doppler vibrometer (LDV) with deep learning to reconstruct clean speech from unknown speakers in noisy conditions.
Objective: Ambulatory phonation monitoring (APM) has a long and evolving history. Current devices mostly use a contact microphone or accelerometer placed over the anterior neck, which limits their general acceptance outside of academic purposes. This study applied wireless Bluetooth earphones to receive voice signals.
IEEE Trans Neural Syst Rehabil Eng
December 2023
Dysarthria, a speech disorder often caused by neurological damage, compromises patients' control of the vocal muscles, making their speech unclear and communication difficult. Recently, voice-driven methods have been proposed to improve the speech intelligibility of patients with dysarthria. However, most methods require a substantial corpus from both the patient and the target speaker, which is problematic.
IEEE Trans Biomed Eng
December 2023
Objective: Although many speech enhancement (SE) algorithms have been proposed to improve speech perception in hearing-impaired patients, conventional SE approaches that perform well in quiet and/or under stationary noise fail under nonstationary noise and/or when the speaker is at a considerable distance. Therefore, the objective of this study is to overcome the limitations of conventional SE approaches.
Method: This study proposes a speaker-closed deep learning-based SE method together with an optical microphone to acquire and enhance the speech of a target speaker.
Objective: Doctors currently rely primarily on auditory-perceptual evaluation, such as the grade, roughness, breathiness, asthenia, and strain scale, to evaluate voice quality and determine treatment. However, the ratings given by individual physicians often differ because of subjective perception and the time interval between diagnoses, especially when the patient's symptoms are hard to judge. Therefore, an accurate computerized pathological voice quality assessment system would improve the quality of assessment.
Background: The population of young adults who are hearing impaired increases yearly, and a device that enables convenient hearing screening could help monitor their hearing. However, background noise is a critical issue that limits the capabilities of such a device. Therefore, this study evaluated the effectiveness of commercial active noise cancellation (ANC) headphones for hearing screening applications in the presence of background noise.
Sensors (Basel)
September 2022
Active noise cancellation (ANC) technology has been used to mitigate the effects of environmental noise on audiometric results. However, objective evaluation methods supporting the accuracy of audiometry with ANC under different noise levels have not been reported. Accordingly, the audio characteristics of three different ANC headphone models were quantified under different noise conditions, and the feasibility of ANC in noisy environments was investigated.
Annu Int Conf IEEE Eng Med Biol Soc
July 2022
Envelope waveforms can be extracted from multiple frequency bands of a speech signal, and these envelopes carry important intelligibility information for human speech communication. This study aimed to investigate whether a deep learning-based model using temporal envelope features could synthesize intelligible speech, and to study the effect of reducing the number of temporal envelopes (from 8 to 2 in this work) on the intelligibility of the synthesized speech. The objective evaluation metric of short-time objective intelligibility (STOI) showed that, on average, the synthesized speech of the proposed approach provided higher STOI (i.
Comput Methods Programs Biomed
March 2022
Background and Objective: Most dysarthric patients encounter communication problems due to unintelligible speech. Currently, there are many voice-driven systems aimed at improving their speech intelligibility; however, the intelligibility performance of these systems is affected by challenging application conditions (e.g.
Background: Cochlear implant technology is a well-known approach to help deaf individuals hear speech again and can improve speech intelligibility in quiet conditions; however, it still has room for improvement in noisy conditions. More recently, it has been proven that deep learning-based noise reduction, such as noise classification and deep denoising autoencoder (NC+DDAE), can benefit the intelligibility performance of patients with cochlear implants compared to classical noise reduction algorithms.
Objective: Following the successful implementation of the NC+DDAE model in our previous study, this study aimed to propose an advanced noise reduction system using knowledge transfer technology, called NC+DDAE_T; examine the proposed NC+DDAE_T noise reduction system using objective evaluations and subjective listening tests; and investigate which layer substitution of the knowledge transfer technology in the NC+DDAE_T noise reduction system provides the best outcome.
JMIR Mhealth Uhealth
December 2020
Background: Voice disorders mainly result from chronic overuse or abuse, particularly in occupational voice users such as teachers. Previous studies proposed a contact microphone attached to the anterior neck for ambulatory voice monitoring; however, the inconvenience associated with taping and wiring, along with the lack of real-time processing, has limited its clinical application.
Objective: This study aims to (1) propose an automatic speech detection system using wireless microphones for real-time ambulatory voice monitoring, (2) examine the detection accuracy in controlled and noisy environments, and (3) report the phonation ratio in practical scenarios.
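As a rough illustration of the phonation ratio mentioned above, the sketch below takes it to be the fraction of monitored frames in which voice is detected. That definition, the simple energy-threshold detector, and the names `frame_energies` and `phonation_ratio` are all assumptions made for illustration; they stand in for, and are not, the speech detection system the study proposes.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame.

    Trailing samples that do not fill a whole frame are ignored.
    """
    return [
        sum(x * x for x in samples[start:start + frame_len]) / frame_len
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def phonation_ratio(samples, frame_len, threshold):
    """Fraction of frames whose energy exceeds `threshold`.

    Assumption: a frame counts as phonation when its mean energy is above
    a fixed threshold. A real detector would need to be more robust to noise.
    """
    energies = frame_energies(samples, frame_len)
    if not energies:
        raise ValueError("signal is shorter than one frame")
    voiced = sum(1 for energy in energies if energy > threshold)
    return voiced / len(energies)


# Toy signal: alternating silent and loud frames of four samples each.
signal = [0.0] * 4 + [1.0] * 4 + [0.0] * 4 + [1.0] * 4
ratio = phonation_ratio(signal, frame_len=4, threshold=0.5)
```

With the toy signal, two of the four frames exceed the threshold, so half of the monitored time is counted as phonation.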