Real and synthetic Punjabi speech datasets for automatic speech recognition.

Data Brief

School of Mathematical and Computational Sciences, Massey University, Auckland, New Zealand.

Published: February 2024


Article Abstract

Automatic speech recognition (ASR) is an active area of research, and training on large annotated datasets is key to building robust ASR systems. However, most available datasets focus on high-resource languages such as English, leaving a significant gap for low-resource languages. Punjabi is one of these: despite its large number of speakers, it lacks high-quality annotated datasets for accurate speech recognition. To address this gap, we introduce three labeled Punjabi speech datasets: Punjabi Speech (a real speech dataset) and Google-synth and CMU-synth (synthesized speech datasets). The Punjabi Speech dataset consists of read speech recorded in various environments, including both studio and open settings. The Google-synth dataset is synthesized using Google's Punjabi text-to-speech cloud services, and the CMU-synth dataset is created using the Clustergen model available in the Festival speech synthesis system developed by CMU. These datasets aim to facilitate the development of accurate Punjabi speech recognition systems, bridging the resource gap for this important language.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10749247
DOI: http://dx.doi.org/10.1016/j.dib.2023.109865

Similar Publications

Recent research has focused extensively on employing Deep Learning (DL) techniques, particularly Convolutional Neural Networks (CNN), for Speech Emotion Recognition (SER). This study addresses the burgeoning interest in leveraging DL for SER, specifically focusing on Punjabi language speakers. The paper presents a novel approach to constructing and preprocessing a labeled speech corpus using diverse social media sources.

Objectives: Complex head and neck defects involving composite defects can be reconstructed using chimeric flaps or multiple flaps with separate anastomoses. Limited comparisons exist between chimeric and multiple flap reconstructions. We compare outcomes between chimeric and multiple flap reconstructions in oral cavity reconstruction.

Prevalence and risk factors associated with multidrug-resistant bacteria in COVID-19 patients.

Medicine (Baltimore)

March 2024

Department of Speech-Language Pathology and Audiology, College of Applied Medical Sciences, University of Ha'il, Hail, Saudi Arabia.

Bacterial coinfection among patients with confirmed coronavirus disease 2019 (COVID-19) is a critical medical concern that increases the disease severity and mortality rate. The current study is aimed at evaluating the effects of bacterial coinfections among COVID-19 patients, especially in relation to degree of severity and mortality. A retrospective study was conducted for patients with positive COVID-19 test, admitted to a regional COVID-19 hospital in Jeddah, Saudi Arabia, between May and August 2020.

Associations of sleep characteristics in late midlife with late-life hearing loss in the Atherosclerosis Risk in Communities-Sleep Heart Health Study (ARIC-SHHS).

Sleep Health

October 2023

Cochlear Center for Hearing and Public Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA; Department of Otolaryngology-Head & Neck Surgery, Johns Hopkins School of M

Objectives: This study investigated associations of late midlife sleep characteristics with late-life hearing, which adds to the existing cross-sectional evidence and is novel in examining polysomnographic sleep measures and central auditory processing.

Methods: A subset of Atherosclerosis Risk in Communities Study participants underwent sleep assessment in the Sleep Heart Health Study in 1996-1998 and hearing assessment in 2016-2017. Peripheral hearing thresholds (0.