Real and synthetic Punjabi speech datasets for automatic speech recognition.

Satwinder Singh, Feng Hou, Ruili Wang

Data Brief

School of Mathematical and Computational Sciences, Massey University, Auckland, New Zealand.

Published: February 2024

Automatic speech recognition (ASR) has been an active area of research. Training with large annotated datasets is the key to the development of robust ASR systems. However, most available datasets focus on high-resource languages like English, leaving a significant gap for low-resource languages. Punjabi is one such language: despite its large number of speakers, it lacks high-quality annotated datasets for accurate speech recognition. To address this gap, we introduce three labeled Punjabi speech datasets: Punjabi Speech (a real speech dataset) and Google-synth and CMU-synth (synthesized speech datasets). The Punjabi Speech dataset consists of read speech recordings captured in various environments, including both studio and open settings. The Google-synth dataset is synthesized using Google's Punjabi text-to-speech cloud services. The CMU-synth dataset is created using the Clustergen model available in the Festival speech synthesis system developed by CMU. These datasets aim to facilitate the development of accurate Punjabi speech recognition systems, bridging the resource gap for this important language.

Download full-text PDF	Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10749247	PMC
http://dx.doi.org/10.1016/j.dib.2023.109865	DOI Listing

Publication Analysis

Top Keywords

punjabi speech

speech recognition

speech

speech datasets

punjabi

automatic speech

annotated datasets

datasets punjabi

speech dataset

datasets

Similar Publications

Emotion recognition for human-computer interaction using high-level descriptors.

Sci Rep

May 2024

Faculty of Electrical and Computer Engineering, Bahir Dar University, Bahir Dar, Ethiopia.

Chaitanya Singla, Sukhdev Singh, Preeti Sharma, Nitin Mittal, Fikreselam Gared

Recent research has focused extensively on employing Deep Learning (DL) techniques, particularly Convolutional Neural Networks (CNN), for Speech Emotion Recognition (SER). This study addresses the burgeoning interest in leveraging DL for SER, specifically focusing on Punjabi language speakers. The paper presents a novel approach to constructing and preprocessing a labeled speech corpus using diverse social media sources.

Chimeric versus Multiple Flaps for Composite Oral Cavity Defects: A Systematic Review and Meta-Analysis.

Laryngoscope

October 2024

Division of Plastic and Reconstructive Surgery, Fox Chase Cancer Center, Philadelphia, Pennsylvania, U.S.A.

Ayesha Punjabi, Sthefano Araya, Grace Amadio, Theresa Webster, Sudeep Mutyala

Objectives: Complex head and neck defects involving composite defects can be reconstructed using chimeric flaps or multiple flaps with separate anastomoses. Limited comparisons exist between chimeric and multiple flap reconstructions. We compare outcomes between chimeric and multiple flap reconstructions in oral cavity reconstruction.

Prevalence and risk factors associated with multidrug-resistant bacteria in COVID-19 patients.

Medicine (Baltimore)

March 2024

Department of Speech-Language Pathology and Audiology, College of Applied Medical Sciences, University of Ha'il, Hail, Saudi Arabia.

Abdu Aldarhami, Ahmed A Punjabi, Abdulrahman S Bazaid, Naif K Binsaleh, Omar W Althomali

Bacterial coinfection among patients with confirmed coronavirus disease 2019 (COVID-19) is a critical medical concern that increases disease severity and mortality rate. The current study aims to evaluate the effects of bacterial coinfections among COVID-19 patients, especially in relation to degree of severity and mortality. A retrospective study was conducted for patients with a positive COVID-19 test who were admitted to a regional COVID-19 hospital in Jeddah, Saudi Arabia, between May and August 2020.

Associations of sleep characteristics in late midlife with late-life hearing loss in the Atherosclerosis Risk in Communities-Sleep Heart Health Study (ARIC-SHHS).

Sleep Health

October 2023

Cochlear Center for Hearing and Public Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA; Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA; Department of Otolaryngology-Head & Neck Surgery, Johns Hopkins School of M

Kening Jiang, Adam P Spira, Rebecca F Gottesman, Kelsie M Full, Frank R Lin

Objectives: This study investigated associations of late midlife sleep characteristics with late-life hearing, which adds to the existing cross-sectional evidence and is novel in examining polysomnographic sleep measures and central auditory processing.

Methods: A subset of Atherosclerosis Risk in Communities Study participants underwent sleep assessment in the Sleep Heart Health Study in 1996-1998 and hearing assessment in 2016-2017. Peripheral hearing thresholds (0.
