Bipolar disorder is one of the most common mood disorders and is characterized by severe, disabling mood swings. Several projects focus on developing decision support systems that monitor and advise both patients and clinicians, and voice monitoring and speech signal analysis can be exploited toward this goal. In this study, an Android application was designed to analyze running speech on a smartphone. The application records audio samples and estimates the speech fundamental frequency, F0, and its changes. F0-related features are computed locally on the smartphone, which offers advantages over remote processing approaches in terms of privacy protection and reduced upload costs; the raw features can then be sent to a central server for further processing. The quality of the audio recordings, the reliability of the algorithm, and the performance of the overall system were evaluated in terms of voiced segment detection and feature estimation. The results show that the mean F0 of each voiced segment can be estimated reliably, describing prosodic variation across the speech sample, whereas features related to F0 variability within each voiced segment performed poorly. A case study of a patient with bipolar disorder is presented.
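The paper does not include its signal processing code, but the on-device pipeline it describes (detect voiced segments, then compute the mean F0 of each segment) can be illustrated with a short sketch. The snippet below uses a plain autocorrelation pitch estimator; the frame length, hop size, thresholds, and function names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of per-segment mean-F0 estimation via autocorrelation.
# NOT the paper's algorithm; all settings below are illustrative assumptions.
import numpy as np

def frame_f0(frame, sr, f0_min=75.0, f0_max=400.0, voicing_thresh=0.3):
    """Return an F0 estimate (Hz) for one frame, or None if it looks unvoiced."""
    frame = frame - np.mean(frame)
    if np.max(np.abs(frame)) < 1e-4:            # near silence -> treat as unvoiced
        return None
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac /= ac[0]                                  # normalize so ac[0] == 1
    lag_min = int(sr / f0_max)
    lag_max = min(int(sr / f0_min), len(ac) - 1)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    if ac[lag] < voicing_thresh:                 # weak periodicity -> unvoiced
        return None
    return sr / lag

def segment_mean_f0(signal, sr, frame_ms=40, hop_ms=10):
    """Group consecutive voiced frames into segments; return the mean F0 of each."""
    flen, hop = int(sr * frame_ms / 1000), int(sr * hop_ms / 1000)
    f0_track = [frame_f0(signal[i:i + flen], sr)
                for i in range(0, len(signal) - flen, hop)]
    segments, current = [], []
    for f0 in f0_track:
        if f0 is not None:
            current.append(f0)
        elif current:
            segments.append(float(np.mean(current)))
            current = []
    if current:
        segments.append(float(np.mean(current)))
    return segments
```

Averaging over a whole voiced segment smooths out frame-level estimation errors, which is one plausible reason why per-segment mean F0 is easier to estimate reliably than within-segment variability features.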
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4701269 | PMC
http://dx.doi.org/10.3390/s151128070 | DOI Listing
J Voice
September 2025
RISE-Health, Alameda Prof. Hernâni Monteiro, 4200-319 Porto, Portugal; Department of Otorhinolaryngology, Centro Hospitalar Universitário de São João, Porto, Portugal; Department of Surgery and Physiology, University of Porto - Faculty of Medicine, Alameda Prof. Hernâni Monteiro, 4200-319 Porto
This paper addresses two intertwined challenges that are key to informing signal processing methods for restoring natural (voiced) speech from whispered speech. The first challenge involves characterizing and modeling the evolution of the harmonic phase/magnitude structure of a sequence of individual pitch periods within a voiced region of natural speech comprising sustained or co-articulated vowels. A novel algorithm for segmenting individual pitch pulses is proposed and then used to obtain illustrative results that highlight important differences between sustained and co-articulated vowels and suggest practical synthetic voicing approaches.
Nat Commun
September 2025
Department of Chemical Engineering, Hanyang University, Seoul, Republic of Korea.
Sensorineural hearing loss is the most common form of deafness, typically resulting from the loss of sensory cells on the basilar membrane, which cannot regenerate and thus lose sensitivity to sound vibrations. Here, we report a reconfigurable piezo-ionotropic polymer membrane engineered for biomimetic sustainable multi-resonance acoustic sensing, offering exceptional sensitivity (530 kPa⁻¹) and broadband frequency discrimination (20 Hz to 3300 Hz) while remaining resistant to "dying". The acoustic sensing capability is driven by an ion hitching-in cage effect intrinsic to the ion gel combined with fluorinated polyurethane.
J Biomed Inform
August 2025
Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Dr, Philadelphia, 19104, PA, USA; Department of Computer and Information Science, University of Pennsylvania, Levine Hall, 3330 Walnut St, Philadelphia, 19104, PA, USA.
Objective: The increasing use of audio-video (AV) data in healthcare has improved patient care, clinical training, and medical and ethnographic research. However, it has also introduced major challenges in preserving patient-provider privacy due to Protected Health Information (PHI) in such data. Traditional de-identification methods are inadequate for AV data, which can reveal identifiable information such as faces, voices, and environmental details.
J Commun Disord
August 2025
College of Social Sciences, Arts, and Humanities, Al-Akhawayn University, Morocco.
This is the first comprehensive study to examine the feasibility of using acoustic measures to characterize coarticulatory dynamics in Arabic speakers with Broca's aphasia, addressing a significant gap in the literature and contributing to both universal and culturally specific understandings of coarticulatory timing in aphasia. Five Palestinian Arabic-speaking participants with Broca's aphasia and five control speakers completed a repetition task involving initial fricative-vowel syllables. The analysis, carried out in PRAAT, incorporates both static and dynamic acoustic parameters, including formant values (F2 and F3), transition slopes and variability, Voice Onset Time (VOT), and intensity measures.
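For readers who want to reproduce measures of this kind, a rough sketch using the parselmouth Python interface to Praat is shown below; the study itself used PRAAT directly, and the file name, the 10 ms sampling grid, and the 50 ms "transition window" are placeholders rather than the authors' settings.

```python
# Illustrative sketch: extract F2/F3 tracks from a recorded syllable and fit a
# crude onset slope. Not the study's procedure; settings are assumptions.
import numpy as np
import parselmouth

snd = parselmouth.Sound("fricative_vowel_syllable.wav")  # placeholder file name
formants = snd.to_formant_burg()                          # Burg formant analysis, default settings

times = np.arange(0.0, snd.duration, 0.01)                # sample every 10 ms
f2 = np.array([formants.get_value_at_time(2, t) for t in times])
f3 = np.array([formants.get_value_at_time(3, t) for t in times])

# Crude "transition slope": linear fit to F2 over an assumed 50 ms onset window.
window = times < 0.05
valid = window & ~np.isnan(f2)
if valid.sum() > 1:
    slope_f2 = np.polyfit(times[valid], f2[valid], 1)[0]  # Hz per second
    print(f"F2 onset slope: {slope_f2:.0f} Hz/s")
```

VOT and the study's variability measures would still require manual or semi-automatic annotation of the burst and voicing onset, which this sketch does not attempt.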
JASA Express Lett
August 2025
English Language and Linguistics, University of Glasgow, Glasgow, G12 8QQ, United Kingdom.
Speech synthesis has improved dramatically over recent years, enabled by large datasets and advances in neural network architectures. Little is known, however, about how synthesised speech patterns are realised from a phonetic perspective. By synthesising speech in two languages with differing implementations of stop voicing, we observe that synthesised speech broadly follows the expected patterns for each language, though it partially diverges for specific segments.