Citations

20

Article Abstract

The proliferation of scientific podcasts has generated an extensive repository of educational content, rich in specialized terminology, diverse topics, and expert dialogues. Here, we introduce a computational framework designed to enhance large language models by leveraging this informational content from publicly accessible audio podcasts across science, technology, engineering, mathematics, and medicine (STEMM). This dataset, comprising over 3700 hours of audio content, was transcribed to generate over 42 million text tokens. Our model, PodGPT, integrates this wealth of complex dialogue found in audio podcasts to improve understanding of natural language nuances, cultural contexts, as well as scientific and medical knowledge. PodGPT also employs retrieval augmented generation (RAG) on a vector database, providing real-time access to emerging scientific literature. Evaluated on multiple benchmarks, PodGPT demonstrated an average improvement of 1.82 percentage points over standard open-source benchmarks and 2.43 percentage points when augmented with evidence from the RAG pipeline. Moreover, it showcased an average improvement of 1.18 percentage points in its zero-shot multilingual transfer ability, effectively generalizing to different linguistic contexts. By harnessing the untapped potential of podcast content, PodGPT advances natural language processing and conversational AI, offering enhanced capabilities for STEMM research and education.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC12234354
DOI: http://dx.doi.org/10.1038/s44385-025-00022-0

Publication Analysis

Top Keywords

percentage points: 12
large language: 8
audio podcasts: 8
natural language: 8
average improvement: 8
podgpt: 5
podgpt audio-augmented: 4
audio-augmented large: 4
language: 4
language model: 4

Similar Publications

Ectopic pregnancy epidemiology from 1990 to 2021: A global burden of disease (GBD) analysis of 204 countries and territories.

Eur J Obstet Gynecol Reprod Biol

September 2025

Department of Obstetrics and Gynecology, Zhongda Hospital, Southeast University, Nanjing, Jiangsu, China.

Background: Ectopic pregnancy (EP) represents a leading cause of maternal mortality in early gestation and a significant contributor to future reproductive impairment. Comprehensive understanding of global EP epidemiological patterns and their temporal evolution is crucial for developing holistic strategies to promote health equity and optimize allocation of medical resources worldwide.

Methods: Leveraging Global Burden of Disease (GBD) 2021 data, this investigation systematically examined age-standardized rates (ASRs) of EP incidence, prevalence, mortality, and disability-adjusted life years (DALYs) across 204 countries and 21 regions from 1990 to 2021.

This study aimed to measure the absolute and relative differences in the recommended practice of leisure-time physical activity (LTPA) of Brazilian men and women between 2010 and 2019. The sample consisted of 512,968 subjects from ten cross-sectional telephone surveys carried out in the 27 Brazilian capitals. The gap in the prevalence of LTPA practice between genders was calculated by measures of absolute inequality, calculated in percentage points, and relative inequality, calculated by the adjusted prevalence ratio (PR), with a trend analyzed by the Joinpoint regression method, obtaining the annual percentage change (APC).

Antimicrobial resistance is largely driven by overuse of antibiotics, which is particularly common in low- and middle-income countries. We combine provider knowledge assessments and over 2000 anonymous standardized patient visits to providers in India to examine why they overprescribe antibiotics for pediatric diarrhea and figure out how to reduce overprescribing. Seventy percent of providers prescribed antibiotics without indication of bacterial infection.

Objectives: In this study, we examine the dynamics of birthing women relative to other family members in making caregiving decisions about postpartum maternal and infant care in four states in India. Specifically, we investigate the involvement of the father, maternal grandmother, and paternal grandmother of the newborn in household health decision-making.

Methods: We analyze data from 551 dyads of women with infants under six months and the family caregiver identified as providing the primary support in the postpartum period.

Importance: Multiparametric magnetic resonance imaging (MRI), with or without prostate biopsy, has become the standard of care for diagnosing clinically significant prostate cancer. Resource capacity limits widespread adoption. Biparametric MRI, which omits the gadolinium contrast sequence, is a shorter and cheaper alternative offering time-saving capacity gains for health systems globally.