Category Ranking: 98%
Total Visits: 921
Avg Visit Duration: 2 minutes
Citations: 20

Article Abstract

Human speech perception is multisensory, integrating auditory information from the talker's voice with visual information from the talker's face. BOLD fMRI studies have implicated the superior temporal gyrus (STG) in processing auditory speech and the superior temporal sulcus (STS) in integrating auditory and visual speech, but as an indirect hemodynamic measure, fMRI is limited in its ability to track the rapid neural computations underlying speech perception. Using stereoelectroencephalography (sEEG) electrodes, we directly recorded from the STG and STS in 42 epilepsy patients (25 F, 17 M). Participants identified single words presented in auditory, visual, and audiovisual formats, with and without added auditory noise. Seeing the talker's face provided a strong perceptual benefit, improving perception of noisy speech in every participant. Neurally, a subpopulation of electrodes concentrated in mid-posterior STG and STS responded to both auditory speech (latency 71 ms) and visual speech (109 ms). Significant multisensory enhancement was observed, especially in the upper bank of the STS: compared with auditory-only speech, the response latency for audiovisual speech was 40% faster and the response amplitude was 18% larger. In contrast, the STG showed neither faster nor larger multisensory responses. Surprisingly, STS response latencies for audiovisual speech were significantly faster than those in the STG (50 ms vs. 64 ms), suggesting a parallel pathway model in which the STG plays the primary role in auditory-only speech perception, while the STS takes the lead in audiovisual speech perception. Together with fMRI, sEEG provides converging evidence that the STS plays a key role in multisensory integration.

One of the most important functions of the human brain is to communicate with others. During conversation, humans take advantage of visual information from the face of the talker as well as auditory information from the voice of the talker. We directly recorded activity from the brains of epilepsy patients implanted with electrodes in the superior temporal sulcus (STS), a key brain region for speech perception. These recordings showed that hearing the voice and seeing the face of the talker evoked larger and faster neural responses in the STS than the talker's voice alone. Multisensory enhancement in the STS may be the neural basis for our ability to better understand noisy speech when we can see the face of the talker.

Source
http://dx.doi.org/10.1523/JNEUROSCI.1037-25.2025

Publication Analysis

Top Keywords

speech perception (24)
superior temporal (16)
audiovisual speech (16)
speech (15)
temporal sulcus (12)
face talker (12)
sts (10)
multisensory integration (8)
integrating auditory (8)
talker's voice (8)

Similar Publications

A growing literature explores the representational detail of infants' early lexical representations, but no study has investigated how exposure to real-life acoustic-phonetic variation impacts these representations. Indeed, previous experimental work with young infants has largely ignored the impact of accent exposure on lexical development. We ask how routine exposure to accent variation affects 6-month-olds' ability to detect mispronunciations.


Neural entrainment by speech in human auditory cortex revealed by intracranial recordings.

Prog Neurobiol

September 2025

The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, United States; Elmezzi Graduate School of Molecular Medicine at Northwell Health, Manhasset, NY, United States; Department of Neurosurgery, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY, United States; Tr

Humans live in an environment that contains rich auditory stimuli, which must be processed efficiently. The entrainment of neural oscillations to acoustic inputs may support the processing of simple and complex sounds. However, the characteristics of this entrainment process have been shown to be inconsistent across species and experimental paradigms.

The tracking umbrella: Diverse interpretations under a common neural term.

Ann N Y Acad Sci

September 2025

BCBL, Basque Center on Cognition, Brain and Language, Donostia, Spain.

Neural tracking, the alignment of brain activity with the temporal dynamics of sensory input, is a crucial mechanism underlying perception, attention, and cognition. While this concept has gained prominence in research on speech, music, and visual processing, its definition and methodological approaches remain heterogeneous. This paper critically examines neural tracking from both theoretical and methodological perspectives, highlighting how its interpretation varies across studies.

The human auditory system must distinguish relevant sounds from noise. Severe hearing loss can be treated with cochlear implants (CIs), but how the brain adapts to electrical hearing remains unclear. This study examined adaptation to unilateral CI use in the first and seventh months after CI activation using speech comprehension measures and electroencephalography recordings, both during passive listening and an active spatial listening task.
