Effects of natural variability in cross-modal temporal correlations on audiovisual speech recognition benefit

Research output: Contribution to journal › Conference article › peer-review

Abstract

In audiovisual (AV) speech, correlations over time between visible mouth movements and the amplitude envelope of auditory speech help to reduce uncertainty as to when peaks in the auditory signal will occur. Previous studies demonstrated greater AV benefit to speech detection in noise for sentences with higher cross-modal correlations than for sentences with lower cross-modal correlations. This study examined whether the mechanisms that underlie AV detection benefits have downstream effects on speech recognition in noise. Participants were presented with 72 sentences in noise, in auditory-only and AV conditions, at either their 50% auditory speech recognition threshold in noise (SRT-50) or at a signal-to-noise ratio (SNR) 6 dB poorer than their SRT-50. They were asked to repeat each sentence. Mean AV benefit across subjects was calculated for each sentence. Pearson correlations and mixed modeling were used to examine whether variability in AV benefit across sentences was related to natural variation in the degree of cross-modal correlation across sentences. In the more difficult listening condition, higher cross-modal correlations were associated with higher AV sentence recognition benefit. The relationship was strongest in the 0.8-2.2 kHz and 0.8-6 kHz frequency regions. These results demonstrate that cross-modal correlations contribute to variability in AV speech recognition in noise.
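As a rough illustration of the sentence-level analysis described in the abstract, the sketch below computes a mean AV benefit per sentence and then a Pearson correlation between that benefit and a per-sentence cross-modal correlation value. It is a minimal sketch under stated assumptions, not the authors' analysis code: AV benefit is assumed here to be the simple difference between AV and auditory-only scores, the function names are hypothetical, and every number is made-up toy data rather than a result from the study. The mixed-modeling step is not shown.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def mean_av_benefit(av_scores, a_scores):
    """Per-sentence AV benefit, averaged across subjects.

    Assumption: benefit = AV score minus auditory-only score. The abstract
    does not state which benefit formula was used.
    """
    return [
        mean([av - a for av, a in zip(av_row, a_row)])
        for av_row, a_row in zip(av_scores, a_scores)
    ]


# Toy data (hypothetical): rows are sentences, columns are subjects,
# values are proportions of words repeated correctly.
av_scores = [[0.5, 0.6, 0.4], [0.7, 0.8, 0.6], [0.6, 0.6, 0.6], [0.9, 0.8, 1.0]]
a_scores = [[0.4, 0.5, 0.3], [0.4, 0.5, 0.3], [0.4, 0.4, 0.4], [0.4, 0.3, 0.5]]

# Hypothetical per-sentence correlation between mouth movement and the
# acoustic amplitude envelope in one frequency region.
cross_modal_corr = [0.2, 0.5, 0.3, 0.6]

benefit = mean_av_benefit(av_scores, a_scores)
r = pearson_r(cross_modal_corr, benefit)
print(f"per-sentence AV benefit: {benefit}")
print(f"Pearson r between cross-modal correlation and AV benefit: {r:.3f}")
```

In a full analysis the same calculation would be repeated for each listening condition and each frequency region, which is how a relationship could emerge in one condition or band and not in another.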

Keywords

  • Audiovisual
  • Multimodal
  • Speech recognition

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
