TY - JOUR
T1 - Effects of natural variability in cross-modal temporal correlations on audiovisual speech recognition benefit
AU - Lalonde, Kaylah
N1 - Funding Information:
This research was funded by an IDeA CTR pilot grant (NIH-NIGMS, 1 U54 GM115458) and supported by the Technical Core within Boys Town National Research Hospital (NIH-NIGMS P20 GM109023). Seth Bashford, Tim Vallier, and Denis Fitzpatrick contributed to hardware and software development. Jamie Petersen and Nancy He collected behavioral data.
Publisher Copyright:
Copyright © 2019 ISCA
PY - 2019
Y1 - 2019
AB - In audiovisual (AV) speech, correlations over time between visible mouth movements and the amplitude envelope of auditory speech help to reduce uncertainty as to when peaks in the auditory signal will occur. Previous studies demonstrated greater AV benefit to speech detection in noise for sentences with higher cross-modal correlations than for sentences with lower cross-modal correlations. This study examined whether the mechanisms that underlie AV detection benefits have downstream effects on speech recognition in noise. Participants were presented with 72 sentences in noise, in auditory-only and AV conditions, at either their 50% auditory speech recognition threshold in noise (SRT-50) or at a signal-to-noise ratio (SNR) 6 dB poorer than their SRT-50. They were asked to repeat each sentence. Mean AV benefit across subjects was calculated for each sentence. Pearson correlations and mixed modeling were used to examine whether variability in AV benefit across sentences was related to natural variation in the degree of cross-modal correlation across sentences. In the more difficult listening condition, higher cross-modal correlations were associated with higher AV sentence recognition benefit. The relationship was strongest in the 0.8-2.2 kHz and 0.8-6 kHz frequency regions. These results demonstrate that cross-modal correlations contribute to variability in AV speech recognition in noise.
KW - Audiovisual
KW - Multimodal
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=85074731223&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074731223&partnerID=8YFLogxK
U2 - 10.21437/Interspeech.2019-2931
DO - 10.21437/Interspeech.2019-2931
M3 - Conference article
AN - SCOPUS:85074731223
SN - 2308-457X
VL - 2019-September
SP - 2260
EP - 2264
JO - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
JF - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
T2 - 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019
Y2 - 15 September 2019 through 19 September 2019
ER -