The effect of target/masker fundamental frequency contour similarity on masked-speech recognition

Lauren Calandruccio, Peter A. Wasiuk, Emily Buss, Lori J. Leibold, Jessica Kong, Ann Holmes, Jacob Oleson

Research output: Contribution to journalArticlepeer-review

1 Scopus citations

Abstract

Greater informational masking is observed when the target and masker speech are more perceptually similar. Fundamental frequency (f0) contour, or the dynamic movement of f0, is thought to provide cues for segregating target speech presented in a speech masker. Most of the data demonstrating this effect have been collected using digitally modified stimuli. Less work has been done exploring the role of f0 contour for speech-in-speech recognition when all of the stimuli have been produced naturally. The goal of this project was to explore the importance of target and masker f0 contour similarity by manipulating the speaking style of talkers producing the target and masker speech streams. Sentence recognition thresholds were evaluated for target and masker speech that was produced with either flat, normal, or exaggerated speaking styles; performance was also measured in speech spectrum shaped noise and for conditions in which the stimuli were processed through an ideal-binary mask. Results confirmed that similarities in f0 contour depth elevated speech-in-speech recognition thresholds; however, when the target and masker had similar contour depths, targets with normal f0 contours were more resistant to masking than targets with flat or exaggerated contours. Differences in energetic masking across stimuli cannot account for these results.

Original languageEnglish (US)
Pages (from-to)1065-1076
Number of pages12
JournalJournal of the Acoustical Society of America
Volume146
Issue number2
DOIs
StatePublished - Aug 1 2019

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Fingerprint Dive into the research topics of 'The effect of target/masker fundamental frequency contour similarity on masked-speech recognition'. Together they form a unique fingerprint.

Cite this