An optimal set of flesh points on tongue and lips for speech-movement classification

Jun Wang, Ashok Samal, Panying Rong, Jordan R. Green

Research output: Contribution to journal › Article › peer-review

33 Scopus citations

Abstract

Purpose: The authors sought to determine an optimal set of flesh points on the tongue and lips for classifying speech movements. Method: The authors used electromagnetic articulographs (Carstens AG500 and NDI Wave) to record tongue and lip movements from 13 healthy talkers who articulated 8 vowels, 11 consonants, a phonetically balanced set of words, and a set of short phrases. The authors used a machine-learning classifier (support vector machine) to classify the speech stimuli on the basis of articulatory movements. They then compared the classification accuracies of the flesh-point combinations to determine an optimal set of sensors. Results: When data from the 4 sensors (T1: the vicinity between the tongue tip and tongue blade; T4: the tongue-body back; UL: the upper lip; and LL: the lower lip) were combined, phoneme and word classifications were most accurate and were comparable with the full set (including T2: the tongue-body front; and T3: the tongue-body front). Conclusion: The authors identified a 4-sensor set (T1, T4, UL, and LL) that yielded a classification accuracy (91%–95%) equivalent to that using all 6 sensors. These findings provide an empirical basis for selecting sensors and their locations for scientific and emerging clinical applications that incorporate articulatory movements.
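The comparison of flesh-point combinations described in the Method can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify the features, the SVM configuration, or the validation scheme. The sketch therefore uses synthetic data (one hypothetical feature per sensor) and a nearest-centroid classifier as a plainly named stand-in for the support vector machine. Only the six sensor labels come from the abstract; the names `sensor_subsets`, `make_synthetic_data`, `nearest_centroid_accuracy`, and `rank_subsets` are assumptions made for illustration.

```python
import itertools
import random

SENSORS = ["T1", "T2", "T3", "T4", "UL", "LL"]  # sensor labels from the abstract


def sensor_subsets(sensors):
    """Yield every non-empty combination of sensors, smallest sets first."""
    for size in range(1, len(sensors) + 1):
        yield from itertools.combinations(sensors, size)


def make_synthetic_data(n_classes=3, n_per_class=20, seed=0):
    """Hypothetical data: one feature per sensor; only T1 and LL carry class information."""
    rng = random.Random(seed)
    samples = []
    for label in range(n_classes):
        for _ in range(n_per_class):
            features = {s: rng.gauss(0.0, 1.0) for s in SENSORS}
            features["T1"] += 4.0 * label  # assumed class-dependent shift
            features["LL"] += 4.0 * label  # assumed class-dependent shift
            samples.append((features, label))
    rng.shuffle(samples)
    return samples


def nearest_centroid_accuracy(train, test, subset):
    """Stand-in classifier (NOT an SVM): assign each test sample to the nearest class centroid."""
    sums, counts = {}, {}
    for features, label in train:
        vec = sums.setdefault(label, [0.0] * len(subset))
        for i, sensor in enumerate(subset):
            vec[i] += features[sensor]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [v / counts[lab] for v in vec] for lab, vec in sums.items()}

    def squared_distance(point, centroid):
        return sum((p - c) ** 2 for p, c in zip(point, centroid))

    correct = 0
    for features, label in test:
        point = [features[sensor] for sensor in subset]
        predicted = min(centroids, key=lambda lab: squared_distance(point, centroids[lab]))
        correct += predicted == label
    return correct / len(test)


def rank_subsets(samples, train_fraction=0.5):
    """Score every sensor subset on a single train/test split; best accuracy first, smaller sets break ties."""
    split = int(len(samples) * train_fraction)
    train, test = samples[:split], samples[split:]
    scores = {
        subset: nearest_centroid_accuracy(train, test, subset)
        for subset in sensor_subsets(SENSORS)
    }
    return sorted(scores.items(), key=lambda item: (-item[1], len(item[0])))


if __name__ == "__main__":
    for subset, accuracy in rank_subsets(make_synthetic_data())[:5]:
        print(subset, round(accuracy, 2))
```

With 6 sensors there are 63 non-empty combinations to score, so an exhaustive comparison of this kind is cheap. Breaking ties in favour of smaller sets mirrors the abstract's goal of finding a reduced sensor set whose accuracy matches that of the full set.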

Original language: English (US)
Pages (from-to): 15-26
Number of pages: 12
Journal: Journal of Speech, Language, and Hearing Research
Volume: 59
Issue number: 1
State: Published - Feb 2016

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Speech and Hearing
