Across-speaker articulatory normalization for speaker-independent silent speech recognition

Jun Wang, Ashok Samal, Jordan R. Green

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

Silent speech interfaces (SSIs), which recognize speech from articulatory information (i.e., without using audio information), have the potential to enable persons with laryngectomy or a neurological disease to produce synthesized speech with a natural-sounding voice using their tongue and lips. Current approaches to SSIs have largely relied on speaker-dependent recognition models to minimize the negative effects of talker variation on recognition accuracy. Speaker-independent approaches are needed to reduce the large amount of training data required from each user; only limited articulatory samples are often available for persons with moderate to severe speech impairments, due to the logistic difficulty of data collection. This paper reports an across-speaker articulatory normalization approach based on Procrustes matching, a bidimensional regression technique for removing translational, scaling, and rotational effects of spatial data. A dataset of short functional sentences was collected from seven English talkers. A support vector machine was then trained to classify sentences based on normalized tongue and lip movements. Speaker-independent classification accuracy (tested using leave-one-subject-out cross-validation) improved significantly, from 68.63% to 95.90%, following normalization. These results support the feasibility of a speaker-independent SSI using Procrustes matching as the basis for articulatory normalization across speakers.
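The abstract describes Procrustes matching only in general terms, so the following is a minimal sketch of the idea and not the authors' implementation. It assumes each articulatory shape is a list of 2D (x, y) points in a fixed order and that shapes are aligned to a chosen reference shape; the function name `procrustes_align`, the use of a reference shape, and the unit-size scaling convention are all illustrative assumptions. Points are treated as complex numbers so that the best-fitting rotation has a closed form.

```python
import math


def procrustes_align(shape, reference):
    """Align a 2D shape to a reference, removing translation, scale, and rotation.

    Hypothetical helper for illustration. Both arguments are equal-length
    lists of (x, y) points in corresponding order. Returns a list of (x, y).
    """
    if len(shape) != len(reference):
        raise ValueError("shape and reference must have the same number of points")

    def center_and_scale(points):
        z = [complex(x, y) for x, y in points]
        centroid = sum(z) / len(z)
        z = [p - centroid for p in z]  # remove translation
        size = math.sqrt(sum(abs(p) ** 2 for p in z))
        if size == 0:
            raise ValueError("degenerate shape: all points coincide")
        return [p / size for p in z]  # remove scale (unit size)

    w = center_and_scale(shape)
    ref = center_and_scale(reference)

    # Least-squares rotation: the unit complex number a that minimizes
    # sum(|ref_k - a * w_k|^2) is sum(ref_k * conj(w_k)), normalized.
    s = sum(r * p.conjugate() for r, p in zip(ref, w))
    a = s / abs(s) if abs(s) > 0 else complex(1, 0)

    return [((a * p).real, (a * p).imag) for p in w]
```

Under these assumptions, two recordings of the same shape that differ only in position, size, and orientation map to the same normalized coordinates, which is the property a speaker-independent classifier would rely on. Reflection is deliberately not removed here, since the abstract lists only translational, scaling, and rotational effects.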

Original language: English (US)
Title of host publication: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publisher: International Speech and Communication Association
Pages: 1179-1183
Number of pages: 5
State: Published - 2014
Event: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: Sep 14 2014 to Sep 18 2014

Other

Other: 15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014
Country: Singapore
City: Singapore
Period: 9/14/14 to 9/18/14

Keywords

  • Procrustes analysis
  • Silent speech recognition
  • Speech kinematics
  • Support vector machine

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation

Cite this

Wang, J., Samal, A., & Green, J. R. (2014). Across-speaker articulatory normalization for speaker-independent silent speech recognition. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 1179-1183). International Speech and Communication Association.