A bag-of-words feature engineering approach for assessing health conditions using accelerometer data

Elham Rastegari, Hesham Ali

Research output: Contribution to journal > Article > peer-review

11 Scopus citations


Widespread use of wearable technologies offers great opportunities to monitor human movement objectively, seamlessly, and remotely over time. This, in turn, may provide data critical for next-generation healthcare decision-making and help physicians and healthcare providers diagnose health hazards at an early stage. Parkinson's disease is a neurodegenerative disorder that affects gait and movement. Patients with Parkinson's disease suffer from a variety of motor function impairments, and their gait differs from that of their healthy age-matched counterparts. In this work, we introduce and develop a bag-of-words feature engineering approach for analyzing movement data and differentiating between patients with pathology and healthy individuals. In the first step, each time series is divided into subsequences using an overlapping sliding window. In the second step, similar patterns in the collected data are identified and a vocabulary is generated from them. In the third step, word features are calculated based on the similarity of each subsequence to the vocabulary's patterns. We evaluated this method and compared it with a statistical feature representation method for discriminating between patients with Parkinson's disease and their healthy age-matched counterparts. Our results show that the bag-of-words approach outperforms the epoch-based statistical approach. The proposed method combined with decision tree classifiers provides the highest accuracy, precision, and recall among the methods tested. Although we evaluated the bag-of-words method for Parkinson's disease, it can be extended to the diagnosis and prognosis of other health conditions.
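The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how similar patterns are identified or how similarity is measured, so the sketch assumes a plain k-means clustering for the vocabulary and Euclidean nearest-word counts for the word features. The function names (`sliding_windows`, `build_vocabulary`, `word_features`), the window settings, and the synthetic signals are all hypothetical.

```python
import numpy as np


def sliding_windows(series, width, step):
    """Step 1: split a 1-D series into subsequences; step < width gives overlap."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - width + 1, step)
    return np.array([series[s:s + width] for s in starts])


def build_vocabulary(subsequences, n_words, n_iter=20, seed=0):
    """Step 2: group similar subsequences; each centroid serves as one 'word'.

    Assumption: plain k-means is used here only for illustration.
    """
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(subsequences), size=n_words, replace=False)
    centroids = subsequences[picks]  # fancy indexing returns a copy
    for _ in range(n_iter):
        dists = np.linalg.norm(
            subsequences[:, None, :] - centroids[None, :, :], axis=2
        )
        labels = dists.argmin(axis=1)
        for k in range(n_words):
            members = subsequences[labels == k]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[k] = members.mean(axis=0)
    return centroids


def word_features(series, vocabulary, width, step):
    """Step 3: normalized histogram of the nearest word for each subsequence."""
    subs = sliding_windows(series, width, step)
    dists = np.linalg.norm(subs[:, None, :] - vocabulary[None, :, :], axis=2)
    counts = np.bincount(dists.argmin(axis=1), minlength=len(vocabulary))
    return counts / counts.sum()


# Synthetic stand-ins for two accelerometer recordings (made-up data).
rng = np.random.default_rng(1)
t = np.arange(500)
recordings = [
    np.sin(2 * np.pi * t / 50) + 0.1 * rng.standard_normal(500),
    np.sin(2 * np.pi * t / 80) + 0.3 * rng.standard_normal(500),
]

width, step, n_words = 40, 20, 5  # illustrative settings, not from the paper
all_subs = np.vstack([sliding_windows(r, width, step) for r in recordings])
vocab = build_vocabulary(all_subs, n_words)
features = np.array([word_features(r, vocab, width, step) for r in recordings])
```

Each row of `features` is a fixed-length vector for one recording, regardless of recording length, which is what allows a standard classifier such as a decision tree to be trained on it.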

Original language: English (US)
Article number: 100116
Journal: Smart Health
State: Published - May 2020


  • Bag of words
  • Feature engineering
  • Gait
  • Healthcare
  • Machine learning
  • Parkinson's disease
  • Wearable

ASJC Scopus subject areas

  • Medicine (miscellaneous)
  • Information Systems
  • Health Informatics
  • Computer Science Applications
  • Health Information Management

