Using topic modeling to develop multi-level descriptions of naturalistic driving data from drivers with and without sleep apnea

Elease J. McLaurin, John D. Lee, Anthony D. McDonald, Nazan Aksan, Jeffrey Dawson, Jon Tippin, Matthew Rizzo

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


One challenge in using naturalistic driving data is producing a holistic analysis of these highly variable datasets. Typical analyses focus on isolated events, such as large g-force accelerations indicating a possible near-crash. Examining isolated events is ill-suited to identifying patterns in continuous activities such as maintaining vehicle control. We present an alternative approach that converts driving data into a text representation and uses topic modeling to identify patterns across the dataset. This approach enables the discovery of non-linear patterns, reduces the dimensionality of the data, and captures subtle variations in driver behavior. In this study, topic models were used to concisely describe patterns in trips from drivers with and without untreated obstructive sleep apnea (OSA). The analysis included 5000 trips (50 trips from each of 100 drivers: 66 drivers with OSA and 34 comparison drivers). Trips were treated as documents, and speed and acceleration data from the trips were converted to “driving words.” The identified patterns, called topics, were determined based on regularities in the co-occurrence of the driving words within the trips. This representation was used in random forest models to predict the driver condition (i.e., OSA or comparison) for each trip. Models with 10, 15 and 20 topics had better accuracy in predicting the driver condition, with a maximum AUC of 0.73 for a model with 20 topics. Trips from drivers with OSA were more likely to be defined by topics for smaller lateral accelerations at low speeds. The results demonstrate that topic modeling is a useful tool for extracting meaningful information from naturalistic driving datasets.
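To make the "trips as documents" idea concrete, the following is a minimal sketch of how speed and acceleration samples might be turned into driving words and then into a document-term matrix. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not give the discretization, so the bin edges, the units (m/s and g), the token format, and the names `to_driving_word` and `trip_to_document` are all hypothetical, and the sample values are made up.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical bin edges; the abstract does not state the actual discretization.
SPEED_EDGES_MPS = [5.0, 15.0, 25.0]            # 4 speed levels (assumed m/s)
LAT_ACCEL_EDGES_G = [-0.1, -0.02, 0.02, 0.1]   # 5 lateral-acceleration levels (assumed g)


def to_driving_word(speed_mps, lat_accel_g):
    """Map one (speed, lateral acceleration) sample to a discrete token."""
    speed_level = bisect_right(SPEED_EDGES_MPS, speed_mps)
    accel_level = bisect_right(LAT_ACCEL_EDGES_G, lat_accel_g)
    return f"s{speed_level}_a{accel_level}"


def trip_to_document(samples):
    """Turn a trip (a sequence of samples) into a bag of driving words."""
    return Counter(to_driving_word(speed, accel) for speed, accel in samples)


# Two toy trips with made-up samples, for illustration only.
trips = {
    "trip_1": [(3.0, 0.01), (4.0, 0.00), (12.0, 0.05)],
    "trip_2": [(20.0, -0.15), (27.0, 0.01)],
}

documents = {name: trip_to_document(samples) for name, samples in trips.items()}
vocabulary = sorted({word for doc in documents.values() for word in doc})

# Document-term matrix: one row per trip, one column per driving word.
doc_term = [[documents[name][word] for word in vocabulary] for name in trips]
```

A matrix like `doc_term` is the usual input to a topic model (for example, a latent Dirichlet allocation implementation). The per-trip topic proportions it produces could then serve as features for a classifier such as a random forest, which is the general shape of the pipeline the abstract describes.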

Original language: English (US)
Pages (from-to): 25-38
Number of pages: 14
Journal: Transportation Research Part F: Traffic Psychology and Behaviour
State: Published - Oct 2018


Keywords

  • Driver behavior
  • Drowsiness
  • Machine learning
  • Naturalistic driving data
  • Sleep apnea
  • Topic modeling

ASJC Scopus subject areas

  • Civil and Structural Engineering
  • Automotive Engineering
  • Transportation
  • Applied Psychology

