Predicting Asthma Prevalence by Linking Social Media Data and Traditional Surveys

Hongying Dai, Brian R. Lee, Jianqiang Hao

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Asthma is one of the most common chronic diseases and has a profound impact on people’s well-being and on society. In this study, we link multiple large-scale data sources to construct an epidemiological model that predicts asthma prevalence across geographic regions. We use (1) Social Media Monitoring (SMM) data from Twitter (N = 500 million tweets/day), (2) the 2014 Behavioral Risk Factor Surveillance System (BRFSS) (N = 464,664), and (3) the 2014 American Community Survey (ACS) conducted by the U.S. Census Bureau (N = 3.5 million per year). We predict asthma prevalence as measured by the traditional survey (BRFSS) using social media information collected from Twitter and socioeconomic factors collected from the ACS. The evidence suggests that monitoring asthma-related tweets may provide real-time information that can be used to predict outcomes from traditional surveys.

Original language: English (US)
Pages (from-to): 75-92
Number of pages: 18
Journal: Annals of the American Academy of Political and Social Science
Issue number: 1
State: Published - Jan 1 2017
Externally published: Yes


Keywords

  • ACS
  • SMM
  • asthma
  • data linkage
  • social media monitoring

ASJC Scopus subject areas

  • Sociology and Political Science
  • General Social Sciences

