Identifying host-specific amino acid signatures for influenza A viruses using an adjusted entropy measure

Yixiang Zhang, Kent M. Eskridge, Shunpu Zhang, Guoqing Lu

Research output: Contribution to journal › Article › peer-review


Background: Influenza A viruses (IAV) are highly mutable, have great zoonotic potential to infect both avian and mammalian hosts, and have been responsible for a number of pandemics. A key computational issue in influenza prevention and control is the identification of molecular signatures with cross-species transmission potential. We propose an adjusted entropy-based method for identifying host-specific signatures that uses a similarity coefficient to incorporate amino acid substitution information and improve identification performance. Mutations in the polymerase genes (e.g., PB2) are known to play a major role in the adaptation of avian influenza viruses to mammalian hosts. We therefore focus on the analysis of PB2 protein sequences and identify host-specific PB2 amino acid signatures.

Results: Validation with a set of H5N1 PB2 sequences from 1996 to 2006 gives a 40% false negative discovery rate for adjusted entropy, compared with a 60% false negative rate for unadjusted entropy. In simulations across different levels of sequence divergence, the false negative rate of adjusted entropy was never higher than 10%, whereas that of unadjusted entropy ranged from 9% to 100%. In addition, at all levels of divergence, the false positive rate of adjusted entropy was never higher than 9%. Adjusted entropy also identifies important H1N1pdm PB2 mutations, previously reported in the literature, that explain changes in divergence between 2008 and 2009 and that unadjusted entropy could not identify.

Conclusions: Based on these results, adjusted entropy provides a reliable and widely applicable host signature identification approach that is useful for IAV monitoring and vaccine development.
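As a rough illustration of the unadjusted baseline mentioned in the abstract, the sketch below computes per-site Shannon entropy for two groups of aligned sequences and flags sites that are conserved within each host group but differ in consensus residue between them. It is not the authors' method: the similarity-coefficient adjustment is not specified in this record and is deliberately left out. The function names, the 0.5-bit threshold, and the toy three-residue alignments are all hypothetical choices made for the example.

```python
from collections import Counter
from math import log2


def column_entropy(residues):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    counts = Counter(r for r in residues if r != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * log2(total / n) for n in counts.values())


def consensus(residues):
    """Most frequent non-gap residue in a column, or None if all gaps."""
    counts = Counter(r for r in residues if r != "-")
    return counts.most_common(1)[0][0] if counts else None


def candidate_sites(group_a, group_b, max_entropy=0.5):
    """Return 1-based positions that are conserved within each group
    (entropy <= max_entropy, an arbitrary illustrative threshold) but
    whose consensus residues differ between the two groups."""
    hits = []
    columns = zip(zip(*group_a), zip(*group_b))
    for pos, (col_a, col_b) in enumerate(columns, start=1):
        conserved = (column_entropy(col_a) <= max_entropy
                     and column_entropy(col_b) <= max_entropy)
        if conserved and consensus(col_a) != consensus(col_b):
            hits.append(pos)
    return hits


# Made-up toy alignments, not real PB2 data.
avian = ["MEK", "MEK", "MEK", "MER"]
human = ["MKK", "MKK", "MKR", "MKK"]

print(candidate_sites(avian, human))
```

In this toy input only the second position is flagged: it is fixed as E in one group and K in the other, while the third position is too variable within each group to pass the threshold. Plain entropy treats every residue change as equally different, which is the limitation the abstract's similarity coefficient is described as addressing.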

Original language: English (US)
Article number: 333
Journal: BMC Bioinformatics
Issue number: 1
State: Published - Dec 2022


Keywords

  • Adjusted entropy
  • Amino acid signatures
  • Host specificity
  • Influenza A virus

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Applied Mathematics

