Comparative genomic analyses of seventeen Streptococcus pneumoniae strains: Insights into the pneumococcal supragenome

N. Luisa Hiller, Benjamin Janto, Justin S. Hogg, Robert Boissy, Susan Yu, Evan Powell, Randy Keefe, Nathan E. Ehrlich, Kai Shen, Jay Hayes, Karen Barbadora, William Klimke, Dmitry Dernovoy, Tatiana Tatusova, Julian Parkhill, Stephen D. Bentley, J. Christopher Post, Garth D. Ehrlich, Fen Z. Hu

Research output: Contribution to journal › Article

207 Scopus citations

Abstract

The distributed-genome hypothesis (DGH) states that pathogenic bacteria possess a supragenome that is much larger than the genome of any single bacterium and that these pathogens utilize genetic recombination and a large, noncore set of genes as a means of diversity generation. We sequenced the genomes of eight nasopharyngeal strains of Streptococcus pneumoniae isolated from pediatric patients with upper respiratory symptoms and performed quantitative genomic analyses among these and nine publicly available pneumococcal strains. Coding sequences from all strains were grouped into 3,170 orthologous gene clusters, of which 1,454 (46%) were conserved among all 17 strains. The majority of the gene clusters, 1,716 (54%), were not found in all strains. Genic differences per strain pair ranged from 35 to 629 orthologous clusters, with each strain's genome containing between 21 and 32% noncore genes. The distribution of the orthologous clusters per genome for the 17 strains was entered into the finite-supragenome model, which predicted that (i) the S. pneumoniae supragenome contains more than 5,000 orthologous clusters and (ii) 99% of the orthologous clusters (∼3,000) that are represented in the S. pneumoniae population at frequencies of ≥0.1 can be identified if 33 representative genomes are sequenced. These extensive genic diversity data support the DGH and provide a basis for understanding the great differences in clinical phenotype associated with various pneumococcal strains. When these findings are taken together with previous studies that demonstrated the presence of a supragenome for Streptococcus agalactiae and Haemophilus influenzae, it appears that the possession of a distributed genome is a common host interaction strategy.
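The core/noncore bookkeeping in the abstract can be illustrated with a small sketch. It is not the authors' pipeline. It assumes each strain is represented as a set of orthologous cluster IDs, and it assumes that a pairwise "genic difference" means the number of clusters present in exactly one of the two strains. The function name `summarize_clusters` and the toy strains are hypothetical. The reported counts are internally consistent: 1,454 + 1,716 = 3,170, and 1,454 / 3,170 rounds to 46%.

```python
from itertools import combinations


def summarize_clusters(presence):
    """Summarize core and noncore clusters.

    presence maps a strain name to the set of orthologous cluster IDs
    found in that strain (a hypothetical input format).
    """
    strains = list(presence)
    all_clusters = set().union(*presence.values())
    # Core clusters are those present in every strain.
    core = set.intersection(*presence.values())
    noncore = all_clusters - core
    # Assumed definition: a pair's genic difference is the number of
    # clusters present in exactly one of the two strains.
    pair_diffs = {
        (a, b): len(presence[a] ^ presence[b])
        for a, b in combinations(strains, 2)
    }
    # Share of each strain's own clusters that are noncore.
    noncore_fraction = {
        s: len(presence[s] - core) / len(presence[s]) for s in strains
    }
    return {
        "total": len(all_clusters),
        "core": len(core),
        "noncore": len(noncore),
        "pair_diffs": pair_diffs,
        "noncore_fraction": noncore_fraction,
    }


# Toy data, invented for illustration only (not from the study).
toy = {
    "strain_A": {"c1", "c2", "c3", "c4"},
    "strain_B": {"c1", "c2", "c3", "c5"},
    "strain_C": {"c1", "c2", "c6"},
}
summary = summarize_clusters(toy)
print(summary["total"], summary["core"], summary["noncore"])

# Consistency check on the counts reported in the abstract.
reported_total, reported_core = 3170, 1454
print(reported_total - reported_core, round(100 * reported_core / reported_total))
```

Applied to real data, the same summary would need the orthologous clustering itself as input, which this sketch does not attempt.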

Original language: English (US)
Pages (from-to): 8186-8195
Number of pages: 10
Journal: Journal of Bacteriology
Volume: 189
Issue number: 22
DOI: https://doi.org/10.1128/JB.00690-07
State: Published - Nov 2007

ASJC Scopus subject areas

  • Microbiology
  • Molecular Biology

  • Cite this

    Hiller, N. L., Janto, B., Hogg, J. S., Boissy, R., Yu, S., Powell, E., Keefe, R., Ehrlich, N. E., Shen, K., Hayes, J., Barbadora, K., Klimke, W., Dernovoy, D., Tatusova, T., Parkhill, J., Bentley, S. D., Post, J. C., Ehrlich, G. D., & Hu, F. Z. (2007). Comparative genomic analyses of seventeen Streptococcus pneumoniae strains: Insights into the pneumococcal supragenome. Journal of bacteriology, 189(22), 8186-8195. https://doi.org/10.1128/JB.00690-07