A Bayes testing approach to metagenomic profiling in bacteria

Bertrand Clarke, Camilo Valdes, Adrian Dobra, Jennifer Clarke

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


We use a multinomial model with a Dirichlet prior on next-generation sequencing (NGS) data to detect the presence of bacteria in a metagenomic sample via marginal Bayes testing for each bacterial strain. The NGS reads per strain are counted fractionally, with each read contributing an equal amount to each strain it might represent. The threshold for detection is strain-dependent, and we apply a correction for the dependence amongst the NGS reads by finding the knee in a curve representing a tradeoff between detecting too many strains and too few. As a check, we evaluate the joint posterior probabilities for the presence of two strains of bacteria and find relatively little dependence. We apply our techniques to two data sets and compare our results with those found by the Human Microbiome Project. We conclude with a discussion of the issues surrounding multiple corrections in a Bayes context.
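The fractional counting and per-strain marginal testing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the symmetric Dirichlet prior weight `prior`, the presence hypothesis "strain proportion exceeds `eps`", and the toy reads are all assumptions made for illustration. The sketch relies only on the standard fact that, under a Dirichlet prior and multinomial counts, the marginal posterior of a single proportion is a Beta distribution, and it does not attempt the knee-based correction for dependence among reads.

```python
import random


def fractional_counts(read_hits, strains):
    """Split each read equally among the strains it might represent."""
    counts = {s: 0.0 for s in strains}
    for hits in read_hits:
        if not hits:
            continue  # a read matching no strain contributes nothing
        share = 1.0 / len(hits)
        for s in hits:
            counts[s] += share
    return counts


def presence_probability(counts, strain, eps, prior=1.0, draws=20000, seed=0):
    """Monte Carlo estimate of P(theta_strain > eps | counts).

    Assumes a symmetric Dirichlet(prior) prior, so the marginal posterior
    of theta_strain is Beta(a, b) with the parameters computed below.
    Both eps and prior are hypothetical tuning values, not the paper's.
    """
    rng = random.Random(seed)
    a = prior + counts[strain]
    b = sum(prior + c for s, c in counts.items() if s != strain)
    above = sum(rng.betavariate(a, b) > eps for _ in range(draws))
    return above / draws


# Toy data (hypothetical): five reads match strain A only,
# three reads are ambiguous between A and B, none match C.
strains = ["A", "B", "C"]
read_hits = [["A"]] * 5 + [["A", "B"]] * 3
counts = fractional_counts(read_hits, strains)

# One presence probability per strain; a strain-dependent threshold
# would replace the single eps used here.
probabilities = {s: presence_probability(counts, s, eps=0.1) for s in strains}
```

In this toy setting the ambiguous reads give strain B a fractional count even though no read maps to it uniquely, while strain C keeps only its prior mass, so its estimated presence probability is the lowest of the three.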

Original language: English (US)
Pages (from-to): 173-185
Number of pages: 13
Journal: Statistics and its Interface
Issue number: 2
State: Published - 2015


Keywords

  • Bacteria
  • Bayes testing
  • Dependence
  • Metagenomics

ASJC Scopus subject areas

  • Statistics and Probability
  • Applied Mathematics

