Analysis of metabolomic PCA data using tree diagrams

Mark T. Werth, Steven Halouska, Matthew D. Shortridge, Bo Zhang, Robert Powers

Research output: Contribution to journal, Article, peer-reviewed

51 Scopus citations


Large amounts of data from high-throughput metabolomic experiments are commonly visualized using a principal component analysis (PCA) two-dimensional scores plot. The question of the similarity or difference between multiple metabolic states then becomes a question of the degree of overlap between their respective data point clusters in principal component (PC) scores space. A qualitative visual inspection of the clustering pattern in PCA scores plots is a common protocol. This article describes the application of tree diagrams and bootstrapping techniques for an improved quantitative analysis of metabolic PCA data clustering. Our PCAtoTree program creates a distance matrix with 100 bootstrap steps that describes the separation of all clusters in a metabolic data set. Using accepted phylogenetic software, the distance matrix resulting from the various metabolic states is organized into a phylogenetic-like tree format, where bootstrap values ≥50 indicate a statistically relevant branch separation. PCAtoTree analysis of two previously published data sets demonstrates the improved resolution of metabolic state differences using tree diagrams. In addition, for metabolomic studies of large numbers of different metabolic states, the tree format provides a better description of similarities and differences between each metabolic state. The approach is also tolerant of sample size variations between different metabolic states.
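The distance-matrix and bootstrap step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how PCAtoTree measures cluster separation or how it resamples, so the choices here are assumptions made only for illustration: separation is taken as the Euclidean distance between cluster centroids in PC scores space, and each bootstrap step resamples the points of every cluster with replacement. The function names and the example scores are hypothetical, not part of the PCAtoTree program. Building the tree and computing branch support from the resulting matrices would be left to phylogenetic software and is not shown.

```python
import math
import random


def centroid(points):
    """Mean position of a list of PC score tuples, e.g. (PC1, PC2)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def distance_matrix(clusters):
    """Euclidean distances between cluster centroids (assumed metric)."""
    names = sorted(clusters)
    centers = {name: centroid(clusters[name]) for name in names}
    matrix = [[math.dist(centers[a], centers[b]) for b in names] for a in names]
    return names, matrix


def bootstrap_matrices(clusters, n_boot=100, seed=0):
    """Resample each cluster with replacement and recompute the matrix.

    Returns one distance matrix per bootstrap step (assumed scheme).
    """
    rng = random.Random(seed)
    matrices = []
    for _ in range(n_boot):
        resampled = {
            name: rng.choices(points, k=len(points))
            for name, points in clusters.items()
        }
        matrices.append(distance_matrix(resampled)[1])
    return matrices


# Hypothetical PC1/PC2 scores for two metabolic states.
scores = {
    "control": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "treated": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}

names, observed = distance_matrix(scores)
replicates = bootstrap_matrices(scores, n_boot=100)
```

Because each cluster is resampled at its own size, clusters with different numbers of samples can be handled in the same way, which is in the spirit of the tolerance to sample size variation mentioned in the abstract.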

Original language: English (US)
Pages (from-to): 58-63
Number of pages: 6
Journal: Analytical Biochemistry
Issue number: 1
State: Published - Apr 2010


Keywords

  • Bootstrap analysis
  • Metabolomics
  • NMR
  • Principal component analysis
  • Tree diagrams

ASJC Scopus subject areas

  • Biophysics
  • Biochemistry
  • Molecular Biology
  • Cell Biology

