Phylogenetic continuum indicates “Galaxies” in the protein universe: Preliminary results on the natural group structures of proteins

István Ladunga

Research output: Contribution to journal (Article)

11 Scopus citations

Abstract

The markedly nonuniform, even systematic distribution of sequences in the protein “universe” has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two χ²-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.
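The pipeline outlined in the abstract (tripeptide frequencies, a χ²-like distance, then cluster analysis on the distance matrix) can be illustrated with a minimal Python sketch. The abstract does not give the exact distance formulas or name the clustering methods, so everything here is an assumption for illustration: the symmetric form sum((p - q)² / (p + q)), the single-linkage merge rule, the threshold value, and the toy sequences are all hypothetical stand-ins, not the author's method or data.

```python
from collections import Counter
from itertools import combinations


def tripeptide_freqs(seq):
    """Relative frequencies of overlapping tripeptides in one sequence."""
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()}


def chi2_like_distance(p, q):
    """Assumed chi-square-style distance: sum of (p - q)^2 / (p + q).

    0 for identical profiles, 2 when no tripeptide is shared.
    The paper's two actual measures may differ from this form.
    """
    total = 0.0
    for tri in set(p) | set(q):
        a, b = p.get(tri, 0.0), q.get(tri, 0.0)
        total += (a - b) ** 2 / (a + b)  # a + b > 0 for every key in the union
    return total


def distance_matrix(profiles):
    """Pairwise distances keyed by alphabetically ordered name pairs."""
    names = sorted(profiles)
    return {(a, b): chi2_like_distance(profiles[a], profiles[b])
            for a, b in combinations(names, 2)}


def single_linkage(names, dist, threshold):
    """Merge the two closest clusters until the gap exceeds threshold.

    Single linkage is one illustrative choice among many clustering methods.
    """
    clusters = [{n} for n in names]

    def gap(c1, c2):
        return min(dist[tuple(sorted((a, b)))] for a in c1 for b in c2)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: gap(clusters[ij[0]], clusters[ij[1]]))
        if gap(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]  # i < j, so index i is unaffected
    return clusters


# Hypothetical toy sequences standing in for superfamily representatives.
TOY_SEQS = {
    "famA": "ACDEFGHIK",
    "famB": "ACDEFGHIL",
    "famC": "WWYYWWYYW",
}

if __name__ == "__main__":
    profiles = {name: tripeptide_freqs(s) for name, s in TOY_SEQS.items()}
    dist = distance_matrix(profiles)
    print(single_linkage(sorted(profiles), dist, threshold=1.0))
```

With these toy inputs the two sequences that share most tripeptides merge first, while the unrelated one stays apart. A real analysis would pool frequencies over all members of a superfamily and compare several clustering methods, as the abstract indicates.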

Original language: English (US)
Pages (from-to): 358-375
Number of pages: 18
Journal: Journal of Molecular Evolution
Volume: 34
Issue number: 4
Publication status: Published - Apr 1992

Keywords

  • Cluster analysis
  • Protein evolution
  • Protein system
  • Protein taxonomy
  • Tripeptide method

ASJC Scopus subject areas

  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Genetics