Primary structure similarity analysis of proteins sequences by a new graphical representation

S. C. Xu, Z. Li, S. P. Zhang, J. L. Hu

A new graphical description of the primary structure of protein sequences is introduced. First, a discrete point set in three-dimensional space is created for a protein sequence from the three main physicochemical properties of the amino acids. Second, a continuous cubic B-spline curve interpolating the amino acid points is constructed to represent the shape of the protein sequence. Third, the geometric properties of this curve (curvature and torsion) are extracted for analyzing the similarity between protein sequences. Finally, an improved Canberra distance is introduced to compare protein sequences of different lengths. Experimental results show that the method is effective for similarity comparison of protein sequences.
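The geometric step of the abstract can be illustrated with a minimal sketch. It assumes the first three derivatives of the curve are already available at a sample point (for example, from whatever spline evaluation is used), and it applies the textbook formulas for curvature and torsion of a space curve. The distance shown is the standard Canberra distance on equal-length descriptor vectors; the improved variant for sequences of different lengths is not specified in the abstract and is not reproduced here. The function names and the helix test curve are hypothetical, chosen only for illustration.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def curvature_torsion(d1, d2, d3):
    """Curvature and torsion from the first three derivatives r', r'', r'''.

    kappa = |r' x r''| / |r'|^3
    tau   = (r' x r'') . r''' / |r' x r''|^2
    """
    speed = math.sqrt(dot(d1, d1))
    if speed == 0.0:
        raise ValueError("curve is not regular at this point (r' = 0)")
    c = cross(d1, d2)
    c_norm2 = dot(c, c)
    kappa = math.sqrt(c_norm2) / speed ** 3
    # Torsion is undefined where curvature vanishes; report 0.0 there (assumption).
    tau = dot(c, d3) / c_norm2 if c_norm2 > 0.0 else 0.0
    return kappa, tau


def canberra(u, v):
    """Standard Canberra distance; terms with a zero denominator contribute 0."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom > 0.0:
            total += abs(a - b) / denom
    return total


def helix_derivatives(t, a=2.0, b=1.0):
    """Derivatives of the test curve r(t) = (a cos t, a sin t, b t)."""
    s, c = math.sin(t), math.cos(t)
    return (-a * s, a * c, b), (-a * c, -a * s, 0.0), (a * s, -a * c, 0.0)


if __name__ == "__main__":
    kappa, tau = curvature_torsion(*helix_derivatives(0.7))
    print(kappa, tau)
    print(canberra([1.0, 0.0], [3.0, 0.0]))
```

A circular helix is a convenient check because its curvature a/(a^2 + b^2) and torsion b/(a^2 + b^2) are constant along the curve. For protein curves, the same function would be evaluated at many parameter values to build the curvature and torsion profiles that the distance then compares.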

Original language: English (US)
Pages (from-to): 791-803
Number of pages: 13
Journal: SAR and QSAR in Environmental Research
Issue number: 10
State: Published - Oct 2014

Keywords


  • cubic B-spline curve
  • curvature
  • protein sequence
  • shape analysis
  • torsion

ASJC Scopus subject areas

  • Bioengineering
  • Molecular Medicine
  • Drug Discovery

