On identifying and analyzing significant nodes in protein-protein interaction networks

Rohan Khazanchi, Kathryn Dempsey, Ishwor Thapa, Hesham Ali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Network theory has been used for modeling biological data as well as social networks, transportation logistics, business transcripts, and many other types of data sets. Identifying important features or parts of these networks for a multitude of applications is becoming increasingly significant as the need for big data analysis techniques grows. When analyzing a network of protein-protein interactions (PPIs), identifying nodes of significant importance can direct the user toward biologically relevant network features. In this work, we propose that a node of structural importance in a network model can correspond to a biologically vital or significant property. This relationship between topological and biological importance can be seen in and between structurally defined nodes, such as hub nodes and driver nodes, within a network and within clusters. This work proposes data mining approaches for identification and examination of relationships between hub and driver nodes within human, yeast, rat, and mouse PPI networks. Relationships with other types of significant nodes, with direct neighbors, and with the rest of the network were analyzed to determine if the model can be characterized biologically by its structural makeup. We performed numerous tests on structure with a data-driven mentality, looking for properties that were potentially significant on a network level and then comparing those properties to biological significance. Our results showed that identifying and cross-referencing different types of topologically significant nodes can exemplify properties such as transcription factor enrichment, lethality, clustering, and Gene Ontology (GO) enrichment. Mining the biological networks, we discovered a key relationship between network properties and how sparse or dense a network is, a property we described as 'sparseness'. Overall, structurally important nodes were found to have significant biological relevance.
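
As a rough illustration of two of the structural notions named in the abstract, the sketch below ranks nodes by degree to pick out hubs and computes edge density as one possible proxy for 'sparseness'. The abstract does not state how the authors define hubs or sparseness, so the top-fraction cutoff, the density formula, the function names, and the toy protein labels are all assumptions made for this example. Driver nodes are not covered here.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions in this sketch
        adj[a].add(b)
        adj[b].add(a)
    return adj


def hub_nodes(adj, top_fraction=0.1):
    """Return the highest-degree nodes.

    The top-fraction cutoff is an assumed convention, not the paper's definition.
    Ties are broken by node name so the result is deterministic.
    """
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), n))
    k = max(1, int(len(ranked) * top_fraction))
    return ranked[:k]


def density(adj):
    """Fraction of possible undirected edges that are present: 2m / (n(n-1))."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1))


# Toy edge list with made-up protein labels, not data from the paper.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
adj = build_adjacency(edges)
hubs = hub_nodes(adj, top_fraction=0.2)
net_density = density(adj)
```

On this toy graph the single hub is the node with three neighbors, and half of the ten possible edges are present. A real analysis would load an actual PPI edge list and choose the hub cutoff to match the study's definition.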

Original language: English (US)
Title of host publication: Proceedings - IEEE 13th International Conference on Data Mining Workshops, ICDMW 2013
Publisher: IEEE Computer Society
Pages: 343-348
Number of pages: 6
State: Published - 2013
Event: 2013 13th IEEE International Conference on Data Mining Workshops, ICDMW 2013 - Dallas, TX
Duration: Dec 7 2013 to Dec 10 2013

Keywords

  • Clustering
  • Driver nodes
  • Graph theory
  • Hub nodes
  • Network enrichment
  • Protein-protein interaction networks

ASJC Scopus subject areas

  • Software
