Abstract
Network theory has been used to model biological data as well as social networks, transportation logistics, business transcripts, and many other types of data sets. Identifying the important features of these networks is becoming increasingly significant across many applications as the need for big data analysis techniques grows. When analyzing a network of protein-protein interactions (PPIs), identifying nodes of significant importance can direct the user toward biologically relevant network features. In this work, we propose that a node of structural importance in a network model can correspond to a biologically vital or significant property. This relationship between topological and biological importance can be seen in and among structurally defined nodes, such as hub nodes and driver nodes, both across a network and within clusters. This work proposes data mining approaches for identifying and examining relationships between hub and driver nodes within human, yeast, rat, and mouse PPI networks. Relationships with other types of significant nodes, with direct neighbors, and with the rest of the network were analyzed to determine whether the model can be characterized biologically by its structural makeup. We performed numerous structural tests with a data-driven approach, looking for properties that were potentially significant at the network level and then comparing those properties to biological significance. Our results showed that identifying and cross-referencing different types of topologically significant nodes can reveal properties such as transcription factor enrichment, lethality, clustering, and Gene Ontology (GO) enrichment. In mining the biological networks, we discovered a key relationship between network properties and how sparse or dense a network is, a property we described as 'sparseness'. Overall, structurally important nodes were found to have significant biological relevance.
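The abstract does not say how hub nodes, driver nodes, or 'sparseness' were computed, so the following is only a minimal sketch under stated assumptions: hubs are taken to be the top fraction of nodes by degree, driver nodes are taken to be the nodes left unmatched by a maximum matching (the usual structural-controllability construction, with each undirected PPI edge treated as bidirectional), and edge density stands in for sparseness. The function names, the `top_fraction` parameter, and the protein labels are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build undirected adjacency sets from (u, v) pairs; self-loops are skipped."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def hub_nodes(adj, top_fraction=0.2):
    """Assumed hub definition: the top `top_fraction` of nodes ranked by degree."""
    ranked = sorted(adj, key=lambda n: (-len(adj[n]), str(n)))
    k = max(1, int(len(ranked) * top_fraction))
    return set(ranked[:k])


def driver_nodes(adj):
    """Return the nodes left unmatched by one maximum matching.

    Each node has an "out" copy and an "in" copy; every undirected edge is
    treated as two directed links. The number of unmatched nodes is the same
    for every maximum matching, but which nodes are unmatched can differ.
    An empty result means the matching is perfect.
    """
    matched_to = {}  # in-copy of a node -> out-copy currently matched to it

    def try_augment(u, seen):
        # Kuhn's augmenting-path search (recursive, so best for small graphs).
        for v in sorted(adj[u], key=str):
            if v in seen:
                continue
            seen.add(v)
            if v not in matched_to or try_augment(matched_to[v], seen):
                matched_to[v] = u
                return True
        return False

    for u in sorted(adj, key=str):
        try_augment(u, set())
    return set(adj) - set(matched_to)


def density(adj):
    """Edge density 2E / (N (N - 1)); used here as an assumed proxy for sparseness."""
    n = len(adj)
    if n < 2:
        return 0.0
    e = sum(len(neighbors) for neighbors in adj.values()) // 2
    return 2 * e / (n * (n - 1))


# Toy interaction list with made-up protein labels.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P4", "P5")]
adj = build_adjacency(edges)
hubs = hub_nodes(adj)
drivers = driver_nodes(adj)
print("hubs:", sorted(hubs))
print("drivers:", sorted(drivers))
print("hub/driver overlap:", sorted(hubs & drivers))
print("density:", density(adj))
```

Cross-referencing the two sets, as in the last lines, is the kind of comparison the abstract describes. On real PPI data the hub cutoff and the handling of edge direction would need to follow the paper's own definitions.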
Original language | English (US) |
---|---|
Title of host publication | Proceedings - IEEE 13th International Conference on Data Mining Workshops, ICDMW 2013 |
Publisher | IEEE Computer Society |
Pages | 343-348 |
Number of pages | 6 |
State | Published - 2013 |
Event | 2013 13th IEEE International Conference on Data Mining Workshops, ICDMW 2013, Dallas, TX. Duration: Dec 7 2013 → Dec 10 2013 |
Keywords
- Clustering
- Driver nodes
- Graph theory
- Hub nodes
- Network enrichment
- Protein-protein interaction networks
ASJC Scopus subject areas
- Software