Using term extraction patterns to discover coherent relationships from open source intelligence

William L. Sousan, Qiuming Zhu, Robin Gandhi, William Mahoney, Anup Sharma

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Unstructured open source information, especially the social, political, economic and cultural events described within web-based text/news articles, often contains possible motives for cyber security and trust issues. Automated processing of numerous open source intelligence sources requires the discovery of key domain terms, their conceptual hierarchies and the coherent relationships among them. A syntactic analysis of the word sequences in unstructured text documents allows for the extraction of subject-predicate-object triples, which form the basis for Term Extraction Patterns (TEPs). In our research, we use TEPs to discover domain-specific multi-word entities, which in turn can be arranged in a taxonomy based on their semiotic inter-relationships. We explore the use of this method within the cyber security domain and analyze a collection of related news articles gathered from various public web sources. In this paper, we describe our initial results of term extraction and the semantic coherence derived from the TEP analyses. Our work extends beyond current methods, and our contribution is a novel methodology to extract semantics from unstructured text in domain-specific open source information, together with its application to predicting cyber attack outbreaks.

Original language: English (US)
Title of host publication: Proceedings - SocialCom 2010
Subtitle of host publication: 2nd IEEE International Conference on Social Computing, PASSAT 2010: 2nd IEEE International Conference on Privacy, Security, Risk and Trust
Pages: 967-972
Number of pages: 6
State: Published - 2010
Event: 2nd IEEE International Conference on Social Computing, SocialCom 2010, 2nd IEEE International Conference on Privacy, Security, Risk and Trust, PASSAT 2010 - Minneapolis, MN, United States
Duration: Aug 20 2010 to Aug 22 2010

Publication series

Name: Proceedings - SocialCom 2010: 2nd IEEE International Conference on Social Computing, PASSAT 2010: 2nd IEEE International Conference on Privacy, Security, Risk and Trust

Conference

Conference: 2nd IEEE International Conference on Social Computing, SocialCom 2010, 2nd IEEE International Conference on Privacy, Security, Risk and Trust, PASSAT 2010
Country/Territory: United States
City: Minneapolis, MN
Period: 8/20/10 to 8/22/10

Keywords

  • Conceptualization
  • Open source intelligence
  • Semantic relevance
  • Term extraction
  • Term extraction patterns

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
