Extracting temporal equivalence relationships among keywords from time-stamped documents

Parvathi Chundi, Mahadevan Subramaniam, R. M. Aruna Weerakoon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Identifying keyword associations from text and search sources facilitates many tasks, such as understanding relationships among concepts, extracting relevant documents, matching advertisements to web pages, and expanding user queries. However, these keyword associations change as the underlying content changes with time. Two keywords that are associated with each other during one time period may not be associated in another time period, or the context in which they are associated may be different. In this paper, we define an equivalence relationship between a pair of keywords and develop methods to construct a temporal view of the equivalence relationship. Given a document set D, a keyword a is associated with a context consisting of the frequently occurring keyword sets (fs) of D in which a appears. Two keywords a and b are equivalent in D if their contexts are the same. We say that a and b are temporally equivalent in a time interval if a and b are equivalent in the documents published during that time interval. Given a time-stamped document set D published over a time period T, we define the temporal equivalence partitioning problem as constructing a partitioning of T into a sequence of maximal-length time intervals such that, in each time interval, keywords a and b are either temporally equivalent or the equivalence relationship does not hold. A temporal equivalence partitioning of a document set for a given pair of keywords highlights all of the different contexts in which the given keywords are associated, which can be used to generate time-varying keyword suggestions for users. We show the effectiveness of the approach by constructing the temporal equivalence partitionings of several pairs of keywords from the Multi-Domain Sentiment data set and the ICWSM 2009 Spinn3r data set.
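
The definitions in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm, and it rests on several assumptions that the abstract does not state: frequent keyword sets are found by brute-force counting with a hypothetical `min_support` threshold and `max_size` cap; two contexts are compared after removing both keywords of the pair from every set; an empty context never counts as equivalent; and the partitioning simply merges consecutive base time units (months here) that share the same equivalence status, rather than re-testing equivalence over each merged interval. All function names and the sample data are illustrative.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(docs, min_support, max_size=3):
    """Keyword sets of size <= max_size that occur in at least min_support documents."""
    counts = Counter()
    for doc in docs:
        words = sorted(set(doc))
        for k in range(1, max_size + 1):
            counts.update(frozenset(c) for c in combinations(words, k))
    return {s for s, n in counts.items() if n >= min_support}


def context(keyword, other, fsets):
    """Frequent sets containing `keyword`, with both compared keywords removed.

    Removing the pair itself is an assumption made so that two contexts can be
    compared for equality; sets that become empty are dropped.
    """
    stripped = {s - {keyword, other} for s in fsets if keyword in s}
    return {s for s in stripped if s}


def equivalent(a, b, docs, min_support):
    """True if a and b have the same non-empty context in docs (assumed reading)."""
    fsets = frequent_sets(docs, min_support)
    ctx_a = context(a, b, fsets)
    ctx_b = context(b, a, fsets)
    return bool(ctx_a) and ctx_a == ctx_b


def partition(units, a, b, min_support):
    """Merge consecutive time units with the same equivalence status.

    `units` is a time-ordered list of (label, docs) pairs. Returns a list of
    (first_label, last_label, is_equivalent) tuples. This run-length merge is
    a simplification chosen for illustration.
    """
    intervals = []
    for label, docs in units:
        status = equivalent(a, b, docs, min_support)
        if intervals and intervals[-1][2] == status:
            start = intervals[-1][0]
            intervals[-1] = (start, label, status)
        else:
            intervals.append((label, label, status))
    return intervals


# Made-up sample data: each document is a list of keywords.
units = [
    ("2009-01", [["camera", "lens", "zoom"], ["camcorder", "lens", "zoom"],
                 ["camera", "lens"], ["camcorder", "lens"]]),
    ("2009-02", [["camera", "lens"], ["camera", "lens"],
                 ["camcorder", "lens"], ["camcorder", "lens"]]),
    ("2009-03", [["camera", "lens"], ["camera", "lens"],
                 ["camcorder", "tape"], ["camcorder", "tape"]]),
]

for interval in partition(units, "camera", "camcorder", min_support=2):
    print(interval)
```

In this made-up data, "camera" and "camcorder" share the context {lens} in the first two months and diverge in the third, so the sketch reports one equivalent interval followed by one non-equivalent interval. Brute-force counting is only workable for tiny vocabularies; a frequent itemset miner would be the more realistic choice on real collections.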

Original language: English (US)
Title of host publication: Database and Expert Systems Applications - 22nd International Conference, DEXA 2011, Proceedings
Pages: 110-124
Number of pages: 15
Edition: PART 1
State: Published - 2011
Event: 22nd International Conference on Database and Expert Systems Applications, DEXA 2011 - Toulouse, France
Duration: Aug 29 2011 to Sep 2 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6860 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd International Conference on Database and Expert Systems Applications, DEXA 2011
Country/Territory: France
City: Toulouse
Period: 8/29/11 to 9/2/11

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
