Gesellschaft für Informatik e.V.

Lecture Notes in Informatics

12th International Conference on Innovative Internet Community Services (I2CS 2012), P-204, 202-211 (2012).

Gesellschaft für Informatik, Bonn

Copyright © Gesellschaft für Informatik, Bonn


Detecting source topics by analysing directed co-occurrence graphs

Mario Kubek and Herwig Unger


This paper describes a new method for determining the sources of topics that influence the main topics of a text by analysing directed co-occurrence graphs with an extended version of the HITS algorithm. The method can also be used to identify characteristic terms in texts. To obtain the required directed term relations, the notion of term association is introduced to cover asymmetric real-life relationships between concepts, and it is described how these associations can be calculated by statistical means. The experiments show that the detected source topics and characteristic terms can be used to find similar documents, as well as documents that mainly deal with these topics and terms, in large corpora such as the World Wide Web. Applied iteratively, the method makes it easy to follow topics by analysing documents from these corpora. Interactive search systems can thus offer users a new search function that goes beyond a simple presentation of similar documents. This application is elaborated on as well.
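As a rough illustration of the kind of computation the abstract refers to, the following minimal sketch runs a weighted variant of the standard HITS iteration on a small directed term graph. It is not the authors' extended algorithm, which the abstract does not specify. The function names `weighted_hits` and `normalise`, the example terms, and the edge weights (standing in for term association strengths) are all hypothetical. Reading high hub scores as candidate source topics and high authority scores as candidate characteristic terms is likewise an assumption made here for illustration.

```python
import math


def normalise(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in scores.values())) or 1.0
    return {node: x / norm for node, x in scores.items()}


def weighted_hits(edges, iterations=50):
    """Weighted HITS on a directed graph.

    edges maps (source, target) pairs to a non-negative weight.
    The weights are an assumption: they stand in for some measure
    of directed term association, which is not defined here.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores over incoming edges.
        new_auth = {node: 0.0 for node in nodes}
        for (u, v), w in edges.items():
            new_auth[v] += w * hub[u]
        # Hub update: weighted sum of the new authority scores over outgoing edges.
        new_hub = {node: 0.0 for node in nodes}
        for (u, v), w in edges.items():
            new_hub[u] += w * new_auth[v]
        auth = normalise(new_auth)
        hub = normalise(new_hub)
    return hub, auth


# Hypothetical toy graph: an edge u -> v means "u is associated with v".
edges = {
    ("rain", "flood"): 0.8,
    ("storm", "flood"): 0.6,
    ("rain", "storm"): 0.3,
    ("flood", "damage"): 0.5,
}
hub, auth = weighted_hits(edges)
top_hub = max(hub, key=hub.get)     # strongest "pointing" term
top_auth = max(auth, key=auth.get)  # most "pointed-to" term
```

In this toy graph the term with the largest hub score is the one whose outgoing edges lead most strongly to the highest-scoring authority. In practice the edge weights would be estimated from co-occurrence statistics of a real text rather than set by hand.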


ISBN 978-3-88579-298-7

Last changed 04.10.2013 18:38:20