Gesellschaft für Informatik e.V.

Lecture Notes in Informatics


Datenbanksysteme für Business, Technologie und Web (BTW 2015) - Workshopband P-242, 223-232 (2015).

Gesellschaft für Informatik, Bonn
2015


Copyright © Gesellschaft für Informatik, Bonn


Optimizing Sequential Pattern Mining Within Multiple Streams

Daniel Töws, Marwan Hassani, Christian Beecks and Thomas Seidl

Abstract


Analyzing information has become more important than ever, as data is produced massively in every area. In recent years, data streams have gained importance, and so have algorithms that can mine hidden patterns from these non-static databases. Such algorithms can also be used to simulate processes and to find important information step by step. The translation of an English text into German is one such process. Linguists try to find characteristic patterns in this process to understand it better. For this purpose, keystrokes and eye movements are tracked during translation. The StrPMiner was designed to mine sequential patterns from this translation data. One dominant algorithm for finding sequential patterns is PrefixSpan. Although it was created for static databases, many data stream algorithms collect batches and apply it to each batch to find sequential patterns. This batch approach is simple, but it makes it impossible to find patterns that span two consecutive batches. The PBuilder is introduced to find sequential patterns with higher accuracy and is used by the StrPMiner to find patterns.
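The limitation of the batch approach can be illustrated with a toy sketch. It is neither PrefixSpan nor the PBuilder or StrPMiner: as a simplifying assumption, it only counts ordered pairs of directly adjacent events, and the names `count_pairs`, `count_pairs_batched`, and the example stream are hypothetical. It shows how an occurrence that straddles a batch boundary is lost when each batch is mined in isolation.

```python
from collections import Counter


def count_pairs(events):
    """Count each ordered pair of directly adjacent events."""
    return Counter(zip(events, events[1:]))


def count_pairs_batched(events, batch_size):
    """Count adjacent pairs inside fixed-size batches and sum the counts.

    Pairs whose two events fall into different batches are never seen.
    """
    total = Counter()
    for start in range(0, len(events), batch_size):
        total += count_pairs(events[start:start + batch_size])
    return total


# Hypothetical event stream; with batch_size=4 the second "a", "b"
# occurrence is split across the batches [a, b, c, a] and [b, c].
stream = ["a", "b", "c", "a", "b", "c"]

whole = count_pairs(stream)
batched = count_pairs_batched(stream, batch_size=4)
missed = whole - batched  # occurrences lost at batch boundaries

print("whole stream:", dict(whole))
print("batched:     ", dict(batched))
print("missed:      ", dict(missed))
```

In this example the pair ("a", "b") occurs twice in the stream, but the batched count finds it only once, because the second occurrence crosses the boundary between the two batches.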



ISBN 978-3-88579-636-7


Last changed 30.04.2015 15:50:16