Optimizing Sequential Pattern Mining Within Multiple Streams
Analyzing information has become more important than ever, as data is produced at massive scale in every field. In recent years, data streams have grown in importance, and with them algorithms that can mine hidden patterns from these non-static databases. Such algorithms can also be used to simulate processes and to extract important information step by step. Translating an English text into German is one such process. Linguists look for characteristic patterns in this process to understand it better; for this purpose, keystrokes and eye movements are tracked during translation. The StrPMiner was designed to mine sequential patterns from this translation data. One dominant algorithm for finding sequential patterns is PrefixSpan. Although it was created for static databases, many data stream algorithms collect batches and apply PrefixSpan to each batch to find sequential patterns. This batch approach is simple, but it cannot find patterns that span two consecutive batches. The PBuilder is introduced to find sequential patterns with higher accuracy, and the StrPMiner uses it to find patterns.
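The boundary problem of the batch approach can be illustrated with a minimal sketch. It is not the PBuilder or PrefixSpan: for simplicity it counts only contiguous occurrences of a single pattern, and the names `count_occurrences`, `count_in_batches`, the toy stream, and the batch size are all hypothetical choices made for illustration.

```python
from typing import Sequence


def count_occurrences(events: Sequence[str], pattern: Sequence[str]) -> int:
    """Count contiguous occurrences of `pattern` in `events` (simplifying assumption)."""
    n, m = len(events), len(pattern)
    return sum(
        1 for i in range(n - m + 1) if list(events[i:i + m]) == list(pattern)
    )


def count_in_batches(
    events: Sequence[str], pattern: Sequence[str], batch_size: int
) -> int:
    """Split the stream into fixed-size batches and count the pattern in each one.

    Occurrences that start in one batch and end in the next are never seen.
    """
    total = 0
    for start in range(0, len(events), batch_size):
        batch = events[start:start + batch_size]
        total += count_occurrences(batch, pattern)
    return total


# Toy stream; with batch_size=3 it is cut into [a, b, c] and [a, b, c].
stream = ["a", "b", "c", "a", "b", "c"]
pattern = ["c", "a"]  # occurs once, exactly across the batch boundary

whole_stream_count = count_occurrences(stream, pattern)
batched_count = count_in_batches(stream, pattern, batch_size=3)
print(whole_stream_count, batched_count)
```

In this toy case the pattern is present in the stream but lies across the cut between the two batches, so the per-batch count misses it.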