Gesellschaft für Informatik e.V.

Lecture Notes in Informatics

Datenbanksysteme für Business, Technologie und Web (BTW 2015) - Workshopband P-242, 21-36 (2015).

Gesellschaft für Informatik, Bonn

Copyright © Gesellschaft für Informatik, Bonn


On performance optimization potentials regarding data classification in forensics

Veit Köppen, Mario Hildebrandt and Martin Schäler


Classification of given data sets according to a training set is one of the essential bread-and-butter tools in machine learning. Application scenarios range from distinguishing spam from non-spam mails to recognizing malicious behavior and other forensic use cases. Several approaches can be used to train such classifiers. Scientists often use machine learning suites such as WEKA, ELKI, or RapidMiner to try out different classifiers and find those that deliver the best results. These suites are designed primarily for ease of use and easy extension with new approaches. As a consequence of this focus, however, the classifier implementations are not, and cannot be, optimized with respect to response time. We argue that, especially in basic research, systematically testing different promising approaches is the default procedure. Thus, optimization for response time should be taken into consideration as well, especially for the large-scale data sets that are common in forensic use cases. To this end, we discuss in this paper to what extent well-known approaches from databases can be applied


ISBN 978-3-88579-636-7
