Gesellschaft für Informatik e.V.

Lecture Notes in Informatics


Datenbanksysteme für Business, Technologie und Web (BTW 2015) - Workshopband P-242, 21-36 (2015).

Gesellschaft für Informatik, Bonn
2015


Copyright © Gesellschaft für Informatik, Bonn


On performance optimization potentials regarding data classification in forensics

Veit Köppen, Mario Hildebrandt and Martin Schäler

Abstract


Classifying data sets according to a training set is one of the essential bread-and-butter tools in machine learning. Application scenarios range from the detection of spam and non-spam mails to the recognition of malicious behavior and other forensic use cases. Several approaches can be used to train such classifiers. Scientists often use machine learning suites, such as WEKA, ELKI, or RapidMiner, to try out different classifiers and find those that deliver the best results. These suites are primarily designed for easy application and easy extension with new approaches. As a consequence, the implementation of a classifier in these suites is not, and cannot be, optimized with respect to response time, because the suites have a different focus. However, we argue that, especially in basic research, systematic testing of different promising approaches is the default approach. Thus, optimization for response time should be taken into consideration as well, especially for large-scale data sets, which are common in forensic use cases. To this end, we discuss in this paper to what extent well-known approaches from databases can be applied



ISBN 978-3-88579-636-7


Last changed 30.04.2015 15:50:09