Gesellschaft für Informatik e.V.

Lecture Notes in Informatics

Datenbanksysteme in Business, Technologie und Web, 11. Fachtagung des GI-Fachbereichs “Datenbanken und Informationssysteme” (DBIS), March 2-4, 2005, Karlsruhe. GI 2005 P-65, 385-404 (2005).

GI, Gesellschaft für Informatik, Bonn


Gottfried Vossen, Frank Leymann, Peter Lockemann, Wolffried Stucky (eds.)

Copyright © GI, Gesellschaft für Informatik, Bonn


Maintaining nonparametric estimators over data streams

Björn Blohsfeld, Christoph Heinz and Bernhard Seeger


Effective processing and analysis of data streams is of utmost importance for many emerging applications such as network monitoring, traffic management, and financial tickers. Beyond the management of transient and potentially unbounded streams, their analysis with advanced data mining techniques has been identified as a research challenge. A well-established class of mining techniques is based on nonparametric statistics, in which nonparametric density estimation is one of the essential building blocks. In this paper, we examine the maintenance of nonparametric estimators over data streams. We present a tailored framework that incrementally maintains a nonparametric estimator over a data stream while consuming only a fixed amount of memory. Our framework is memory-adaptive and therefore supports a fundamental requirement for an operator within a data stream management system. As an example, we apply our framework to selectivity estimation of range queries, a popular use case for statistical estimators. After analyzing the processing cost, we report experimental comparisons on both synthetic and real-world data streams. The results demonstrate the accuracy of the estimators derived from our framework.
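
To make the setting concrete, the following is a minimal sketch of one way to keep a kernel density estimator over a stream within a fixed memory budget and to use it for range-query selectivity. It is not the framework described in the paper: it substitutes plain reservoir sampling with a Gaussian kernel and a normal-scale bandwidth rule, and the class name `ReservoirKDE` and its methods `update`, `shrink` and `selectivity` are hypothetical names chosen for illustration.

```python
import math
import random


class ReservoirKDE:
    """Illustrative fixed-memory kernel density estimator over a stream.

    Assumption: a uniform reservoir sample stands in for the stream;
    this is NOT the maintenance scheme of the paper.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity        # memory budget: number of retained values
        self.sample = []                # current reservoir
        self.seen = 0                   # number of stream elements consumed
        self.rng = random.Random(seed)

    def update(self, x):
        """Consume one stream element (reservoir sampling, Algorithm R)."""
        self.seen += 1
        if len(self.sample) < self.capacity:
            self.sample.append(x)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.sample[j] = x

    def shrink(self, new_capacity):
        """Adapt to a smaller budget by uniformly subsampling the reservoir."""
        if new_capacity < len(self.sample):
            self.sample = self.rng.sample(self.sample, new_capacity)
        self.capacity = new_capacity

    def _bandwidth(self):
        """Normal-scale rule of thumb: h = 1.06 * std * n^(-1/5)."""
        n = len(self.sample)
        mean = sum(self.sample) / n
        var = sum((v - mean) ** 2 for v in self.sample) / max(n - 1, 1)
        std = math.sqrt(var)
        return 1.06 * std * n ** (-0.2) if std > 0 else 1.0

    def selectivity(self, lo, hi):
        """Estimated fraction of stream elements falling in [lo, hi]."""
        if not self.sample:
            return 0.0
        h = self._bandwidth()

        def cdf(z):
            # Standard normal cumulative distribution function.
            return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

        # Integrate each Gaussian kernel over [lo, hi] and average.
        total = sum(cdf((hi - v) / h) - cdf((lo - v) / h) for v in self.sample)
        return total / len(self.sample)


# Usage: a synthetic uniform stream on [0, 1).
data_rng = random.Random(42)
kde = ReservoirKDE(capacity=200, seed=1)
for _ in range(10_000):
    kde.update(data_rng.random())
estimate = kde.selectivity(0.25, 0.75)   # true selectivity is 0.5
```

The memory footprint depends only on `capacity`, not on the stream length, and `shrink` shows one simple way an estimator could give memory back on request. A uniform subsample of a uniform sample is still uniform, so updates can continue after shrinking.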


GI, Gesellschaft für Informatik, Bonn
ISBN 3-885794-6
