On deriving net change information from change logs - the DELTALAYER-algorithm
The management of change logs is crucial in several areas of information systems, such as data replication, data warehousing, and process management. One barrier to the (intelligent) use of these change logs is the potentially large amount of unnecessary and redundant data they contain. In particular, change logs often record changes that ultimately had no effect on the original data source (e.g., because they were overridden by subsequently applied change operations). Such inflated logs typically cause difficulties with respect to system performance, data quality, or change comparability. To address this problem, we introduce the DeltaLayer algorithm. It takes arbitrary change log information as input and produces a cleaned output that contains only the net change effects; i.e., the produced log only contains information about those changes that actually had an effect on the original source. We formally prove the minimality of our algorithm, and we show how it can be applied in different domains, e.g., the post-processing of differential snapshots in data warehouses or the analysis of conflicting changes in process management systems. Altogether, the ability to purge unnecessary information from change logs provides the basis for a more intelligent handling of these logs.
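The notion of a net change effect can be illustrated with a minimal sketch. It is not the DeltaLayer algorithm itself, whose details are not given here; it only assumes a simple key-value data source and a hypothetical log format of `(operation, key, value)` tuples with the operations `insert`, `update`, and `delete`. The function name `net_changes` is likewise an assumption made for illustration. The sketch replays the log on a copy of the source and keeps one entry per key whose final state differs from its original state.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical log entry: (operation, key, value).
# The value is ignored for "delete" operations.
Change = Tuple[str, str, Any]

_MISSING = object()  # sentinel for "key not present"


def net_changes(source: Dict[str, Any], log: List[Change]) -> List[Change]:
    """Return one change per key whose final state differs from the source."""
    # Replay the full log on a copy so the original source stays untouched.
    final = dict(source)
    for op, key, value in log:
        if op in ("insert", "update"):
            final[key] = value
        elif op == "delete":
            final.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")

    # Compare the original and final states key by key.
    result: List[Change] = []
    for key in sorted(set(source) | set(final)):
        before = source.get(key, _MISSING)
        after = final.get(key, _MISSING)
        if before is _MISSING:
            result.append(("insert", key, after))
        elif after is _MISSING:
            result.append(("delete", key, None))
        elif before != after:
            result.append(("update", key, after))
        # Otherwise the changes to this key cancelled out: no entry.
    return result


source = {"a": 1, "b": 2}
log = [
    ("update", "a", 5),
    ("update", "a", 1),    # restores the original value of "a"
    ("insert", "c", 3),
    ("delete", "c", None), # "c" never existed in the source
    ("delete", "b", None),
    ("insert", "d", 7),
    ("update", "d", 8),    # overrides the inserted value
]
cleaned = net_changes(source, log)
```

In this example, the seven logged operations reduce to two net changes: the deletion of `b` and the insertion of `d` with its final value. The changes to `a` and `c` cancel out and leave no trace in the cleaned log.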