Measuring the quality of approximated clusterings
Clustering has become an increasingly important task in modern application domains. In many areas, e.g. when clustering complex objects, in distributed clustering, or when clustering mobile objects, it is not possible to compute an “optimal” clustering for technical, security, or efficiency reasons. Recently, a lot of research has been done on efficiently computing approximated clusterings. Here, the crucial question is how much quality has to be sacrificed for the achieved gain in efficiency. In this paper, we present suitable quality measures that allow us to compare approximated clusterings with reference clusterings. We first introduce a quality measure for clusters based on the symmetric set difference. Using this distance function between single clusters, we introduce a quality measure based on the minimum weight perfect matching of sets for comparing partitioning clusterings, as well as a quality measure based on the degree-2 edit distance for comparing hierarchical clusterings.
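The two flat-clustering measures named above can be illustrated with a minimal sketch. It is not the paper's definition: the abstract does not give the exact normalization or the treatment of unmatched clusters, so the sketch assumes that the cluster distance is the symmetric set difference divided by the size of the union, that the smaller clustering is padded with empty clusters, and that the matching cost is averaged over the number of matched pairs. The function names `cluster_distance` and `clustering_distance` are hypothetical, and the matching is found by brute force over permutations, which is only practical for a handful of clusters.

```python
from itertools import permutations


def cluster_distance(a, b):
    """Normalized symmetric set difference |a ^ b| / |a | b|.

    Assumption: normalizing by the union size; two empty clusters have distance 0.
    """
    a, b = frozenset(a), frozenset(b)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


def clustering_distance(approx, reference):
    """Average cost of a minimum weight perfect matching between two clusterings.

    Assumption: the shorter clustering is padded with empty clusters, so an
    unmatched non-empty cluster costs 1. Brute force, for illustration only.
    """
    approx = [frozenset(c) for c in approx]
    reference = [frozenset(c) for c in reference]
    size = max(len(approx), len(reference))
    if size == 0:
        return 0.0
    approx += [frozenset()] * (size - len(approx))
    reference += [frozenset()] * (size - len(reference))
    # Try every assignment of approximated clusters to reference clusters
    # and keep the cheapest total cost.
    best = min(
        sum(cluster_distance(a, reference[j]) for a, j in zip(approx, perm))
        for perm in permutations(range(size))
    )
    return best / size


reference = [{1, 2, 3}, {4, 5, 6}]
approx = [{1, 2}, {3, 4, 5, 6}]
print(clustering_distance(approx, reference))
```

In this example object 3 is assigned to the wrong cluster. The cheapest matching pairs each approximated cluster with its closest reference cluster, giving costs 1/3 and 1/4 and an average of 7/24. For realistic cluster counts the brute-force search would be replaced by a polynomial-time assignment algorithm such as the Hungarian method.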