Design and implementation of a library metadata management framework and its application in fuzzy data deduplication and data reconciliation with authority data
We describe the application of a generic workflow management system to the problem of metadata processing in the library domain. We examine the requirements for such a framework and the real-world constraints acting on it. The design of the framework is laid out and illustrated by means of two example workflows: fuzzy data deduplication and data reconciliation with authority data. Fuzzy data deduplication is the process of finding similar items in record collections that cannot be matched by identifiers. Data reconciliation with authority data takes a data source and enriches the metadata of its records (mainly authors and subjects) with available authority data. Finally, the advantages and tradeoffs of the presented approach are discussed.
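To make the idea of fuzzy deduplication concrete, the following is a minimal sketch and not the framework's actual implementation. It assumes records are reduced to a title string keyed by a local identifier, and it uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure; the function names, the sample records, and the 0.9 threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized titles."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_duplicate_candidates(records, threshold=0.9):
    """Return (id_a, id_b, score) for every pair at or above the threshold.

    Compares all pairs, which is quadratic in the number of records;
    this is only meant to illustrate the matching step on a small sample.
    """
    pairs = []
    for (id_a, title_a), (id_b, title_b) in combinations(records.items(), 2):
        score = similarity(title_a, title_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 3)))
    return pairs


# Hypothetical sample records: identifiers differ, titles are near-identical.
records = {
    "a1": "Design and Implementation of a Library Metadata Framework",
    "b7": "Design and implementation of a library metadata framework.",
    "c3": "An Introduction to Cataloguing Rules",
}
candidates = find_duplicate_candidates(records)
```

Here the records `a1` and `b7` share no identifier but are reported as a candidate pair because their normalized titles match, while `c3` is left alone. A larger collection would typically need some way to limit the number of comparisons, which this sketch does not attempt.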