Gesellschaft für Informatik e.V.

Lecture Notes in Informatics


Informatik 2004, Informatik verbindet, Band 1, Beiträge der 34. Jahrestagung der Gesellschaft für Informatik e.V. (GI), Ulm, 20. - 24. September 2004 P-50, 259-263 (2004).

GI, Gesellschaft für Informatik, Bonn
2004


Editors

Peter Dadam, Manfred Reichert (eds.)


Copyright © GI, Gesellschaft für Informatik, Bonn

Contents

Assessing the quality of natural language text data

Daniel Sonntag

Abstract


We follow an empirical approach that moves from data quality toward text quality, in which the expectations of the consumer, whether human or machine, take centre stage. We aim to obtain numerical text quality statements, which must be interpreted separately with respect to the expectations of the user and to suitability for automatic natural language processing (NLP). We argue that, apart from text accessibility, only representational text quality metrics can currently be derived and computed automatically. Interestingly, text quality for NLP traces back to questions of text representation.


ISBN 3-88579-379-2

