Gesellschaft für Informatik e.V.

Lecture Notes in Informatics


Natural Language Processing and Information Systems, 8th International Conference on Applications of Natural Language to Information Systems, June 2003, Burg (Spreewald), Germany. P-29, 141-154 (2003).

GI, Gesellschaft für Informatik, Bonn
2003


Editors

Antje Düsterhöft (ed.), Bernhard Thalheim (ed.)


Copyright © GI, Gesellschaft für Informatik, Bonn

Contents

Approaches to feature selection for document categorization

Huaizhong Kou, Georges Gardarin and Karina Zeitouni

Abstract


One of the problems faced by document categorization is that the terms present in the collection of example documents are numerous. From the point of view of coherence between the models used in document categorization, we analyze the frameworks of both the k-nearest neighbor (k-NN) and naive Bayes (NB) categorization models, together with the feature selection problem. Two feature selection algorithms, CBA and IBA, are proposed. Empirical results obtained with k-NN and NB classifiers show that coherence between the models in a categorization system can improve performance.



ISBN 3-88579-358-X

