The Influence of Preprocessing Parameters on Text Categorization

Authors

POMIKÁLEK Jan, ŘEHŮŘEK Radim

Year of publication 2007
Type Article in Periodical
Magazine / Source International Journal of Applied Science, Engineering and Technology
MU Faculty or unit

Faculty of Informatics

Web http://www.waset.org/pwaset/v19/v19-82.pdf
Field Informatics
Keywords machine learning; text categorization; preprocessing; feature selection
Description Results of a large-scale study on the mutual influence of preprocessing parameters in automated text categorization are presented and analyzed. These parameters include the choice of tokenizer, feature selection, stemming, term weighting, and the amount of data, in combination with various machine learning algorithms.