KernelTagger – a PoS Tagger for Very Small Amount of Training Data

Year of publication 2017
Type Article in Proceedings
Conference Proceedings of the Eleventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2017
MU Faculty or unit Faculty of Informatics
Field Informatics
Keywords PoS tagging; morphological tagging; language model; Czech
Description The paper describes a new part-of-speech (PoS) tagger that can learn a PoS tagging language model from a very short annotated text with the help of a much larger non-annotated text. Training on only several sentences is enough to achieve much better accuracy than a baseline. The results are not comparable to those of state-of-the-art taggers, but the tagger could be used for pre-annotation during the annotation process.