A Distributional Multi-word Thesaurus in Sketch Engine

Year of publication 2019
Type Article in Proceedings
Conference Proceedings of the Thirteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2019
Faculty of Informatics

web https://nlp.fi.muni.cz/raslan/2019/paper17-jakubicek.pdf
Keywords text corpus; Sketch Engine; MWE; multi-word expressions; thesaurus
Description In this paper we present an extension of the current distributional thesaurus as available in the Sketch Engine corpus management system towards multi-word units. We explain how multi-word sketches are used to generate multi-word unit candidates, thus preserving access to the underlying corpus texts. Finally, we present sample results on the British National Corpus and discuss future development as well as difficulties in evaluation.
