Authorship Verification based on Syntax Features

Warning

This publication does not belong to the Faculty of Economics and Administration but to the Faculty of Informatics. The official publication page is on the muni.cz website.
Authors

RYGL Jan ZEMKOVÁ Kristýna KOVÁŘ Vojtěch

Year of publication 2012
Type Article in proceedings
Conference Proceedings of the Sixth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2012
MU Faculty or unit

Faculty of Informatics

Field Linguistics
Keywords authorship verification; syntactic analysis; SET; machine learning
Description Authorship verification is a widely discussed topic these days. In the authorship verification problem, we are given examples of an author's writing and are asked to determine whether given texts were written by this author. In this paper we present an algorithm that uses the syntactic analysis system SET to verify the authorship of documents. We propose three variants of a two-class machine learning approach to authorship verification. Syntactic features are used as attributes in the suggested algorithms, and their performance is compared to established word-length distribution features. Results indicate that syntactic features provide enough information to improve the accuracy of authorship verification algorithms.
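
The word-length distribution baseline mentioned in the description can be illustrated with a minimal sketch. This is not the paper's implementation: the bin cap `MAX_LEN`, the function names, and the idea of feeding per-bin differences of two texts to a same-author/different-author classifier are all assumptions made here for illustration.

```python
import re
from collections import Counter

MAX_LEN = 10  # assumed cap: words longer than this share the last bin


def word_length_distribution(text, max_len=MAX_LEN):
    """Return relative frequencies of word lengths 1..max_len."""
    words = re.findall(r"\w+", text)
    total = len(words)
    if total == 0:
        return [0.0] * max_len
    counts = Counter(min(len(w), max_len) for w in words)
    return [counts[n] / total for n in range(1, max_len + 1)]


def pair_features(text_a, text_b):
    """Absolute per-bin difference of two distributions.

    Hypothetical input vector for a two-class (same author vs.
    different author) classifier; the paper may build its
    attributes differently.
    """
    dist_a = word_length_distribution(text_a)
    dist_b = word_length_distribution(text_b)
    return [abs(x - y) for x, y in zip(dist_a, dist_b)]
```

A vector of zeros from `pair_features` means the two texts have identical word-length profiles; larger values mark the length bins where the texts differ most.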