Enlargement of the Czech Question-Answering Dataset to SQAD v2.0





Year of publication 2017
Type Article in Proceedings
Conference Proceedings of the Eleventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2017
MU Faculty or unit Faculty of Informatics

web http://raslan2017.nlp-consulting.net/proceedings
Field Informatics
Keywords question answering; QA dataset; SQAD
Description In this paper, we present SQAD v2.0 (Simple Question Answering Database), the second version of our Czech question-answering dataset. The new version is a large extension of the original SQAD database. In the current release, the dataset contains nearly 9,000 question-answer pairs, complemented with manual annotation of question and answer types. All texts in the dataset (the source documents, the questions, and the respective answers) are provided with complete morphological annotation in a plain-text format. We offer detailed statistics of the SQAD v2.0 dataset based on the new QA annotation.
