Project information
LINDAT/CLARIAH-CZ - Digital Research Infrastructure for Language Technologies, Arts and Humanities
(LINDAT/CLARIAH-CZ)
- Project Identification
- LM2018101 (CEP code: LM2018101)
- Project Period
- 1/2019 - 12/2022
- Investor / Programme / Project type
-
Ministry of Education, Youth and Sports of the CR
- Large Infrastructures for Research, Development and Innovation
- MU Faculty or unit
- Faculty of Arts
- Other MU Faculty/Unit
-
Faculty of Informatics
- doc. Mgr. Pavel Rychlý, Ph.D.
- Mgr. Krištof Anetta
- Mgr. Jan Bušta
- Mgr. Michal Hala
- RNDr. Ondřej Herman
- Bc. Alchemy Hríbik
- RNDr. Miloš Jakubíček, Ph.D.
- RNDr. Vojtěch Kovář, Ph.D.
- RNDr. Marek Medveď, Ph.D.
- RNDr. Zuzana Nevěřilová, Ph.D.
- Mgr. Jitka Nováčková
- prof. PhDr. Karel Pala, CSc.
- RNDr. Adam Rambousek, Ph.D.
- Mgr. Radoslav Sabol
- RNDr. Vít Suchomel, Ph.D.
- RNDr. Pavel Šmerk, Ph.D.
- Bc. Tomáš Vondrák
- Project Website
- https://digitalia.phil.muni.cz/en
- Keywords
- digital humanities
- Cooperating Organization
-
Institute of Philosophy of the ASCR, v. v. i.
Institute of History of the ASCR, v. v. i.
Library of the ASCR, v. v. i.
Institute of the Czech Language of the ASCR, v. v. i.
The Moravian Library Brno
National Gallery Prague
National Library of the CR
Charles University
- Responsible person prof. RNDr. Jan Hajič, Dr.
National Cinematographic Archive
The LINDAT/CLARIAH‐CZ Research Infrastructure (RI) is planned as a new addition to the Czech Republic’s Research Infrastructure Roadmap, serving as a national node of the pan‐European DARIAH‐EU network. It will bring the key institutions in the Czech Republic into the European Digital Humanities landscape, and it is foreseen that the Czech Republic will become a full member of DARIAH ERIC, the governing body of the network, which was formally established on Aug. 15, 2014. The LINDAT/CLARIAH‐CZ RI is currently in its preparatory phase and will enter its construction phase at the beginning of the funding period, i.e. at the beginning of 2019. Based on previous experience with similar SSH RIs, such as CLARIN or other DARIAH nodes in Europe, the construction phase is expected to last about two years (until the end of 2021). The last year of the construction phase (2021), when the planned equipment will be installed, will be devoted to testing and the gradual opening of services to the public. Starting in 2022, LINDAT/CLARIAH‐CZ will enter its operational phase; data collection and development of services will continue throughout.
Sustainable Development Goals
Masaryk University is committed to the UN Sustainable Development Goals, which aim to improve the conditions and quality of life on our planet by 2030.
Publications
Total number of publications: 54
2021
-
Website Properties in Relation to the Quality of Text Extracted for Web Corpora
Recent Advances in Slavonic Natural Language Processing (RASLAN 2021), year: 2021
-
When Tesseract Brings Friends: Layout Analysis, Language Identification, and Super-Resolution in the Optical Character Recognition of Medieval Texts
Recent Advances in Slavonic Natural Language Processing (RASLAN 2021), year: 2021
-
When Word Pairs Matter - Analysis of the English-Slovak Evaluation Dataset
Proceedings of the Fifteenth Workshop on Recent Advances in Slavonic Natural Languages Processing, RASLAN 2021, year: 2021
-
Who is Selling to Whom – Feature Evaluation for Multi-block Classification in Invoice Information Extraction
SPECOM 2021: 23rd International Conference on Speech and Computer, year: 2021
2020
-
Current Challenges in Web Corpus Building
Proceedings of the 12th Web as Corpus Workshop, year: 2020
-
Data Mining from Free-Text Health Records : State of the Art, New Polish Corpus
Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, year: 2020
-
Digging for unspecified information requirements : a case study of Digital Library of Arne Novák users
Year: 2020, type: Appeared in Conference without Proceedings
-
Digitální data perspektivou humanitního vědce [Digital Data from the Perspective of a Humanities Scholar]
Year: 2020, type: Conference
-
Multilingual Recognition of Temporal Expressions
Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, year: 2020
-
Removing Spam from Web Corpora Through Supervised Learning and Semi-manual Classification of Web Sites
Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020, year: 2020