European Union Language Resources in Sketch Engine
Authors | |
---|---|
Year of publication | 2016 |
Type | Article in Proceedings |
Conference | Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) |
MU Faculty or unit | |
Citation | |
Web | http://www.lrec-conf.org/proceedings/lrec2016/pdf/572_Paper.pdf |
Field | Informatics |
Keywords | JRC-Acquis; DCEP; DGT-TM; Europarl; EUR-Lex; Sketch Engine; parallel corpus; word sketch; parallel concordance |
Description | Several parallel corpora built from European Union language resources are presented here. They were processed with state-of-the-art tools and made available to researchers in the Sketch Engine corpus management system. A completely new resource is introduced: the EUR-Lex corpus, one of the largest parallel corpora available at the moment. It contains 840 million English tokens, and its largest language pair (English-French) has more than 25 million aligned segments (paragraphs). |