Slovenské vzory dělení: čas pro změnu?
Title in English | Slovak Hyphenation: a Time for Change? |
---|---|
Authors | |
Year of publication | 2004 |
Type | Article in Proceedings |
Conference | SLT 2004, sborník 4. ročníku semináře o Linuxu a TeXu (proceedings of the 4th annual seminar on Linux and TeX) |
MU Faculty or unit | |
Citation | |
Web | |
Field | Use of computers, robotics and its application |
Keywords | Slovak hyphenation; electronic publishing; segmentation; stratification; bootstrapping |
Description | Hyphenation, or more generally the algorithmic segmentation of a large wordlist in a given language, is a common problem. For Slovak, the only patterns available are based on the syllable principle and do not cover many exceptions. From a wordlist of a million collected words, we used the PatGen program to generate new, freely available patterns that fill this gap. The result is directly usable not only in TeX distributions but also in other systems, such as OpenOffice.org. The techniques of bootstrapping, stratification, and pattern generation are useful for a wide range of segmentation tasks. |