How Many Dots Are Really Needed for Head-Driven Chart Parsing?

Authors	KADLEC Vladimír, SMRŽ Pavel
Year of publication	2006
Type	Article in Periodical
Magazine / Source	Lecture Notes in Artificial Intelligence
MU Faculty or unit	Faculty of Informatics
Field	Informatics
Keywords	NLP; CFG; parsing
Description	This paper presents an improved form of head-driven chart parser suited to large context-free grammars. The basic method, HDddm (Head-Driven dependent dot move), is introduced first. Two variants then improve on it, both based on the same idea: reducing the number of chart edges by modifying the form of the items (dotted rules). The first variant "unifies" the items that share the analyzed part of the relevant rule (thus, only one dot is needed to mark the position before and after the covered part). The second applies the inverse strategy: it "eliminates" the parts that have not been covered yet (no dot is needed). All the discussed alternatives are described in the form of parsing schemata. We also briefly mention a technique, employing a special trie-like data structure originally developed for Scrabble, that minimizes the extra information needed by the algorithms. We demonstrate the advantages of the described methods by significant decreases in the number of chart edges. Results are given for the standard set of testing grammars (and their respective inputs) as well as for a large and highly ambiguous Czech grammar.
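
The idea of items sharing their analyzed part can be illustrated with a small Python sketch. It is not the paper's parsing schema: the toy grammar, the head positions, and the function names are all assumptions made for illustration, and it counts dotted-rule items only, not chart edges over an input. It enumerates items whose covered part contains the head (one dot on each side of that part), then groups them by the covered symbol sequence alone, which is one possible reading of how items that share the analyzed part could be merged.

```python
from collections import defaultdict

# Hypothetical toy grammar: (lhs, rhs, index of the head symbol within rhs).
RULES = [
    ("S", ("NP", "VP"), 1),
    ("VP", ("V", "NP"), 0),
    ("VP", ("V", "NP", "PP"), 0),
    ("VP", ("V", "PP"), 0),
]


def two_dot_items(rules):
    """Enumerate items (rule_id, left, right) whose covered part
    rhs[left:right] contains the head symbol of the rule."""
    items = []
    for rule_id, (_lhs, rhs, head) in enumerate(rules):
        for left in range(head + 1):
            for right in range(head + 1, len(rhs) + 1):
                items.append((rule_id, left, right))
    return items


def group_by_covered_part(rules, items):
    """Group items by the covered symbol sequence, ignoring which rule
    they came from (an assumed simplification, not the paper's definition)."""
    groups = defaultdict(list)
    for rule_id, left, right in items:
        covered = rules[rule_id][1][left:right]
        groups[covered].append((rule_id, left, right))
    return groups


items = two_dot_items(RULES)
groups = group_by_covered_part(RULES, items)
print(len(items), "two-dot items,", len(groups), "distinct covered parts")
```

In this toy grammar the three VP rules all start from the same head V, so their items overlap heavily once only the covered part is kept. The actual item forms, and the trie-like structure used to store the extra information, are described in the paper itself.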
Related projects:	Translation of Czech Sentences to Transparent Intensional Logic Constructions