Abstract
Tree transducers that model expressive linguistic phenomena often require word-alignments and a heuristic rule extractor to induce their grammars. However, when the corpus of tree/string pairs is small compared to the size of the vocabulary or the complexity of the grammar, word-alignments are unreliable. We propose a general rule extraction algorithm that uses cost functions over tree fragments, and formulate the extraction as a cost minimization problem. As a by-product, we are able to introduce back-off states at which some cost functions generate right-hand-sides of previously unseen left-hand-sides, thus creating transducer rules “on-the-fly”. We test the generalization power of our induced tree transducers on a QA task over a large Knowledge Base, obtaining a reasonable syntactic accuracy and effectively overcoming the typical lack of rule coverage.
Cite
CITATION STYLE
Martínez-Gómez, P., & Miyao, Y. (2016). Rule extraction for tree-to-tree transducers by cost minimization. In EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 12–22). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d16-1002
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.