We describe results of a novel algorithm for grammar induction from a large corpus. The ADIOS (Automatic DIstillation of Structure) algorithm searches for significant patterns, chosen according to context dependent statistical criteria, and builds a hierarchy of such patterns according to a set of rules leading to structured generalization. The corpus is thus generalized into a context free grammar (CFG), composed of patterns, equivalence classes and words of the initial lexicon. We have evaluated our method both on corpora generated by CFG and on natural language ones. The performance of ADIOS is judged by searching for both good recall (acceptance of correct novel sentences) and good precision (production of correct novel sentences). The results are very encouraging.
Mendeley saves you time finding and organizing research
There are no full text links
Choose a citation style from the tabs below