Learning to parse on aligned corpora (Rough diamond)

Cezary Kaliszyk; Josef Urban; Jiří Vyskočil

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9236 227-233

DOI: 10.1007/978-3-319-22102-1_15

14 Citations

3 Readers

Abstract

One of the first big hurdles mathematicians face when they consider writing formal proofs is the need to become familiar with the formal terminology and parsing mechanisms used in the large interactive theorem proving (ITP) libraries. This includes the large number of formal symbols, the grammar of the formal languages, and the advanced mechanisms that let proof assistants correctly understand formal expressions in the presence of ubiquitous overloading. In this work we start to address this problem by developing approximate probabilistic parsing techniques that autonomously train disambiguation on large corpora. Unlike in standard natural language processing, we can filter the resulting parse trees with strong semantic methods from ITP and automated reasoning (AR), such as typechecking and automated theorem proving, and even let the probabilistic methods self-improve based on such semantic feedback. We describe the general motivation and our first experiments, and build an online system for parsing ambiguous formulas over the Flyspeck library.
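The idea of ranking ambiguous readings by learned probabilities and then discarding the ill-typed ones can be illustrated with a toy sketch. This is not the authors' system. It swaps in a plain relative-frequency model for the paper's probabilistic parser, and every symbol name, type, and corpus entry below is a hypothetical stand-in chosen for illustration only.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical aligned corpus: (informal token, formal symbol) pairs.
ALIGNED_PAIRS = [
    ("+", "real_add"), ("+", "real_add"), ("+", "real_add"), ("+", "vector_add"),
    ("0", "real_zero"), ("0", "real_zero"), ("0", "vector_zero"),
    ("v", "vector_var"),
]

# Hypothetical toy type information: each binary operator takes two
# arguments of one type; each constant or variable has one type.
ARG_TYPE = {"real_add": "real", "vector_add": "vector"}
CONST_TYPE = {"real_zero": "real", "vector_zero": "vector", "vector_var": "vector"}


def train(pairs):
    """Estimate P(formal symbol | informal token) by relative frequency."""
    counts = defaultdict(Counter)
    for surface, formal in pairs:
        counts[surface][formal] += 1
    return {
        surface: {formal: n / sum(c.values()) for formal, n in c.items()}
        for surface, c in counts.items()
    }


def typechecks(left, op, right):
    """Toy semantic filter: both arguments must match the operator's type."""
    return CONST_TYPE[left] == ARG_TYPE[op] == CONST_TYPE[right]


def parse(tokens, model):
    """Enumerate readings of 'left op right', keep well-typed ones, rank by score."""
    left, op, right = tokens
    candidates = []
    for l, o, r in product(model[left], model[op], model[right]):
        if typechecks(l, o, r):
            score = model[left][l] * model[op][o] * model[right][r]
            candidates.append((score, (o, l, r)))
    return sorted(candidates, reverse=True)


model = train(ALIGNED_PAIRS)
for score, reading in parse(["v", "+", "0"], model):
    print(f"{score:.3f}", reading)
```

In this toy corpus the most frequent reading of `+` is `real_add`, so a purely statistical ranking of `v + 0` would prefer an ill-typed term. The type filter leaves only the `vector_add` reading, which is the kind of semantic pruning the abstract describes.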

Citation (APA)

Kaliszyk, C., Urban, J., & Vyskočil, J. (2015). Learning to parse on aligned corpora (Rough diamond). In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9236, pp. 227–233). Springer Verlag. https://doi.org/10.1007/978-3-319-22102-1_15
