We examine the extent to which, in principle, different syntactic and semantic graph representations can complement and improve neural language modeling. Specifically, by conditioning on a subgraph encapsulating the locally relevant sentence history, can a model make better next-word predictions than a pretrained sequential language model alone? With an ensemble setup consisting of GPT-2 and ground-truth graphs from one of 7 different formalisms, we find that the graph information indeed improves perplexity and other metrics. Moreover, this architecture provides a new way to compare different frameworks of linguistic representation. In our oracle graph setup, training and evaluating on English WSJ, semantic constituency structures prove most useful to language modeling performance, outpacing syntactic constituency structures as well as syntactic and semantic dependency structures.
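To make the ensemble idea concrete, the sketch below shows one generic way to mix a sequential LM's next-word distribution with a graph-conditioned distribution via a learned gate. This is an illustrative placeholder under assumed inputs (a pooled hidden state from a frozen LM such as GPT-2 and a pooled encoding of the history subgraph), not the authors' implementation; all module and dimension names are hypothetical.

```python
import torch
import torch.nn as nn


class GraphAugmentedLM(nn.Module):
    """Toy ensemble: interpolate a sequential-LM head and a graph-conditioned
    head with a scalar gate. Placeholder components, not the paper's code."""

    def __init__(self, hidden_dim: int, graph_dim: int, vocab_size: int):
        super().__init__()
        self.lm_head = nn.Linear(hidden_dim, vocab_size)    # stands in for the pretrained LM's output layer
        self.graph_proj = nn.Linear(graph_dim, hidden_dim)  # projects the subgraph encoding (placeholder)
        self.graph_head = nn.Linear(hidden_dim, vocab_size)
        self.gate = nn.Linear(hidden_dim + graph_dim, 1)    # decides how much weight the graph gets

    def forward(self, seq_hidden: torch.Tensor, graph_repr: torch.Tensor) -> torch.Tensor:
        # seq_hidden: (batch, hidden_dim) final LM state for the prefix
        # graph_repr: (batch, graph_dim) pooled encoding of the subgraph
        #             covering the locally relevant sentence history
        log_p_lm = torch.log_softmax(self.lm_head(seq_hidden), dim=-1)
        log_p_graph = torch.log_softmax(self.graph_head(self.graph_proj(graph_repr)), dim=-1)
        lam = torch.sigmoid(self.gate(torch.cat([seq_hidden, graph_repr], dim=-1)))
        # Probability-level interpolation, computed in log space for stability
        mixed = torch.logsumexp(
            torch.stack([log_p_lm + torch.log(lam + 1e-9),
                         log_p_graph + torch.log(1.0 - lam + 1e-9)]),
            dim=0,
        )
        return mixed  # (batch, vocab_size) log-probabilities over the next word


if __name__ == "__main__":
    model = GraphAugmentedLM(hidden_dim=768, graph_dim=128, vocab_size=50257)
    seq_hidden = torch.randn(2, 768)   # e.g. from a frozen GPT-2
    graph_repr = torch.randn(2, 128)   # e.g. from a graph encoder over the oracle graph
    log_probs = model(seq_hidden, graph_repr)
    print(log_probs.shape)             # torch.Size([2, 50257])
```

Perplexity comparisons like those in the abstract would then reduce to evaluating such mixed log-probabilities on held-out text, once with and once without the graph-conditioned component.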
Citation
Prange, J., Schneider, N., & Kong, L. (2022). Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2022) (pp. 4375–4391). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.naacl-main.325