Measuring language complexity using word embeddings

Peter A. Whigham; Mansi Chugh; Grant Dick

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018), Vol. 11320 LNAI, pp. 843–854

DOI: 10.1007/978-3-030-03991-2_76

Abstract

The analysis of word patterns in a corpus has previously been examined using a number of different word embedding models. These models create a numeric representation of word co-occurrence and are able to capture some of the syntactic and semantic relationships between words in a document. Language complexity has been assessed for many years using simple indexes and basic statistical properties (word frequency, etc.); however, little work has been done on using word embeddings to develop language complexity measures. This paper describes preliminary work on measuring language complexity using clustered word embeddings to produce network transition models. The structural measures of these transition networks are shown to represent basic properties of language complexity and may be used to infer some aspects of the underlying generative grammar.
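The pipeline outlined in the abstract (cluster the word embeddings, then treat the clusters as nodes of a transition network) can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D vectors, the token sequence, the plain k-means routine, the starting centroids, and the choice of edge density as the structural measure are all hypothetical stand-ins chosen only to make the idea concrete.

```python
from collections import Counter

def nearest(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(vectors, centroids, iterations=10):
    """Plain k-means from given starting centroids; returns the final centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        # Recompute each centroid as the mean of its group (keep it if the group is empty).
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def transition_network(tokens, embeddings, centroids):
    """Count directed transitions between the clusters of consecutive tokens."""
    labels = [nearest(embeddings[t], centroids) for t in tokens]
    return Counter(zip(labels, labels[1:]))

def density(edges, k):
    """Fraction of the k*k possible directed edges (self-loops included) that occur."""
    return len(edges) / (k * k)

# Toy, hand-made 2-D "embeddings" (hypothetical values, not trained on any corpus).
EMBEDDINGS = {
    "the": [0.0, 0.1], "a": [0.1, 0.0],
    "cat": [1.0, 1.0], "dog": [1.1, 0.9],
    "runs": [2.0, 0.0], "sleeps": [2.1, 0.1],
}
TOKENS = "the cat runs a dog sleeps the dog runs".split()

# Assumed starting centroids: one word from each intended group.
start = [EMBEDDINGS["the"], EMBEDDINGS["cat"], EMBEDDINGS["runs"]]
CENTROIDS = kmeans(list(EMBEDDINGS.values()), start)
EDGES = transition_network(TOKENS, EMBEDDINGS, CENTROIDS)
DENSITY = density(EDGES, len(CENTROIDS))
```

In this toy setting the three clusters behave like rough word classes, and a rigid word order yields a sparse network; a text with freer ordering would populate more of the possible edges. In practice one would presumably start from trained embeddings and a library clustering routine, and consider several structural measures rather than density alone.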

Citation (APA)

Whigham, P. A., Chugh, M., & Dick, G. (2018). Measuring language complexity using word embeddings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11320 LNAI, pp. 843–854). Springer Verlag. https://doi.org/10.1007/978-3-030-03991-2_76
