The aim of vocabulary inventory prediction is to predict a learner's whole vocabulary from a limited sample of query words. This paper approaches the problem starting from the 2-parameter Item Response Theory (IRT) model, which gives each word in the vocabulary a difficulty parameter and a discrimination parameter. The discrimination parameter is first evaluated on the sub-problem of question item selection, familiar from the fields of Computerised Adaptive Testing (CAT) and active learning. Next, the effect of the discrimination parameter on prediction performance is examined in both a binary classification setting and an information retrieval setting. Performance is compared with baselines based on word frequency. Several generalisation scenarios are examined, including using a predictor network over word embeddings to generalise word difficulty and discrimination, and testing on out-of-dataset data.
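As an illustration of the model named above, the following is a minimal sketch of the standard 2-parameter logistic IRT response function, together with Fisher-information-based item selection as commonly used in CAT. The abstract does not state the paper's exact parameterisation or selection criterion, so the function names, the example words, and their parameter values are all assumptions made for illustration only.

```python
import math


def p_known(theta, difficulty, discrimination):
    """Standard 2PL response function: probability that a learner with
    ability theta knows a word with the given difficulty and discrimination."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def fisher_information(theta, difficulty, discrimination):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = p_known(theta, difficulty, discrimination)
    return discrimination ** 2 * p * (1.0 - p)


# Hypothetical words with made-up (difficulty, discrimination) values;
# these are NOT taken from the paper.
words = {
    "house": (-2.0, 1.0),
    "gather": (0.0, 0.5),
    "obtain": (0.2, 2.0),
    "ubiquitous": (2.5, 1.5),
}


def most_informative_word(theta, items):
    """Pick the word whose response is most informative at ability theta."""
    return max(items, key=lambda w: fisher_information(theta, *items[w]))


# Choose the next query word for a learner whose current ability estimate is 0.
next_word = most_informative_word(0.0, words)
print(next_word)
```

In this sketch a highly discriminating word close to the learner's estimated ability is preferred over a word that sits exactly at that ability but discriminates weakly, which is the kind of effect the discrimination parameter is meant to capture in item selection.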
CITATION STYLE
Robertson, F. (2021). Word Discriminations for Vocabulary Inventory Prediction. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 1188–1195). Incoma Ltd. https://doi.org/10.26615/978-954-452-072-4_134