Language geometry using random indexing

Abstract

Random Indexing is a simple implementation of Random Projections with a wide range of applications. It can solve a variety of problems with good accuracy without introducing much complexity. Here we demonstrate its use for identifying the language of text samples, based on a novel method of encoding letter N-grams into high-dimensional Language Vectors. Further, we show that the method is easily implemented and requires little computational power and space. As proof of the method’s statistical validity, we show its success in a language-recognition task. On a difficult data set of 21,000 short sentences from 21 different languages, we achieve 97.4% accuracy, comparable to state-of-the-art methods.
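The abstract does not spell out the encoding, so the following is only a minimal sketch of one way letter N-grams could be turned into a language vector and compared. The specific choices here are assumptions for illustration, not details confirmed by the text above: random bipolar (+1/-1) letter vectors, rotation plus elementwise multiplication to bind the letters of an N-gram, summation over all N-grams of a text, and cosine similarity for classification. The dimensionality `DIM`, the alphabet, and all function names are hypothetical.

```python
import math
import random

DIM = 2000                                   # assumed dimensionality (illustrative)
N = 3                                        # assumed N-gram size (trigrams)
ALPHABET = "abcdefghijklmnopqrstuvwxyz "     # assumed alphabet: a-z plus space


def make_letter_vectors(alphabet=ALPHABET, dim=DIM, seed=0):
    """Assign each letter a fixed random vector of +1/-1 entries."""
    rng = random.Random(seed)
    return {ch: [rng.choice((-1, 1)) for _ in range(dim)] for ch in alphabet}


def rotate(vec, k):
    """Cyclically shift a vector right by k positions (a simple permutation)."""
    k %= len(vec)
    return vec[-k:] + vec[:-k] if k else list(vec)


def encode_ngram(ngram, letters):
    """Bind the letters of one N-gram: rotate each letter vector by its
    distance from the end of the N-gram, then multiply elementwise.
    The rotation makes the result depend on letter order."""
    n = len(ngram)
    dim = len(next(iter(letters.values())))
    out = [1] * dim
    for i, ch in enumerate(ngram):
        shifted = rotate(letters[ch], n - 1 - i)
        out = [a * b for a, b in zip(out, shifted)]
    return out


def encode_text(text, letters, n=N):
    """Sum the vectors of all N-grams in the text into one text vector.
    Characters outside the alphabet are dropped (an assumption)."""
    text = "".join(ch for ch in text.lower() if ch in letters)
    dim = len(next(iter(letters.values())))
    total = [0] * dim
    for i in range(len(text) - n + 1):
        gram = encode_ngram(text[i:i + n], letters)
        total = [t + g for t, g in zip(total, gram)]
    return total


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(text, language_vectors, letters, n=N):
    """Return the label whose language vector is most similar to the text."""
    query = encode_text(text, letters, n)
    return max(language_vectors,
               key=lambda lang: cosine(query, language_vectors[lang]))
```

In this sketch a language vector would be built by calling `encode_text` on a training corpus for each language, and an unknown sentence would be labeled with `classify`. Storage stays at one `DIM`-sized vector per language regardless of how many distinct N-grams occur, which is in keeping with the abstract's point about modest computational and space requirements.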

Citation (APA)

Joshi, A., Halseth, J. T., & Kanerva, P. (2017). Language geometry using random indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10106 LNCS, pp. 265–274). Springer Verlag. https://doi.org/10.1007/978-3-319-52289-0_21
