Abstract
Vector space models (VSMs) are mathematically well-defined frameworks that have been widely used in distributional approaches to semantics. In VSMs, high-dimensional vectors represent linguistic entities. In an application, the similarity of vectors, and thus of the entities that they represent, is computed by a distance formula. The high dimensionality of the vectors, however, is a barrier to the performance of methods that employ VSMs. Consequently, a dimensionality reduction technique is employed to alleviate this problem. This paper introduces a novel technique called Random Manhattan Indexing (RMI) for the construction of ℓ1-normed VSMs at reduced dimensionality. RMI combines the construction of a VSM and dimension reduction into an incremental, and thus scalable, two-step procedure. To this end, RMI employs sparse Cauchy random projections. We further introduce Random Manhattan Integer Indexing (RMII), a computationally enhanced version of RMI. As shown in the reported experiments, RMI and RMII can be used reliably to estimate the ℓ1 distances between vectors in a vector space of low dimensionality.
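The general idea of estimating ℓ1 distances from Cauchy random projections can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses dense standard Cauchy entries rather than the sparse projections named in the abstract, and it uses the sample median of absolute differences as the distance estimator. The function names (`cauchy_index_vector`, `accumulate`, `project`, `estimate_l1`) and the parameters `k` and `seed` are hypothetical.

```python
import math
import random
from statistics import median


def cauchy_index_vector(context, k, seed=0):
    """Return a reproducible k-dimensional vector of standard Cauchy samples.

    Assumption: dense entries, seeded by the context label, so the same
    context always maps to the same index vector.
    """
    rng = random.Random(f"{seed}:{context}")
    # tan(pi * (U - 0.5)) with U ~ Uniform(0, 1) follows a standard Cauchy law.
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(k)]


def accumulate(vector, context, weight, k, seed=0):
    """Incremental step: add weight * index_vector(context) to vector in place."""
    index_vector = cauchy_index_vector(context, k, seed)
    for i in range(k):
        vector[i] += weight * index_vector[i]


def project(counts, k, seed=0):
    """Build the reduced vector of one entity from its context counts."""
    vector = [0.0] * k
    for context, weight in counts.items():
        accumulate(vector, context, weight, k, seed)
    return vector


def estimate_l1(u, v):
    """Estimate the original l1 distance from two projected vectors.

    Each coordinate difference is Cauchy distributed with scale equal to the
    true l1 distance, and the median of |Cauchy(0, d)| is d.
    """
    return median(abs(a - b) for a, b in zip(u, v))


def true_l1(counts_a, counts_b):
    """Exact l1 distance between two sparse count vectors, for comparison."""
    keys = set(counts_a) | set(counts_b)
    return sum(abs(counts_a.get(key, 0) - counts_b.get(key, 0)) for key in keys)


if __name__ == "__main__":
    a = {"bank": 3, "river": 1, "money": 4}
    b = {"bank": 1, "loan": 2, "money": 5}
    k = 4000
    print("exact:", true_l1(a, b))
    print("estimate:", estimate_l1(project(a, k), project(b, k)))
```

Because `accumulate` only ever adds to an existing vector, new co-occurrence observations can be folded in one at a time without rebuilding anything, which is the sense in which such a construction is incremental. The estimate becomes more accurate as `k` grows.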
Cite
Zadeh, B. Q., & Handschuh, S. (2014). Random Manhattan integer indexing: Incremental L1 normed vector space construction. In EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference (pp. 1713–1723). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/d14-1178