Distributed representations of words (i.e., word embeddings) have proven helpful in solving natural language processing (NLP) tasks. Training distributed representations of words with neural networks has lately been a major focus of researchers in the field. Recent work on word embedding, the Continuous Bag-of-Words (CBOW) model and the Continuous Skip-gram (Skip-gram) model, has produced particularly impressive results, significantly speeding up the training process to enable word representation learning from large-scale data. However, neither CBOW nor Skip-gram pays enough attention to word proximity from the modeling perspective or to word ambiguity from the linguistic perspective. In this paper, we propose Proximity-Ambiguity Sensitive (PAS) models (i.e., PAS CBOW and PAS Skip-gram) that produce high-quality distributed representations of words by considering both word proximity and ambiguity. From the modeling perspective, we introduce proximity weights as parameters to be learned in PAS CBOW and used in PAS Skip-gram. By better modeling word proximity, we reveal the strength of pooling-structured neural networks in word representation learning. The proximity-sensitive pooling layer can also be applied to other neural network applications that employ pooling layers. From the linguistic perspective, we train multiple representation vectors per word, each corresponding to a particular group of the word's POS tags. Using PAS models, we achieved a 16.9% increase in accuracy over state-of-the-art models.
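The abstract does not spell out the exact form of the proximity weights, so the following is only a rough illustrative sketch: it replaces CBOW's uniform averaging of context vectors with a learned per-offset weight vector, softmax-normalized here purely as an assumption. The function and variable names (`proximity_weighted_context`, `proximity_logits`) are hypothetical, not taken from the paper.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def proximity_weighted_context(context_vecs, proximity_logits):
    """Pool context word vectors into one hidden vector.

    Plain CBOW averages the context vectors uniformly; here each
    context position (offset from the target word) gets a learned
    proximity weight, so nearer words can contribute more.
    """
    weights = softmax(proximity_logits)        # one weight per offset
    return weights @ context_vecs              # weighted sum, shape (dim,)

rng = np.random.default_rng(0)
dim, window = 8, 2                             # toy sizes
offsets = 2 * window                           # positions: -2, -1, +1, +2
context_vecs = rng.normal(size=(offsets, dim)) # stand-in context vectors
proximity_logits = rng.normal(size=offsets)    # learned jointly with vectors

h = proximity_weighted_context(context_vecs, proximity_logits)
print(h.shape)  # (8,) -- fed to the output layer as in plain CBOW
```

In this reading, the proximity weights simply reparameterize the pooling layer, which is why the same idea would transfer to other pooling-based networks as the abstract claims.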