Abstract
The recurrent neural network (RNN) is recognized as a powerful language model (LM). We take a closer look at its performance portfolio: it performs well on frequent grammatical patterns but much less so on infrequent terms. Such a portfolio is expected and desirable in applications like autocomplete, but it is less useful in social content analysis, where many creative, unexpected usages occur (e.g., URL insertion). We adapt a generic RNN model and show that, with variational training corpora and epoch unfolding, the model improves its performance on the task of URL insertion suggestion.
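To make the frequent-versus-infrequent contrast concrete, the sketch below stratifies perplexity by how often each predicted token was seen in training. It is a toy illustration only: the corpus, the add-one-smoothed bigram model standing in for the RNN LM, and the frequency threshold of 2 are all assumptions of this sketch, not the paper's actual data, model, or method.

```python
import math
from collections import Counter

# Toy corpus mixing frequent patterns with rare tokens (e.g., <URL>).
# Purely illustrative; not the paper's data.
corpus = (
    "check out this link <URL> for details . "
    "the cat sat on the mat . the dog sat on the mat . "
    "see <URL> if you want more info . "
    "the cat and the dog sat on the mat ."
).split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigrams)

def bigram_prob(prev, tok):
    # Add-one smoothed bigram probability: a crude stand-in for an RNN LM,
    # used only so the example is self-contained and runnable.
    return (bigrams[(prev, tok)] + 1) / (unigrams[prev] + vocab_size)

def subset_perplexity(pairs):
    # Perplexity restricted to a chosen set of (context, target) positions.
    log_sum = sum(math.log(bigram_prob(p, t)) for p, t in pairs)
    return math.exp(-log_sum / max(len(pairs), 1))

# Stratify prediction positions by the training frequency of the target
# token; the threshold of 2 is an arbitrary choice for this toy setup.
positions = list(zip(corpus, corpus[1:]))
frequent = [(p, t) for p, t in positions if unigrams[t] > 2]
infrequent = [(p, t) for p, t in positions if unigrams[t] <= 2]

print("perplexity on frequent tokens:   %.2f" % subset_perplexity(frequent))
print("perplexity on infrequent tokens: %.2f" % subset_perplexity(infrequent))
```

The paper's contribution, variational training corpora with epoch unfolding, targets the second number; this sketch only shows how the gap between the two would be measured.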
Xie, Y., Daga, P., Cheng, Y., Zhang, K., Agrawal, A., & Choudhary, A. (2015). Reducing infrequent-token perplexity via variational corpora. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers) (pp. 609–615). Association for Computational Linguistics. https://doi.org/10.3115/v1/p15-2101