Scalable semi-supervised query classification using matrix sketching

Abstract

The enormous scale of unlabeled text available today necessitates scalable schemes for representation learning in natural language processing. In this paper, we are interested in classifying the intent of a user query. While our labeled data is quite limited, we have access to a virtually unlimited supply of unlabeled queries, which can be used to induce useful representations, for instance by principal component analysis (PCA). However, the sheer size of the data makes it prohibitive even to store in memory, let alone to process with conventional batch algorithms. In this work, we apply the recently proposed matrix sketching algorithm (Liberty, 2013) to sidestep this scalability problem. The algorithm approximates the data within a specified memory bound while preserving the covariance structure necessary for PCA. Using matrix sketching, we significantly improve user intent classification accuracy by leveraging large amounts of unlabeled queries.
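
The abstract does not spell out the sketching procedure. As a minimal sketch, assuming the algorithm of Liberty (2013) referred to here is the one commonly known as Frequent Directions, the idea can be illustrated as follows. The names `frequent_directions`, `top_components`, and `ell` are hypothetical and chosen only for this illustration, and the code assumes the sketch size `ell` does not exceed the feature dimension `d`. It is not the authors' implementation.

```python
import numpy as np


def frequent_directions(rows, d, ell):
    """Stream d-dimensional rows into an ell x d sketch B so that
    B.T @ B approximates A.T @ A, where A stacks all streamed rows.

    Illustrative only; assumes ell <= d.
    """
    B = np.zeros((ell, d))
    next_free = 0  # index of the next all-zero row of B
    for a in rows:
        if next_free == ell:
            # Sketch is full: move to the singular basis of B and shrink
            # every squared singular value by the median-ranked one.
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[ell // 2] ** 2
            s_shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = s_shrunk[:, None] * Vt
            # Rows from index ell // 2 onward are now zero and can be reused.
            next_free = ell // 2
        B[next_free] = a
        next_free += 1
    return B


def top_components(B, k):
    """Approximate top-k principal directions of the stream from the sketch."""
    _, _, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:k]
```

Only the `ell x d` sketch is held in memory, regardless of how many rows are streamed. For this variant the spectral norm of `A.T @ A - B.T @ B` is at most `2 * ||A||_F**2 / ell`, which is why the leading right singular vectors of `B` can stand in for the PCA directions of the full data.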

Kim, Y. B., Stratos, K., & Sarikaya, R. (2016). Scalable semi-supervised query classification using matrix sketching. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Short Papers (pp. 8–13). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p16-2002
