Abstract
This paper proposes a variability normalization algorithm that reduces the variability among intra-topic documents for topic classification. First, an optimization problem is constructed under the assumption that intra-topic variability is linearly removable. Second, a new feature space for document representation is found by solving the optimization problem with kernel principal component analysis (KPCA). Finally, an effective feature transformation is carried out through linear projection. In the experiments, state-of-the-art SVM and KNN algorithms are adopted for topic classification. Experimental results on a free-style conversational corpus show that the proposed variability normalization algorithm achieves a 3.8% absolute improvement in micro-F1.
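The pipeline the abstract describes (learn a feature space with KPCA, project documents into it, then classify) can be sketched as follows. This is a minimal illustration using scikit-learn's generic `KernelPCA` and `SVC` on synthetic stand-in data; the paper's actual optimization objective for removing intra-topic variability is not reproduced here, and the kernel choice and dimensionalities are assumptions.

```python
# Hypothetical sketch: KPCA feature transform followed by an SVM classifier.
# Synthetic data stands in for document vectors; the paper's intra-topic
# variability objective is NOT implemented here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Stand-in for vectorized documents with topic labels (3 "topics").
X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                           n_classes=3, random_state=0)

# Steps 1-2: derive a new feature space via kernel PCA (RBF kernel assumed).
kpca = KernelPCA(n_components=20, kernel="rbf")

# Step 3: the learned components define the projection of each document;
# an SVM then classifies the projected features.
clf = make_pipeline(kpca, SVC(kernel="linear"))
clf.fit(X[:150], y[:150])
accuracy = clf.score(X[150:], y[150:])
print(f"held-out accuracy: {accuracy:.2f}")
```

In practice the paper replaces the generic KPCA objective with one built from intra-topic document pairs, so the learned projection specifically suppresses within-topic variability rather than maximizing overall variance.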
Citation
Liu, Q., Guo, W., Ling, Z. H., Jiang, H., & Hu, Y. (2016). Intra-topic variability normalization based on linear projection for topic classification. In 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference (pp. 441–446). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n16-1051