Unlabeled Short Text Similarity with LSTM Encoder


This article is free to access.

Abstract

Short texts play an important role in our daily communication and have been applied in many fields. In this paper, we propose a novel short text similarity measurement algorithm based on a long short-term memory (LSTM) encoder. It consists of preprocessing, training, and evaluating stages. After normalization, our preprocessing algorithm avoids vanishing-gradient problems and makes backward propagation faster. The training stage uses the inception module to extract features of different dimensions and improves the LSTM network so that it can process the relationships within word sequences. The evaluating stage employs cosine distance to calculate the semantic similarity of two short texts. We conduct experiments on two short text datasets of different lengths and analyze the results. The results show that our algorithm makes full use of both the semantic information and the sequence information of short texts, and achieves higher accuracy and recall than other short text similarity measurement algorithms.
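
The evaluating stage described in the abstract compares two encoded short texts by cosine distance. The following is a minimal sketch of that comparison only, assuming each text has already been encoded as a fixed-length vector. The function names and the example vectors are hypothetical illustrations and are not taken from the paper; the convention distance = 1 - similarity is likewise an assumption.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Assumption: treat an all-zero encoding as having no similarity
    # instead of dividing by zero.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance, taken here as 1 - cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


if __name__ == "__main__":
    # Hypothetical encoder outputs for two short texts (made-up values).
    text_a = [0.2, 0.7, 0.1]
    text_b = [0.25, 0.6, 0.05]
    print("similarity:", cosine_similarity(text_a, text_b))
    print("distance:  ", cosine_distance(text_a, text_b))
```

Because cosine similarity depends only on the angle between the vectors, two encodings that point in the same direction score 1 regardless of their magnitudes.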

Cite

Yao, L., Pan, Z., & Ning, H. (2019). Unlabeled Short Text Similarity with LSTM Encoder. IEEE Access, 7, 3430–3437. https://doi.org/10.1109/ACCESS.2018.2885698
