Unlabeled Short Text Similarity with LSTM Encoder

Lin Yao; Zhengyu Pan; Huansheng Ning

Journal Article (Open Access)

IEEE Access (2019), vol. 7, pp. 3430-3437

DOI: 10.1109/ACCESS.2018.2885698

Abstract

Short texts play an important role in our daily communication and have been applied in many fields. In this paper, we propose a novel short text similarity measurement algorithm based on a long short-term memory (LSTM) encoder. It consists of preprocessing, training, and evaluating stages. The preprocessing algorithm normalizes the input, which helps avoid vanishing-gradient problems and speeds up backward propagation. The training stage uses an inception module to extract features of different dimensions and an improved LSTM network to model the relationships within word sequences. The evaluating stage uses cosine distance to compute the semantic similarity of two short texts. We run experiments on two short text datasets of different lengths and analyze the results. The results show that our algorithm exploits both the semantic information and the sequence information of short texts, and achieves higher accuracy and recall than other short text similarity measurement algorithms.
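
The evaluating stage described in the abstract compares two encoded texts by cosine distance. The following is a minimal sketch of that comparison only, not the authors' implementation: the function names and the three example vectors are hypothetical stand-ins for encoder outputs, and returning 0.0 for an all-zero vector is a convention assumed here.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumed convention: an all-zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance, defined as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical fixed-length encodings of three short texts.
text_a = [0.2, 0.7, 0.1]
text_b = [0.4, 1.4, 0.2]   # same direction as text_a, different magnitude
text_c = [0.9, -0.3, 0.5]  # points in a different direction

print(cosine_distance(text_a, text_b))
print(cosine_distance(text_a, text_c))
```

Because cosine distance depends only on the direction of the vectors, `text_a` and `text_b` come out as near-identical despite their different magnitudes, while `text_c` is farther from both.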

Citation (APA)

Yao, L., Pan, Z., & Ning, H. (2019). Unlabeled Short Text Similarity with LSTM Encoder. IEEE Access, 7, 3430–3437. https://doi.org/10.1109/ACCESS.2018.2885698
