Performance in sentiment analysis, the crucial task of automatically classifying the huge volume of user opinions generated online, relies heavily on the representation used to transform words or sentences into numbers. In machine learning for sentiment analysis, the most common embedding is the bag-of-words (BOW) model, which works well in practice but is essentially a lexical conversion. Another well-known method is Word2vec, which instead attempts to capture the meaning of terms. Because the two models encode complementary information, the knowledge offered by Word2vec can enrich the information contained in the BOW scheme. Based on this assumption, we designed and tested four hybrid sentence representations that combine the two approaches. Experiments on publicly available datasets confirm the effectiveness of the hybrid embeddings, which led to a stable increase in performance across different sentiment analysis domains.
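The abstract does not spell out the four hybrid representations, so the following is only a minimal sketch of one plausible way to combine the two views: concatenating a BOW count vector with the average of a sentence's word vectors. The toy vectors, the vocabulary, and the function names (`bow_vector`, `mean_word_vector`, `hybrid_concat`) are illustrative assumptions, not the authors' method or data.

```python
from collections import Counter

# Hypothetical 3-dimensional word vectors standing in for a trained
# Word2vec model; the values are made up for illustration.
TOY_VECTORS = {
    "good": [0.9, 0.1, 0.0],
    "bad": [-0.8, 0.2, 0.1],
    "movie": [0.0, 0.5, 0.5],
}
VOCAB = sorted(TOY_VECTORS)  # fixed BOW column order: bad, good, movie
DIM = 3                      # dimensionality of the toy word vectors


def bow_vector(tokens):
    """Lexical view: count of each vocabulary word in the sentence."""
    counts = Counter(tokens)
    return [float(counts[word]) for word in VOCAB]


def mean_word_vector(tokens):
    """Semantic view: average of the vectors of the known words."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return [0.0] * DIM  # no known word: fall back to a zero vector
    return [sum(column) / len(known) for column in zip(*known)]


def hybrid_concat(tokens):
    """Assumed hybrid: BOW counts followed by the averaged word vector."""
    return bow_vector(tokens) + mean_word_vector(tokens)


features = hybrid_concat("good good movie".split())
```

Here `features` has six entries: three BOW counts followed by three averaged embedding components. Such a vector could then be passed to any standard classifier. Concatenation keeps both views intact at the cost of a longer feature vector, which is one reason a paper might also compare it against averaging-based alternatives.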
CITATION STYLE
Orsenigo, C., Vercellis, C., & Volpetti, C. (2018). Concatenating or Averaging? Hybrid Sentences Representations for Sentiment Analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11314 LNCS, pp. 567–575). Springer Verlag. https://doi.org/10.1007/978-3-030-03493-1_59