Domain dependence limits how well a sentiment classifier trained on data from one domain can be applied to other domains. To address this problem, multi-domain sentiment classification has recently received considerable attention. It aims to build domain-specific sentiment classifiers jointly from datasets covering multiple domains. However, research on multi-domain sentiment classification has focused mainly on high-resource languages, and no prior work addresses Indonesian multi-domain sentiment classification. To fill this gap, we constructed an Indonesian multi-domain dataset of 489,000 reviews from four domains, labeled with three sentiment polarities (positive, neutral, and negative), and proposed an integrated model for Indonesian multi-domain sentiment classification. The model consists of a lemmatization layer, a domain-general module, a domain-specific module, and a domain classifier module. Using the Indonesian multi-domain dataset, we evaluated the model and compared it with baseline methods commonly used for sentiment analysis in high-resource languages. We also verified the contribution of several essential components of the model. The model achieved an average weighted F1 of 87.24% over the four domains, outperforming the baseline methods and demonstrating its effectiveness.
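To make the reported metric concrete, the following is a minimal sketch of how an average weighted F1 over several domains could be computed. It assumes that the weighted F1 is the support-weighted mean of per-class F1 scores within each domain, and that the final figure is the unweighted mean of the per-domain scores; the abstract does not spell out either detail. The function names, the domain names, and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter

# The three sentiment polarities named in the abstract.
LABELS = ("positive", "neutral", "negative")


def weighted_f1(y_true, y_pred, labels=LABELS):
    """Support-weighted mean of per-class F1 scores for one domain."""
    support = Counter(y_true)
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    score = 0.0
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


def average_over_domains(per_domain):
    """Assumption: unweighted mean of the per-domain weighted F1 scores."""
    scores = {name: weighted_f1(t, p) for name, (t, p) in per_domain.items()}
    return sum(scores.values()) / len(scores), scores


# Toy data with hypothetical domain names (not the paper's domains).
toy = {
    "domain_a": (
        ["positive", "positive", "neutral", "negative"],
        ["positive", "neutral", "neutral", "negative"],
    ),
    "domain_b": (
        ["positive", "neutral", "negative"],
        ["positive", "neutral", "negative"],
    ),
}
mean_f1, per_domain_f1 = average_over_domains(toy)
print(per_domain_f1, mean_f1)
```

Weighting by class support keeps a frequent polarity from being under-counted when the classes are imbalanced, which is common in review data.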
CITATION STYLE
Lin, N., Chen, B., Fu, S., Lin, X., & Jiang, S. (2020). Multi-domain Sentiment Classification on Self-constructed Indonesian Dataset. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12430 LNAI, pp. 789–801). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60450-9_62