Multimodal Aspect Extraction with Region-Aware Alignment Network


Abstract

Fueled by the rise of social media, documents on these platforms (e.g., Twitter, Weibo) are increasingly multimodal in nature, pairing images with text. To automatically analyze the opinion information in such multimodal data, it is crucial to perform aspect term extraction (ATE) on them. However, research on multimodal ATE remains scarce. In this study, we take a step further than previous work by proposing a Region-aware Alignment Network (RAN) that aligns text with the object regions appearing in an image for the multimodal ATE task. Experiments on the Twitter dataset demonstrate the effectiveness of the proposed model, and further analysis shows that it performs particularly well when extracting aspect terms with polarized sentiment.
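
To illustrate the general idea described in the abstract, the sketch below shows one plausible way to align text tokens with detected image regions and tag aspect terms. It is a minimal illustration, not the authors' RAN implementation: the dimensions, module names, cross-attention fusion, and BIO tagging head are all assumptions made for the example.

# A minimal sketch (not the authors' implementation) of region-aware
# text-image alignment for aspect term extraction. Assumptions: text tokens
# are already embedded, image regions come from an off-the-shelf detector,
# and aspect terms are tagged with a BIO scheme.
import torch
import torch.nn as nn


class RegionAwareAlignment(nn.Module):
    def __init__(self, text_dim=768, region_dim=2048, hidden_dim=256, num_tags=3):
        super().__init__()
        # Project text tokens and visual regions into a shared space.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.region_proj = nn.Linear(region_dim, hidden_dim)
        # Each token attends over the detected object regions.
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads=4, batch_first=True)
        # Fuse each token with its region-aligned view, then predict a BIO tag per token.
        self.classifier = nn.Linear(hidden_dim * 2, num_tags)

    def forward(self, token_emb, region_feats):
        # token_emb:    (batch, seq_len, text_dim)
        # region_feats: (batch, num_regions, region_dim)
        text = self.text_proj(token_emb)
        regions = self.region_proj(region_feats)
        aligned, _ = self.cross_attn(query=text, key=regions, value=regions)
        fused = torch.cat([text, aligned], dim=-1)
        return self.classifier(fused)  # (batch, seq_len, num_tags)


if __name__ == "__main__":
    model = RegionAwareAlignment()
    tokens = torch.randn(2, 16, 768)    # e.g., BERT token embeddings
    regions = torch.randn(2, 36, 2048)  # e.g., Faster R-CNN region features
    print(model(tokens, regions).shape)  # torch.Size([2, 16, 3])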

CITATION STYLE

APA

Wu, H., Cheng, S., Wang, J., Li, S., & Chi, L. (2020). Multimodal Aspect Extraction with Region-Aware Alignment Network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12430 LNAI, pp. 145–156). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60450-9_12
