Abstract
To make better use of the semantic information carried by the part of speech of words and the contextual non-natural-language information that accompanies them, a part-of-speech weighted multimodal sentiment analysis model with dynamic adjustment of semantic representation (PW-DS) is proposed. The PW-DS model takes natural language as its main modality and embeds the words of the text modality with three pretrained language models, each used separately: bidirectional encoder representations from transformers (BERT), the generalized autoregressive pretraining model for language understanding (XLNet), and the robustly optimized BERT pretraining approach (RoBERTa). A dynamic semantic adjustment module is built to combine natural-language and non-natural-language information effectively. A part-of-speech weighting module extracts the part of speech of each word and assigns weights accordingly to improve sentiment discrimination. In comparative experiments against current advanced models such as the tensor fusion network and low-rank multimodal fusion, the PW-DS model achieves mean absolute errors of 0.607 and 0.510 on the public data sets CMU-MOSI and CMU-MOSEI, respectively, and binary classification accuracies of 89.02% and 86.93%, respectively, outperforming the models it is compared with. Ablation experiments further analyze the contribution of each module. The experimental results demonstrate that the proposed model is effective for multimodal sentiment analysis.
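The two metrics reported above, mean absolute error and binary classification accuracy, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names, the toy scores, and the choice of 0 as the boundary between negative and non-negative sentiment are assumptions made for illustration; the abstract does not say how the binary split is defined.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of |label - prediction| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_accuracy(
    y_true: Sequence[float], y_pred: Sequence[float], threshold: float = 0.0
) -> float:
    """Fraction of samples whose label and prediction fall on the same side
    of the threshold.

    Assumption: scores >= threshold count as non-negative sentiment and
    scores below it as negative. Other conventions exist (for example,
    dropping neutral samples), and the abstract does not specify one.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum((t >= threshold) == (p >= threshold) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Hypothetical continuous sentiment scores, not data from the paper.
labels = [2.0, -1.0, 0.5, -2.5]
preds = [1.5, -0.5, -0.5, -2.0]

mae = mean_absolute_error(labels, preds)
acc = binary_accuracy(labels, preds)
print(f"MAE = {mae:.3f}, binary accuracy = {acc:.2%}")
```

A lower mean absolute error and a higher binary accuracy both indicate better sentiment prediction, which is the sense in which the reported figures are compared across models.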
Citation
Hua, Q., Chen, Z., Zhang, F., & Dong, C. (2024). Part of speech weighted multi-modal emotion analysis model with dynamic adjustment of semantic representation. Shenzhen Daxue Xuebao (Ligong Ban)/Journal of Shenzhen University Science and Engineering, 41(3), 283–292. https://doi.org/10.3724/SP.J.1249.2024.03283