Book recommendation systems have attracted growing research attention in the digital age. Historically, these systems relied chiefly on readers' past preferences, overlooking the intrinsic characteristics of a book's content and design. To address this gap, a novel algorithm combining multimodal image processing and deep learning was designed. Features from book cover images are extracted with the VGG16 model, while textual attributes are captured through a combination of the Word2Vec model and an LSTM neural network. Building on the CBAM attention mechanism, a modality-weighted feature fusion module dynamically allocates weights across the two feature streams. An objective function is also formulated to guide the model toward better performance during training. This study presents a methodology that improves the effectiveness and robustness of book recommendation systems, and it broadens the understanding of multimodal information processing in deep learning-based recommendation platforms.
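As a rough illustration of the pipeline the abstract outlines, the sketch below wires a pretrained VGG16 image branch, an LSTM text branch, and a learned modality-weighting gate into one PyTorch model. It is a minimal approximation under stated assumptions, not the authors' implementation: the projection sizes, the gating MLP, the randomly initialised embedding table (standing in for Word2Vec vectors), and the classification-style scoring head are all assumptions, since the abstract does not specify the exact CBAM configuration or objective function.

```python
# Minimal sketch of the multimodal architecture described in the abstract.
# All layer sizes and the cross-entropy-style head are assumptions.
import torch
import torch.nn as nn
from torchvision import models


class ModalityWeightedFusion(nn.Module):
    """Attention-gated fusion in the spirit of CBAM's channel attention:
    a small MLP scores each modality, and the scores weight the blend."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim // 4),
            nn.ReLU(),
            nn.Linear(dim // 4, 2),  # one weight per modality
        )

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.gate(torch.cat([img_feat, txt_feat], dim=-1)), dim=-1)
        # Dynamic allocation of feature weights, as the abstract describes.
        return w[:, 0:1] * img_feat + w[:, 1:2] * txt_feat


class BookRecommender(nn.Module):
    def __init__(self, vocab_size: int, n_books: int,
                 embed_dim: int = 300, hidden: int = 256, fused: int = 256):
        super().__init__()
        # Image branch: pretrained VGG16 truncated at its penultimate (4096-d) layer.
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.backbone = nn.Sequential(
            vgg.features, vgg.avgpool, nn.Flatten(),
            *list(vgg.classifier.children())[:-1],
        )
        self.img_proj = nn.Linear(4096, fused)
        # Text branch: the paper initialises embeddings from Word2Vec;
        # here the table is randomly initialised as a stand-in.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.txt_proj = nn.Linear(hidden, fused)
        self.fusion = ModalityWeightedFusion(fused)
        self.head = nn.Linear(fused, n_books)  # placeholder scoring head

    def forward(self, covers: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        img = self.img_proj(self.backbone(covers))
        _, (h, _) = self.lstm(self.embed(tokens))
        txt = self.txt_proj(h[-1])  # final LSTM hidden state
        return self.head(self.fusion(img, txt))


# Smoke test with random inputs: a batch of 2 cover images (224x224)
# and 20-token text descriptions.
model = BookRecommender(vocab_size=30_000, n_books=5_000)
scores = model(torch.randn(2, 3, 224, 224), torch.randint(1, 30_000, (2, 20)))
print(scores.shape)  # torch.Size([2, 5000])
```

Training such a model against book-interaction labels (e.g. with cross-entropy over the score head) would mirror the abstract's description of optimising an objective function during the training phase, though the paper's actual loss may differ.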
Citation: Li, Y., Li, X., & Zhao, Q. (2023). Multimodal deep learning framework for book recommendations: Harnessing image processing with VGG16 and textual analysis via LSTM-enhanced Word2Vec. Traitement du Signal, 40(4), 1367–1376. https://doi.org/10.18280/ts.400406