Book recommendation systems have attracted growing research attention in the digital age. Historically, these systems relied chiefly on readers' past preferences, overlooking the intrinsic characteristics of a book's content and design. To address this gap, a novel algorithm combining multimodal image processing and deep learning was designed. Features from book cover images are extracted with the VGG16 model, while textual attributes are captured through a combination of the Word2Vec model and an LSTM neural network. Building on the CBAM attention mechanism, a modality-weighted feature fusion module dynamically allocates weights across the two feature streams. An objective function is also formulated to guide the model toward better performance during training. This study presents a methodology that improves the effectiveness and robustness of book recommendation systems, and it broadens the understanding of multimodal information processing in deep learning-based recommendation platforms.
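As a rough illustration of the pipeline the abstract outlines, the sketch below wires a pretrained VGG16 image branch, an LSTM text branch, and a learned modality-weighting gate into one PyTorch model. It is a minimal approximation under stated assumptions, not the authors' implementation: the projection sizes, the gating MLP, the randomly initialised embedding table (standing in for Word2Vec vectors), and the classification-style scoring head are all assumptions, since the abstract does not specify the exact CBAM configuration or objective function.

```python
# Minimal sketch of the multimodal architecture described in the abstract.
# All layer sizes and the cross-entropy-style head are assumptions.
import torch
import torch.nn as nn
from torchvision import models


class ModalityWeightedFusion(nn.Module):
    """Attention-gated fusion in the spirit of CBAM's channel attention:
    a small MLP scores each modality, and the scores weight the blend."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(2 * dim, dim // 4),
            nn.ReLU(),
            nn.Linear(dim // 4, 2),  # one weight per modality
        )

    def forward(self, img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.gate(torch.cat([img_feat, txt_feat], dim=-1)), dim=-1)
        # Dynamic allocation of feature weights, as the abstract describes.
        return w[:, 0:1] * img_feat + w[:, 1:2] * txt_feat


class BookRecommender(nn.Module):
    def __init__(self, vocab_size: int, n_books: int,
                 embed_dim: int = 300, hidden: int = 256, fused: int = 256):
        super().__init__()
        # Image branch: pretrained VGG16 truncated at its penultimate (4096-d) layer.
        vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
        self.backbone = nn.Sequential(
            vgg.features, vgg.avgpool, nn.Flatten(),
            *list(vgg.classifier.children())[:-1],
        )
        self.img_proj = nn.Linear(4096, fused)
        # Text branch: the paper initialises embeddings from Word2Vec;
        # here the table is randomly initialised as a stand-in.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.txt_proj = nn.Linear(hidden, fused)
        self.fusion = ModalityWeightedFusion(fused)
        self.head = nn.Linear(fused, n_books)  # placeholder scoring head

    def forward(self, covers: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        img = self.img_proj(self.backbone(covers))
        _, (h, _) = self.lstm(self.embed(tokens))
        txt = self.txt_proj(h[-1])  # final LSTM hidden state
        return self.head(self.fusion(img, txt))


# Smoke test with random inputs: a batch of 2 cover images (224x224)
# and 20-token text descriptions.
model = BookRecommender(vocab_size=30_000, n_books=5_000)
scores = model(torch.randn(2, 3, 224, 224), torch.randint(1, 30_000, (2, 20)))
print(scores.shape)  # torch.Size([2, 5000])
```

Training such a model against book-interaction labels (e.g. with cross-entropy over the score head) would mirror the abstract's description of optimising an objective function during the training phase, though the paper's actual loss may differ.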
Citation: Li, Y., Li, X., & Zhao, Q. (2023). Multimodal deep learning framework for book recommendations: Harnessing image processing with VGG16 and textual analysis via LSTM-enhanced Word2Vec. Traitement du Signal, 40(4), 1367–1376. https://doi.org/10.18280/ts.400406