Searching for a topic on the Internet can be a frustrating experience because of the sheer volume of information returned. A strategy for automatically classifying the results can therefore improve both user experience and work efficiency. The Latent Semantic Indexing (LSI) algorithm is used to classify documents by meaning because of its effectiveness. However, implementing this algorithm poses a problem: LSI is computationally intensive, because its cost grows with the number of documents. In particular, the Singular Value Decomposition (SVD) at the core of LSI scales poorly in terms of both memory and computation time. One possible solution is to use more powerful computational resources, such as multiple computing nodes. In this paper, a novel distributed architecture for the LSI algorithm is proposed, based on microservices in a Google Cloud environment. We evaluate the performance of the proposed cloud-based LSI and compare it with a standalone LSI implementation. The results show the benefits of the distributed system in terms of runtime, concurrency, and processing.
Proaño, J., Reinoso, A., & Juma, J. (2020). Latent Semantic Index: A Microservices Architecture. In Communications in Computer and Information Science (Vol. 1154 CCIS, pp. 142–153). Springer. https://doi.org/10.1007/978-3-030-46785-2_12
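For readers unfamiliar with the SVD step the abstract refers to, the following is a minimal single-node sketch of LSI, not the authors' microservices implementation. It assumes NumPy is available; the toy term-document matrix and the helper names `lsi_document_vectors` and `cosine_similarity` are hypothetical and chosen only for illustration. A dense SVD of an m x n matrix costs on the order of m·n·min(m, n) operations, which is the kind of growth that makes standalone LSI hard to scale as the number of documents increases.

```python
import numpy as np

def lsi_document_vectors(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep the k largest singular values; row j of the result represents document j.
    return (s[:k, None] * Vt[:k, :]).T  # shape: (n_docs, k)

def cosine_similarity(a, b):
    """Cosine of the angle between two latent-space vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy term-document matrix (rows: terms, columns: documents).
# Documents 0 and 1 share computing terms; documents 2 and 3 share cooking terms.
terms = ["cloud", "server", "microservice", "recipe", "flour"]
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
], dtype=float)

docs = lsi_document_vectors(A, k=2)

# Documents on the same topic end up close together in the latent space,
# while documents on different topics are nearly orthogonal.
same_topic = cosine_similarity(docs[0], docs[1])
cross_topic = cosine_similarity(docs[0], docs[2])
print(f"same topic:  {same_topic:.3f}")
print(f"cross topic: {cross_topic:.3f}")
```

In this sketch the whole matrix is decomposed in one process, so both memory use and runtime are bound to a single machine; that is the limitation a distributed architecture is meant to address.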