Many industries, such as oil, construction, banking, and insurance, hold substantial volumes of historical physical data. Companies store this data in geographically distributed physical warehouses, usually managed by records management companies. Storing large volumes of historical physical data poses several critical challenges, including high maintenance costs, long recovery times, and data that cannot be searched. To address these challenges, many companies digitize this data and consolidate it into cloud repositories as part of their Digital Transformation (DT) journey. The DT process in turn introduces technical challenges of its own: poor-quality scans, very large files, geographically distributed files, and confidential documents. Although solutions exist for each of these problems individually, no existing framework addresses digitization and historical data storage in its entirety. Moreover, the existing solutions cannot handle a large number of documents with widely varying file sizes. This paper presents a generic cloud-based high-performance computing framework for knowledge extraction, comprising document classification based on neural networks and particle swarm optimization (PSO), data extraction, metadata enrichment, image enhancement using image processing (IP) techniques, and high data availability for users through cloud-based search. The proposed framework is executed on two cloud providers, Azure and AWS, to test its efficacy.
Kanchibhotlaa, C., Venkatesh, P. R., Somayajulu, D. V. L. N., & Radhakrishna, P. (2021). A PSO based cloud framework for knowledge extraction. Journal of Engineering Research (Kuwait), 9, 17–25. https://doi.org/10.36909/jer.EMSME.13897