Design of a Daily Brief Business Report Generator based on Web Scraping with KNN Algorithm

Abstract

To generate reports for an enterprise whose existing website systems offer no supporting API and for which database access rights have not been granted, this paper proposes a daily business report generator system based on web scraping with the k-nearest neighbor (kNN) classification algorithm. The system uses web crawler technology to access the existing website system and extract business data. The kNN algorithm is applied to identify the verification code on the login page, and the brief daily report is generated in a spreadsheet-style grid. Implemented in Python, the system automatically generates the brief daily business reports, and its kNN-based recognition of the verification code performs better than some OCR engines and libraries that rely on a default training set.
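
To make the kNN step concrete, here is a minimal sketch of classifying verification-code characters by majority vote among their nearest training samples. It is not the authors' implementation: the 3x3 glyphs, the names `TRAIN`, `flatten`, `knn_predict`, and `read_code`, and the choice of k = 3 with Euclidean distance are all assumptions made for illustration, and it presumes the code image has already been binarized and split into single characters.

```python
import math
from collections import Counter


def flatten(glyph):
    """Turn rows of '#' (ink) and '.' (background) into a flat 0/1 vector."""
    return [1 if ch == "#" else 0 for row in glyph for ch in row]


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Rank training samples by Euclidean distance to the query vector.
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: one clean and one noisy sample per character.
TRAIN = [
    (flatten([".#.", ".#.", ".#."]), "1"),
    (flatten(["##.", ".#.", ".#."]), "1"),
    (flatten(["###", "#.#", "###"]), "0"),
    (flatten(["###", "#.#", "##."]), "0"),
    (flatten(["###", "..#", "..#"]), "7"),
    (flatten(["###", "..#", ".#."]), "7"),
]


def read_code(glyphs, k=3):
    """Classify each segmented character and join the labels into a string."""
    return "".join(knn_predict(TRAIN, flatten(g), k) for g in glyphs)


# Two noisy characters, assumed to be already segmented from the code image.
noisy_one = [".#.", ".#.", "###"]
noisy_seven = ["###", "..#", "..."]
print(read_code([noisy_one, noisy_seven]))  # prints 17
```

Because the training samples come from the target site's own verification codes rather than a generic default set, a classifier of this kind can be fitted to that site's font and noise, which is consistent with the advantage the abstract claims.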

Cite

Lin, G., Liang, Y., Fu, X., Chen, G., & Cai, S. (2019). Design of a Daily Brief Business Report Generator based on Web Scraping with KNN Algorithm. In Journal of Physics: Conference Series (Vol. 1345). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1345/5/052064
