Document Parsing Tool for Language Translation and Web Crawling using Django REST Framework

Abstract

The world has 7.5 billion inhabitants and over 7,117 languages, but only 20% of people speak English. Language translation is therefore a basic need for understanding the wisdom and knowledge of other cultures. This paper investigates a computer-assisted document parsing tool. The proposed approach uses a language translator that translates text directly from images, which eliminates the need for a human translator and avoids the scope for misinterpretation and misunderstanding among people of different ethnic groups. The tool can also perform web crawling using the Django Representational State Transfer (REST) framework. It employs the Python packages pytesseract, textblob and beautifulsoup to perform Optical Character Recognition (OCR), translation and extraction of Hypertext Markup Language (HTML) data, respectively. Experimental results of translation on four categories of images (Maps, Comics, Newspapers and Magazines, and Scientific Publications) demonstrate accuracies of 97.2%, 93.3%, 95.82% and 98.27%, respectively. On websites such as e-commerce, magazine, blog, social media, news and educational sites, an average precision of 5.4, recall of 7.45 and F-score of 6.24 are achieved. The results indicate that the proposed system can be used as an improvement over a human translator and a data entry operator.
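
The abstract reports precision, recall and F-score for the web crawling task but does not state how they are computed or on what scale. As an illustration only, the sketch below uses the standard set-based definitions on a 0 to 1 scale. The function name `precision_recall_f1` and the sample items are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(extracted, relevant):
    """Score a crawler's output against a reference set of relevant items.

    Assumption: items are hashable (e.g. strings) and are compared by
    exact match. The paper may use a different matching rule or scale.
    """
    extracted, relevant = set(extracted), set(relevant)
    true_positives = len(extracted & relevant)

    # Precision: fraction of extracted items that are relevant.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: fraction of relevant items that were extracted.
    recall = true_positives / len(relevant) if relevant else 0.0

    # F-score: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 4 items extracted, 3 relevant, 2 in common.
p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["a", "b", "e"])
print(p, r, f)
```

Guarding against empty sets keeps the function defined when a page yields no extracted items or has no reference items.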

Cite

Alnavar, K., Kumar, R. U., & Babu, C. N. (2021). Document Parsing Tool for Language Translation and Web Crawling using Django REST Framework. In Journal of Physics: Conference Series (Vol. 1962). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1962/1/012018
