Knowledge Inference Model of OCR Conversion Error Rules Based on Chinese Character Construction Attributes Knowledge Graph

Abstract

OCR is a character conversion method based on image recognition. The complexity of the characters and the quality of the image play a key role in conversion accuracy. OCR conversion errors are irregular, and in certain text scenarios an incorrectly converted word still forms a semantically valid combination with the context at its original location. In this paper, we propose an OCR conversion error rule inference model based on a knowledge graph of Chinese character construction attributes, which analyzes and reasons about the structure and complexity of Chinese characters. The model integrates a variety of encoding methods: it extracts features of entities and relations of different data types in the knowledge graph with different encoders, and uses convolutional neural networks to learn and infer the unknown error rules in OCR conversion. In addition, so that the triple feature matrix fully captures the construction attribute information of the Chinese characters, a feature cross algorithm for feature diffusion over the triple feature matrix is introduced. In this algorithm, the relation matrix and the entity matrix are crossed to generate a new feature matrix that better represents a knowledge graph triple. The experimental results show that, compared with current mainstream knowledge inference models, the OCR conversion error rule inference model incorporating the feature cross algorithm achieves important improvements in MRR, Hits@1, Hits@2, and other evaluation metrics on public datasets and task-related datasets.
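
The abstract reports results in MRR and Hits@k without defining them. As a minimal sketch, assuming the standard link-prediction definitions (MRR is the mean of 1/rank of the correct answer, Hits@k is the fraction of queries whose correct answer ranks in the top k), the metrics could be computed as below. The function names and the example ranks are illustrative assumptions and do not come from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Mean of 1/rank over all test queries (ranks are 1-based)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of test queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks of the correct tail entity for four test triples.
example_ranks = [1, 2, 4, 1]

if __name__ == "__main__":
    print("MRR   :", mean_reciprocal_rank(example_ranks))
    print("Hits@1:", hits_at_k(example_ranks, 1))
    print("Hits@2:", hits_at_k(example_ranks, 2))
```

Higher values are better for both metrics. MRR rewards placing the correct answer near the top of the ranking, while Hits@k only checks whether it falls inside the cutoff.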

Cite

Zhang, X., Wang, H., & Gu, W. (2020). Knowledge Inference Model of OCR Conversion Error Rules Based on Chinese Character Construction Attributes Knowledge Graph. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12431 LNAI, pp. 415–425). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60457-8_34
