Many early Japanese books record a large amount of valuable information on historical politics, economics, culture, and other subjects, and they are currently waiting to be reorganised. However, many of these books are written in Kuzushiji, a handwritten cursive script that is no longer in use today and is readable by only a few experts. Researchers are therefore trying to detect and recognise the characters in these books using modern techniques. Unfortunately, two characteristics of Kuzushiji, Connect-Separate-characters and Many-variation, hinder this technique-assisted reorganisation. Connect-Separate-characters refers to cases in which adjacent characters are connected to each other, or a single character is split into unconnected parts, which makes character detection hard. Many-variation, a typical characteristic of Kuzushiji, refers to the case in which the same character appears in several variant forms even when written by the same person in the same book at the same time, which makes character recognition harder. This paper therefore aims to construct a reorganisation system for early Japanese books by combining image processing and deep learning techniques. The system was evaluated on two early Japanese books. For character detection, the final Recall, Precision, and F-value reach 79.8%, 80.3%, and 80.0%, respectively. For deep-learning-based character recognition, the Top-3 accuracy reaches 69.52%, and the highest recognition rate reaches 82.57%, which verifies the effectiveness of our proposal.
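As a reading aid, the reported detection metrics can be related to each other with a short sketch. It assumes that "F-value" means the usual F1 score (the harmonic mean of Precision and Recall) and that "Top3" means a prediction counts as correct when the true character is among the three highest-ranked candidates; the abstract does not define either term. The function names and the toy predictions below are hypothetical and are not taken from the paper.

```python
def f_value(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def top_k_accuracy(ranked_predictions, labels, k: int) -> float:
    """Fraction of samples whose true label is among the first k candidates."""
    hits = sum(label in ranked[:k] for ranked, label in zip(ranked_predictions, labels))
    return hits / len(labels)


# Precision and Recall as reported in the abstract.
f = f_value(precision=0.803, recall=0.798)
print(f"F-value: {f * 100:.1f}%")

# Toy data, invented purely for illustration (not from the paper).
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"], ["a", "c", "b"]]
truth = ["a", "a", "a", "b"]
print(top_k_accuracy(ranked, truth, k=1))
print(top_k_accuracy(ranked, truth, k=3))
```

Under the F1 assumption, the reported Precision and Recall give a value that rounds to the 80.0% stated above, so the three detection figures are mutually consistent.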
Lyu, B., Li, H., Tanaka, A., & Meng, L. (2022). The early Japanese books reorganization by combining image processing and deep learning. CAAI Transactions on Intelligence Technology, 7(4), 627–643. https://doi.org/10.1049/cit2.12104