Abstract
The semantic interpretation of oracle bone characters, the earliest known system of Chinese writing, has traditionally relied on manual analysis by experts. Artificial intelligence (AI)-based approaches are therefore increasingly being explored for deciphering these characters. A fundamental prerequisite for AI-driven semantic inference is the construction of a high-quality evolutionary dataset. This study introduces a graph-based evolutionary dataset comprising 756 groups and 3780 Chinese characters across five historical stages. Unlike existing datasets, which primarily represent characters as images, the proposed dataset uses a graph representation in which nodes correspond to key structural points of a character and edges define their spatial relationships. Experimental analyses show that graph representations capture the structural stability of characters across evolutionary stages better than image-based representations. The dataset is expected to serve as a resource for applying AI-driven methods to the decipherment of unknown oracle bone characters.
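To make the graph representation concrete, the following is a minimal sketch of how one character at one stage, and a group of its forms across stages, might be stored. It is an illustration only and not the dataset's actual schema: the class names `CharacterGraph` and `EvolutionGroup`, the stage labels `stage_1` and `stage_5`, the toy coordinates, and the use of a degree sequence as a crude structural fingerprint are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in normalized image coordinates (assumed)


@dataclass
class CharacterGraph:
    """One character form at one historical stage (hypothetical layout)."""
    stage: str                    # placeholder label, not the paper's stage names
    nodes: List[Point]            # key structural points of the character
    edges: List[Tuple[int, int]]  # pairs of node indices that are connected

    def degree_sequence(self) -> List[int]:
        """Sorted node degrees: a crude, position-independent fingerprint."""
        degrees = [0] * len(self.nodes)
        for a, b in self.edges:
            degrees[a] += 1
            degrees[b] += 1
        return sorted(degrees)


@dataclass
class EvolutionGroup:
    """All forms of one character across stages (hypothetical layout)."""
    label: str
    stages: List[CharacterGraph] = field(default_factory=list)


# Toy cross-shaped character: four endpoints joined at a central node.
spokes = [(0, 4), (1, 4), (2, 4), (3, 4)]
early = CharacterGraph(
    "stage_1",
    [(0.5, 0.0), (0.5, 1.0), (0.0, 0.5), (1.0, 0.5), (0.5, 0.5)],
    spokes,
)
# A later form with shifted points but the same connectivity.
late = CharacterGraph(
    "stage_5",
    [(0.5, 0.1), (0.5, 0.9), (0.1, 0.4), (0.9, 0.4), (0.5, 0.4)],
    spokes,
)
group = EvolutionGroup("toy_cross", [early, late])

# The point positions differ, yet the connectivity fingerprint is unchanged.
same_structure = early.degree_sequence() == late.degree_sequence()
```

In this toy case the two forms differ in their coordinates but share the same connectivity, which is the kind of cross-stage structural stability the abstract attributes to graph representations. The degree-sequence comparison is only a stand-in; the abstract does not specify the measure used in the experiments.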
Cite
Jiao, Q., Wu, J., Liu, Q., Zhang, H., Zhang, Z., Li, B., … Liu, Y. (2025). A graph-based evolutionary dataset for oracle bone characters from inscriptions to modern Chinese scripts. npj Heritage Science, 13(1). https://doi.org/10.1038/s40494-025-01951-0