Research on Chinese Animal Words Extraction Based on Children’s Literature Corpus

Huizhou Zhao; Zhimin Wang; Shuning Wang; Lifan Zhang

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020), vol. 11831 LNAI, pp. 618-627

DOI: 10.1007/978-3-030-38189-9_63

Abstract

Categorized and graded vocabularies are an important part of graded reading for children. Taking animal words from the Thesaurus of Modern Chinese as seed words, this paper studies a method for extracting animal words from a children’s literature corpus and attempts to construct a word sequencing model. The method matches the output of automatic word segmentation against the seed words. It extracts 786 animal nouns from the corpus, an increase of 39.36% over the 564 seed words, along with 780 derivative animal words. The animal word sequencing model is based on word-work-popularity and word-writer-popularity, which resolves the problem of an unbalanced number of characters and works across writers.
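The matching step and the reported growth figure can be illustrated with a minimal sketch. The abstract does not spell out the matching rule, so the code below makes an assumption: a segmented token counts as an exact match when it equals a seed word, and as a candidate derivative when it merely contains one. The function names, the toy tokens, and the toy seed list are all hypothetical and are not taken from the paper.

```python
def match_animal_words(tokens, seed_words):
    """Split segmented tokens into exact seed matches and candidate derivatives.

    Assumption (not from the paper): a token that contains a seed word
    without being equal to it is treated as a candidate derivative.
    """
    seeds = set(seed_words)
    exact, derived = set(), set()
    for token in tokens:
        if token in seeds:
            exact.add(token)
        elif any(seed in token for seed in seeds):
            derived.add(token)
    return exact, derived


def growth_rate(extracted_count, seed_count):
    """Percentage increase of the extracted vocabulary over the seed list."""
    return (extracted_count - seed_count) / seed_count * 100


# Toy input standing in for the output of an automatic word segmenter.
tokens = ["小", "白兔", "看见", "老虎", "和", "兔", "在", "树", "下"]
seeds = ["兔", "虎", "老虎"]

exact, derived = match_animal_words(tokens, seeds)
print(sorted(exact))    # tokens identical to a seed word
print(sorted(derived))  # tokens that contain a seed word

# Figures reported in the abstract: 786 extracted nouns versus 564 seed words.
print(round(growth_rate(786, 564), 2))
```

With the abstract's figures, the growth computation gives (786 - 564) / 564 ≈ 39.36%, which agrees with the reported increase. A substring rule like the one assumed here would over-generate on real text, so in practice the candidates would presumably need filtering or manual review.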

Citation (APA)

Zhao, H., Wang, Z., Wang, S., & Zhang, L. (2020). Research on Chinese Animal Words Extraction Based on Children’s Literature Corpus. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11831 LNAI, pp. 618–627). Springer. https://doi.org/10.1007/978-3-030-38189-9_63
