Research on Chinese Animal Words Extraction Based on Children’s Literature Corpus

Abstract

Categorized and graded vocabularies are an important component of graded reading for children. Taking the animal words in the Thesaurus of Modern Chinese as seed words, this paper studies a method for extracting animal words from a children’s literature corpus and attempts to construct a word sequencing model. The extraction method matches the output of automatic word segmentation against the seed words. In total, 786 animal nouns are extracted from the corpus, an increase of 39.36% over the 564 seed words, along with 780 derivative animal words. The animal word sequencing model is based on word-work-popularity and word-writer-popularity, which addresses the imbalance in the number of characters and in the number of works per writer.
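The matching step and the two popularity measures can be illustrated with a small sketch. The abstract does not give the exact matching rule or the popularity formulas, so everything below is an assumption made for illustration: a token counts as a derivative animal word if it contains a seed word as a substring, word-work-popularity is taken as the share of works containing the word, and word-writer-popularity as the share of writers who use it. The names `match_animal_words` and `popularity`, the toy seed set, and the toy corpus are all hypothetical, and the corpus is assumed to be already segmented.

```python
from collections import defaultdict


def match_animal_words(tokens, seeds):
    """Split segmented tokens into exact seed matches and assumed derivatives."""
    exact, derived = set(), set()
    for tok in tokens:
        if tok in seeds:
            exact.add(tok)
        elif any(seed in tok for seed in seeds):
            # Assumption: a token containing a seed word is a derivative.
            derived.add(tok)
    return exact, derived


def popularity(corpus, words):
    """Return {word: (work_share, writer_share)} for the given words.

    corpus is a list of (writer, work_title, tokens) triples.
    Both shares are assumed definitions, not the paper's formulas.
    """
    works_with = defaultdict(set)
    writers_with = defaultdict(set)
    all_works, all_writers = set(), set()
    for writer, title, tokens in corpus:
        all_works.add((writer, title))
        all_writers.add(writer)
        for tok in set(tokens) & words:
            works_with[tok].add((writer, title))
            writers_with[tok].add(writer)
    return {
        w: (len(works_with[w]) / len(all_works),
            len(writers_with[w]) / len(all_writers))
        for w in words
    }


# Toy data (hypothetical): three seed words, three works by two writers.
seeds = {"狗", "猫", "老虎"}
corpus = [
    ("writer_a", "work_1", ["小", "狗", "看见", "猫"]),
    ("writer_a", "work_2", ["狗", "跑", "了"]),
    ("writer_b", "work_3", ["小猫", "睡觉"]),
]

all_tokens = [tok for _, _, toks in corpus for tok in toks]
exact, derived = match_animal_words(all_tokens, seeds)
scores = popularity(corpus, exact | derived)

# Growth rate reported in the abstract: (786 - 564) / 564.
growth_rate = (786 - 564) / 564
```

Counting distinct works and distinct writers, rather than raw token frequency, is one way to keep a single long work or a single prolific writer from dominating the ranking, which is the imbalance the abstract mentions.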

Cite

Zhao, H., Wang, Z., Wang, S., & Zhang, L. (2020). Research on Chinese Animal Words Extraction Based on Children’s Literature Corpus. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11831 LNAI, pp. 618–627). Springer. https://doi.org/10.1007/978-3-030-38189-9_63
