Neural based approach to keyword extraction from documents

Abstract

Documents are unstructured data written in natural language. A document surrogate is the structured data converted from an original document so that it can be processed by computer systems, and it is usually represented as a list of words. Because not all words in a document reflect its content, it is necessary to select the important words related to its content. Such important words are called keywords, and they are selected with a particular equation based on TF (Term Frequency) and IDF (Inverse Document Frequency). In practice, not only TF and IDF but also the position of each word in the document and whether the word appears in the title should be considered when selecting keywords from the words in the text. An equation based on all these factors becomes too complicated to apply to keyword selection. This paper proposes a neural network model, back propagation, in which these factors are used as features to generate feature vectors, and with which keywords are selected. This paper shows that back propagation outperforms the equation in distinguishing keywords. © Springer-Verlag Berlin Heidelberg 2003.
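The abstract names four factors (TF, IDF, word position, and inclusion in the title) but does not give their exact definitions. As a rough illustration only, the Python sketch below builds one feature vector per word from those four factors. The function names `tokenize` and `word_features`, the smoothed IDF formula, and the normalized first-occurrence position are all assumptions made for this example, not the paper's definitions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip(".,;:!?()\"'") for t in text.lower().split())
    return [t for t in tokens if t]


def word_features(title, body, corpus):
    """Return {word: [tf, idf, in_title, first_position]} for each body word.

    Hypothetical helper: the feature definitions below are assumptions.
    `corpus` is a list of document strings used to estimate IDF.
    """
    body_tokens = tokenize(body)
    title_tokens = set(tokenize(title))
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_sets)
    n_tokens = len(body_tokens)

    features = {}
    for word, count in Counter(body_tokens).items():
        tf = count / n_tokens                       # relative term frequency
        df = sum(1 for s in doc_sets if word in s)  # documents containing word
        idf = math.log((1 + n_docs) / (1 + df))     # smoothed IDF (assumed form)
        in_title = 1.0 if word in title_tokens else 0.0
        first_position = body_tokens.index(word) / n_tokens  # 0.0 = first word
        features[word] = [tf, idf, in_title, first_position]
    return features


# Toy data, invented for illustration.
title = "Neural keyword extraction"
body = ("Keyword extraction selects important words. "
        "A neural network scores each keyword.")
corpus = [body,
          "Networks of sensors collect data.",
          "Important data arrives daily."]

features = word_features(title, body, corpus)
for word, vector in sorted(features.items()):
    print(word, [round(v, 3) for v in vector])
```

Vectors of this kind, paired with keyword or non-keyword labels, would serve as the input to a classifier such as the back-propagation network the abstract describes. The training step itself is not sketched here because the abstract gives no architectural details.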

Cite

Jo, T. (2003). Neural based approach to keyword extraction from documents. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2667, 456–461. https://doi.org/10.1007/3-540-44839-x_49
