Automatic text classification using an artificial neural network

This article is free to access.

Abstract

The growing volume of documents available on the World Wide Web has made document indexing and searching increasingly complex. This issue has motivated a considerable body of research in the text classification area. However, the techniques resulting from this research require human intervention to choose the most adequate parameters for carrying out the classification. Motivated by this limitation, this article presents a new model for automatic text classification. The model uses a self-organizing artificial neural network architecture, which does not require prior knowledge of the domains to be classified. The document features, represented as word radical frequencies, are submitted to this architecture, which generates clusters of similar documents. The model comprises stages of feature extraction, classification, labeling, and indexing of documents for searching purposes. The classification stage receives the radical frequency vectors and submits them to the ART-2A neural network, which classifies them and stores the patterns in clusters based on their similarity level. The labeling stage is responsible for extracting the significance level of each radical for each generated cluster. These significance levels are used to index the documents, providing support for the next stage, which comprises document searching. The main contributions of the proposed model are: distance measures to automate the setting of the vigilance parameter ρ, which is responsible for the classification quality, thus eliminating the need for human intervention in the parameterization process; a labeling algorithm that extracts the significance level of each word for each cluster generated by the neural network; and an automated classification methodology. © 2005 by International Federation for Information Processing.
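
To make the classification stage more concrete, the following is a minimal Python sketch of vigilance-based clustering over radical frequency vectors. It is not the authors' implementation: it is a simplified, ART-2A-like procedure that omits the contrast-enhancement threshold and the handling of uncommitted nodes found in the full ART-2A algorithm, and it takes ρ as a hand-set argument rather than deriving it from the distance measures proposed in the article. The function names, the learning rate `beta`, the example vocabulary, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_frequencies(tokens, vocabulary):
    """Count how often each vocabulary term (e.g. a word radical) occurs."""
    counts = Counter(tokens)
    return [float(counts[term]) for term in vocabulary]


def normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors stay unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


def art2a_like_cluster(vectors, rho=0.8, beta=0.1):
    """Cluster vectors with a vigilance test (simplified, ART-2A-like).

    rho  : vigilance parameter; higher values produce more, tighter clusters.
    beta : learning rate used to move the winning prototype (assumed value).
    """
    prototypes = []
    assignments = []
    for vector in vectors:
        x = normalize(vector)
        # Find the prototype most similar to the input (dot product of unit vectors).
        best_index, best_similarity = -1, -1.0
        for index, prototype in enumerate(prototypes):
            similarity = sum(a * b for a, b in zip(x, prototype))
            if similarity > best_similarity:
                best_index, best_similarity = index, similarity
        if best_index >= 0 and best_similarity >= rho:
            # Vigilance test passed: move the winning prototype towards the input.
            winner = prototypes[best_index]
            updated = [(1 - beta) * p + beta * a for p, a in zip(winner, x)]
            prototypes[best_index] = normalize(updated)
            assignments.append(best_index)
        else:
            # Vigilance test failed (or no prototypes yet): open a new cluster.
            prototypes.append(x)
            assignments.append(len(prototypes) - 1)
    return assignments, prototypes


# Hypothetical radicals and already-stemmed toy documents.
vocabulary = ["network", "neur", "cluster", "market", "stock"]
documents = [
    ["neur", "network", "cluster", "neur"],
    ["network", "neur", "neur", "cluster"],
    ["market", "stock", "stock"],
]
vectors = [term_frequencies(doc, vocabulary) for doc in documents]
assignments, prototypes = art2a_like_cluster(vectors, rho=0.8)
print(assignments)
```

In this sketch the first two documents share the same radical frequencies and fall into one cluster, while the third has no radicals in common with them and opens a second cluster. Raising `rho` towards 1 makes the vigilance test stricter, which is why the choice of ρ governs the classification quality.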

Cite

de Mello, R. F., Senger, L. J., & Yang, L. T. (2005). Automatic text classification using an artificial neural network. In IFIP Advances in Information and Communication Technology (Vol. 172, pp. 215–238). Springer New York LLC. https://doi.org/10.1007/0-387-24049-7_12
