Given the explosion in the volume of data circulating on the web (satellite data, genomic data, ...), classifying these data (a data mining technique) has become necessary. The clustering was performed with a bio-inspired method (social spiders), because there is currently no learning method that can represent unstructured data (text) almost directly. Good data classification therefore requires a good representation of the data. Each document is represented by a vector whose components are weights computed over the whole corpus (TF-IDF). Text documents were represented with a language-independent method, namely character and word n-grams. Several similarity measures were tested. To validate the classification, we used an evaluation measure based on recall and precision (F-measure). © 2012 Springer-Verlag Berlin Heidelberg.
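The representation and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the trigram size, the particular TF-IDF weighting (relative term frequency times log inverse document frequency), the choice of cosine similarity, and all function names are assumptions made for illustration, and the social-spider clustering itself is not shown. The F-measure is given in its usual harmonic-mean form.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n=3):
    """Build one sparse TF-IDF vector (a dict) per document.

    Assumed weighting: (count / total n-grams in doc) * log(N / df).
    """
    counts = [Counter(char_ngrams(doc, n)) for doc in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())
    num_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            gram: (tf / total) * math.log(num_docs / df[gram])
            for gram, tf in c.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (one possible measure)."""
    dot = sum(w * v.get(gram, 0.0) for gram, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy corpus, used only to exercise the functions.
docs = ["social spiders", "social spider web", "genomic data"]
vectors = tfidf_vectors(docs)
sim_related = cosine(vectors[0], vectors[1])
sim_unrelated = cosine(vectors[0], vectors[2])
```

Because character n-grams need no tokenizer, stemmer, or stop-word list, the same code runs unchanged on text in any language, which is the property the abstract relies on. Word n-grams would only require replacing `char_ngrams` with a function that slides over a list of words.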
Hamou, R. M., Amine, A., & Rahmani, M. (2012). A new biomimetic approach based on social spiders for clustering of text. In Studies in Computational Intelligence (Vol. 430, pp. 17–30). https://doi.org/10.1007/978-3-642-30460-6_2