In this paper, we study the problem of clustering textual units in order to help an expert build a specialized ontology. This work was carried out within a French project, BIOTIM, which handles botany corpora. Building an ontology, whether automatically or semi-automatically, is a difficult task. We focus on one of the main steps of that process: structuring the textual units occurring in the texts into classes that are likely to represent concepts of the domain. The approach we propose relies on a new non-symmetrical measure for evaluating the semantic proximity between lemmas, taking into account the contexts in which they occur in the documents. We also present an unsupervised classification algorithm designed for this task and this kind of data. The first experiments performed on botanical data have given relevant results. © Springer-Verlag Berlin Heidelberg 2006.
CITATION STYLE
Cleuziou, G., Billot, S., Lew, S., Martin, L., & Vrain, C. (2006). A proximity measure and a clustering method for concept extraction in an ontology building perspective. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4203 LNAI, pp. 697–706). Springer Verlag. https://doi.org/10.1007/11875604_77