Currently, the main obstacle to the development of the Semantic Web is the manual tagging of web pages according to a given ontology that conceptualizes their domain. This task is usually hard, even for experts, and it is prone to errors because users can interpret the same documents differently. In this paper we address the problem of automatically generating ontology instances from a collection of unstructured documents (e.g., plain text or HTML pages). These instances populate the Semantic Web described by the ontology. The proposed approach combines Information Extraction techniques (mainly entity recognition), information merging, and Text Mining techniques. It has been successfully applied to the development of a Semantic Web for archaeology research. © Springer-Verlag Berlin Heidelberg 2004.
Danger, R., Berlanga, R., & Ruíz-Shulcloper, J. (2004). CRISOL: An approach for automatically populating semantic web from unstructured text collections. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3180, 243–252. https://doi.org/10.1007/978-3-540-30075-5_24