In this paper we present an approach to ontology population from heterogeneous documents that describe commercial products in varied styles and levels of detail. Its originality lies in generating and progressively refining semantic annotations that identify the types of the products and their features, even though the initial information is of very poor quality. Documents are annotated using an ontology. The annotation process relies on an initial set of known instances, which is built from terminological elements added to the ontology. Our approach first applies semi-automated annotation techniques to a small dataset and then uses machine learning techniques to fully annotate the entire dataset. This work was motivated by specific application needs. Experiments were conducted on real-world datasets in the toy domain.
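The two-stage idea summarized above (seed annotations derived from ontology terms, then a learned model to annotate the remaining documents) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the concept names, the term lists, the helper functions, and the choice of a naive Bayes classifier are hypothetical and are not taken from the paper, which does not specify its learning technique here.

```python
from collections import Counter, defaultdict
import math

# Hypothetical ontology fragment: each concept lists terms that signal it.
ONTOLOGY_TERMS = {
    "Puzzle": {"puzzle", "jigsaw"},
    "Doll": {"doll"},
}

def tokenize(text):
    """Lowercase whitespace tokenization (a deliberate simplification)."""
    return text.lower().split()

def seed_annotate(doc):
    """Stage 1: return a concept only if exactly one concept's terms occur."""
    tokens = set(tokenize(doc))
    hits = [c for c, terms in ONTOLOGY_TERMS.items() if tokens & terms]
    return hits[0] if len(hits) == 1 else None

def train_naive_bayes(labelled):
    """Stage 2a: collect per-concept word counts from (doc, concept) pairs."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for doc, concept in labelled:
        doc_counts[concept] += 1
        word_counts[concept].update(tokenize(doc))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def classify(doc, model):
    """Stage 2b: pick the concept with the highest smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for concept in sorted(doc_counts):
        total_words = sum(word_counts[concept].values())
        score = math.log(doc_counts[concept] / total_docs)
        for w in tokenize(doc):
            if w in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing over the training vocabulary
                score += math.log(
                    (word_counts[concept][w] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best, best_score = concept, score
    return best

def populate(docs):
    """Annotate every document: seed labels first, classifier for the rest."""
    seeds = [(d, seed_annotate(d)) for d in docs]
    labelled = [(d, c) for d, c in seeds if c is not None]
    model = train_naive_bayes(labelled)
    return {d: (c if c is not None else classify(d, model)) for d, c in seeds}
```

Calling `populate` on a list of product descriptions returns a mapping from each description to a concept: descriptions that contain a seed term are labelled directly, and the others receive the label predicted by the model trained on those seeds. In this sketch the seed step is fully automatic, whereas the approach described above is semi-automated on the initial small dataset.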
CITATION STYLE
Alec, C., Reynaud-Delaître, C., Safar, B., Sellami, Z., & Berdugo, U. (2014). Automatic ontology population from product catalogs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8876, pp. 1–12). Springer Verlag. https://doi.org/10.1007/978-3-319-13704-9_1