An experience developing a semantic annotation system in a media group

Angel L. Garrido; Oscar Gómez; Sergio Ilarri; Eduardo Mena

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7337 LNCS 333-338

DOI: 10.1007/978-3-642-31178-9_43

19 Citations

7 Readers

Abstract

Media companies today have difficulty managing large volumes of agency news and in-house articles. Journalists and documentalists face categorization tasks every day. An additional difficulty is the typically large vocabulary of a thesaurus, the usual tool for tagging news in the media. In this paper, we present a new method for information extraction over a set of texts whose annotations must be composed of thesaurus elements. The method consists of applying lemmatization, obtaining keywords, and finally using a combination of Support Vector Machines (SVM), ontologies, and heuristics to deduce appropriate tags for the annotation. We evaluated it on a real, changing set of news items and compared our tagging with the annotation performed by a real documentation department, obtaining very good results. © 2012 Springer-Verlag.
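
The abstract gives the pipeline only in outline, so the following is a minimal sketch of its shape rather than the authors' implementation. It assumes a toy lemma table, stopword list, and keyword-to-descriptor thesaurus (all hypothetical), and it replaces the SVM and ontology stage with a plain lookup plus a threshold heuristic so that it runs with the Python standard library alone.

```python
import re
from collections import Counter

# Hypothetical toy resources; a real system would rely on a full
# lemmatizer and the media group's own thesaurus.
LEMMAS = {"elections": "election", "voted": "vote",
          "votes": "vote", "parties": "party"}
STOPWORDS = {"the", "in", "a", "of", "and", "for", "on"}
THESAURUS = {"election": "POLITICS", "vote": "POLITICS",
             "party": "POLITICS", "goal": "SPORTS", "match": "SPORTS"}


def lemmatize(text):
    """Lowercase, tokenize, drop stopwords, and map tokens to lemmas."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(t, t) for t in tokens if t not in STOPWORDS]


def keywords(lemmas, top_n=5):
    """Return the top_n most frequent lemmas as candidate keywords."""
    return [word for word, _ in Counter(lemmas).most_common(top_n)]


def suggest_tags(text, min_hits=2):
    """Stand-in for the SVM/ontology stage (an assumption of this sketch):
    keep a thesaurus descriptor only if at least min_hits keywords map to it."""
    hits = Counter(THESAURUS[k] for k in keywords(lemmatize(text))
                   if k in THESAURUS)
    return sorted(tag for tag, count in hits.items() if count >= min_hits)


if __name__ == "__main__":
    print(suggest_tags("The parties voted in the elections and the votes were counted"))
```

The `min_hits` threshold only illustrates where a heuristic can filter weak evidence; in the method described above, that decision is made by the combination of SVM, ontologies, and heuristics.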

Cite

Garrido, A. L., Gómez, O., Ilarri, S., & Mena, E. (2012). An experience developing a semantic annotation system in a media group. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7337 LNCS, pp. 333–338). https://doi.org/10.1007/978-3-642-31178-9_43
