This paper presents an information extraction system that processes the textual content of classified newspaper advertisements in French. The system uses both lexical information (words, regular expressions) and contextual information to structure the content of the ads on the basis of predefined thematic forms. The paper first describes the enhanced tagging mechanism used for extraction. A quantitative evaluation of the system is then provided: in a comparison with human annotators, the system achieved 99.0% precision and 99.8% recall for domain identification, and 73% accuracy for information extraction.
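To make the idea of filling a predefined thematic form from lexical cues more concrete, the following is a minimal sketch in Python. It is not the system described in the paper: the form (a real-estate ad), the slot names `rooms`, `surface_m2` and `price`, the regular expressions, and the sample ad are all hypothetical, and the sketch covers only the lexical side (regular expressions), not the contextual information or the enhanced tagging mechanism.

```python
import re

# Hypothetical thematic form for a real-estate ad. Slot names and patterns
# are illustrative assumptions, not the rules used in the paper.
FORM_PATTERNS = {
    "rooms": re.compile(r"(\d+(?:[.,]5)?)\s*pi[eè]ces?", re.IGNORECASE),
    "surface_m2": re.compile(r"(\d+)\s*m(?:2|²)", re.IGNORECASE),
    "price": re.compile(r"(?:fr\.?|chf)\s*(\d[\d' ]*\d|\d)", re.IGNORECASE),
}


def fill_form(ad_text):
    """Return a dict with one entry per slot; None when no pattern matches."""
    form = {}
    for slot, pattern in FORM_PATTERNS.items():
        match = pattern.search(ad_text)
        form[slot] = match.group(1) if match else None
    return form


# Made-up sample ad, for illustration only.
ad = "A louer, Lausanne, 3 pièces, 75 m2, Fr. 1200 par mois"
print(fill_form(ad))
# {'rooms': '3', 'surface_m2': '75', 'price': '1200'}
```

A purely pattern-based filler like this one takes the first match for each slot and cannot tell, for example, a rent from a deposit; resolving such cases is where contextual information of the kind the abstract mentions would come in.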
CITATION STYLE
Peleato, R. A., Chappelier, J. C., & Rajman, M. (2001). Automated information extraction out of classified advertisements. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1959, pp. 203–214). Springer Verlag. https://doi.org/10.1007/3-540-45399-7_17