In this paper we describe information extraction from the web pages of scientific conferences. We enrich known features with new features specific to this domain and show their importance in the extraction process. Moreover, we investigate various data representation models, e.g., based on single tokens or sequences, in order to find the best configuration for the task and to set a new baseline on a publicly available corpus.
CITATION STYLE
Andruszkiewicz, P., & Hazan, R. (2018). Domain specific features driven information extraction from web pages of scientific conferences. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10761 LNCS, pp. 405–417). Springer Verlag. https://doi.org/10.1007/978-3-319-77113-7_32