Abstract
This paper presents an approach to structure learning for web documents, with the aim of supporting webgenre tagging in web corpus linguistics. A central finding is that purely structure-oriented approaches to web document classification provide an information gain that can be exploited in approaches combining web content and structure analysis.
Cite
Gleim, R., Mehler, A., & Dehmer, M. (2006). Web Corpus Mining by instance of Wikipedia. In EACL 2006 - 11th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the 2nd International Workshop on Web as Corpus, WAC 2006 (pp. 67–74). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1628297.1628307