Integrating content and structure learning: A model of hypertext zoning and sounding

Alexander Mehler; Ulli Waltinger

Book Chapter

Studies in Computational Intelligence (2011) 370 299-329

DOI: 10.1007/978-3-642-22613-7_15

Abstract

The bag-of-words model is accepted as the first choice when it comes to representing the content of web documents. It benefits from a low time complexity, but this comes at the cost of ignoring document structure. Obviously, there is a trade-off between the range of document modeling and its computational complexity. In this chapter, we present a model of content and structure learning that tackles this trade-off with a focus on delimiting documents as instances of webgenres. We present and evaluate a two-level algorithm of hypertext zoning that integrates the genre-related classification of web documents with their segmentation. In addition, we present an algorithm of hypertext sounding with respect to the thematic demarcation of web documents. © 2011 Springer-Verlag Berlin Heidelberg.

Cite

CITATION STYLE

APA

Mehler, A., & Waltinger, U. (2011). Integrating content and structure learning: A model of hypertext zoning and sounding. Studies in Computational Intelligence, 370, 299–329. https://doi.org/10.1007/978-3-642-22613-7_15
