A framework for web table mining

  • Yang Y
  • Luk W

Abstract

Web table mining is the extraction of information from tables published inside web pages as HTML text. Most previous work on this subject uses the HTML tags to discover the components of a table. Our work treats the web as a distinct publication medium, in two ways. First, we argue that new types of table format have been developed specifically for the web. Second, we argue that authors use the visual cues embedded within the HTML text to direct the viewer on how to read the contents of a web table properly. We develop a framework for comprehensively analyzing the structural aspects of a web table, within which rules are devised to process the table and extract attribute-value pairs from it. This approach to web table mining is validated by good experimental results.
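
To make the term "attribute-value pairs" concrete, the following is a minimal sketch of the simpler tag-based baseline that the abstract attributes to previous work. It is not the authors' framework and ignores visual cues entirely. It assumes a plain table whose first row holds the attribute names; the names TableRows, attribute_value_pairs and SAMPLE are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the open cell, or None

    def _flush_cell(self):
        # Close the open cell (if any) and store its text in the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._flush_cell()  # tolerate an omitted closing cell tag
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._flush_cell()
        elif tag == "tr" and self._row is not None:
            self._flush_cell()
            self.rows.append(self._row)
            self._row = None


def attribute_value_pairs(html):
    """Pair each data cell with the first-row cell in the same column."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample table, invented for this illustration.
SAMPLE = """
<table>
  <tr><th>Model</th><th>Price</th></tr>
  <tr><td>A100</td><td>120</td></tr>
  <tr><td>B200</td><td>95</td></tr>
</table>
"""

records = attribute_value_pairs(SAMPLE)
```

A sketch like this breaks down on tables whose attribute names run down a column, span several cells, or are marked only by styling such as bold text, which is the kind of case the abstract says the visual cues are meant to address.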


Authors

  • Yingchen Yang

  • Wo-Shun Luk
