A layout-independent web news article contents extraction method based on relevance analysis


Abstract

Traditional methods for extracting article content from Web news pages are time-consuming and require substantial maintenance because they analyze the layout of news pages to generate wrappers, either manually or automatically. In this paper, we propose a relevance-based analysis method that extracts article content from news pages without analyzing page layout beforehand. The method is applicable to general news pages, and we present implementations of news extraction for different kinds of news sources. © 2009 Springer Berlin Heidelberg.

Cite

Han, H., & Tokuda, T. (2009). A layout-independent web news article contents extraction method based on relevance analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5648 LNCS, pp. 453–460). https://doi.org/10.1007/978-3-642-02818-2_37
