Extracting topic maps from web pages

Motohiro Mase; Seiji Yamada; Katsumi Nitta

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5433 LNAI 169-180

DOI: 10.1007/978-3-642-00399-8_15

Abstract

We propose a framework for extracting topic maps from a set of Web pages. We apply a clustering method to the Web pages and extract topic map prototypes. We extend an existing clustering method in two ways. First, only linked Web pages are merged, which extracts the underlying relationships between topics. Second, we introduce a weighting based on the content similarity of the Web pages and the relevance between the topics of the pages. The relevance is based on the types of links with respect to the directories in the Web site structure and on the distance between the directories in which the pages are located. We generate topic map prototypes from the clustering results. Finally, users complete a prototype by labeling the topics and associations and removing unnecessary items. In this paper, as a first step, we implemented the proposed clustering method and used it to extract a prototype. © Springer-Verlag Berlin Heidelberg 2009.
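
The two ideas in the abstract (merging only linked pages, and weighting by content similarity and directory-based relevance) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so everything below is an assumption made for illustration. In particular, the function names, the bag-of-words cosine similarity, the relevance formula 1 / (1 + directory distance), the product used to combine the two scores, and the greedy threshold-based merging are all hypothetical, and the link-type component of the relevance is omitted entirely.

```python
import math
from collections import Counter
from posixpath import dirname
from urllib.parse import urlparse


def directory_distance(url_a, url_b):
    """Steps between the two pages' directories in the site's directory tree."""
    parts_a = [p for p in dirname(urlparse(url_a).path).split("/") if p]
    parts_b = [p for p in dirname(urlparse(url_b).path).split("/") if p]
    common = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        common += 1
    # Walk up from one directory to the shared ancestor, then down to the other.
    return (len(parts_a) - common) + (len(parts_b) - common)


def cosine(text_a, text_b):
    """Cosine similarity of whitespace-tokenized term counts (assumed measure)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def edge_weight(url_a, url_b, pages):
    """Assumed weighting: content similarity scaled by a directory-based relevance."""
    relevance = 1.0 / (1.0 + directory_distance(url_a, url_b))
    return cosine(pages[url_a], pages[url_b]) * relevance


def cluster_linked_pages(pages, links, threshold):
    """Greedily merge clusters, but only across pairs of pages joined by a link."""
    cluster_of = {url: frozenset([url]) for url in pages}
    weighted = sorted(
        ((edge_weight(a, b, pages), a, b) for a, b in links), reverse=True
    )
    for weight, a, b in weighted:
        if weight < threshold:
            break  # remaining links are weaker still
        if cluster_of[a] != cluster_of[b]:
            merged = cluster_of[a] | cluster_of[b]
            for url in merged:
                cluster_of[url] = merged
    return set(cluster_of.values())
```

Here `pages` is assumed to map each URL to its text and `links` to be a list of `(url, url)` pairs. Because candidates for merging come only from `links`, two pages with near-identical text stay in separate clusters unless a hyperlink connects them, which is the constraint the abstract describes. Each resulting cluster would then stand for a candidate topic in a prototype.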

Citation (APA)

Mase, M., Yamada, S., & Nitta, K. (2009). Extracting topic maps from web pages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5433 LNAI, pp. 169–180). Springer Verlag. https://doi.org/10.1007/978-3-642-00399-8_15
