Latent semantic analysis evaluation of conceptual dependency driven focused crawling

Abstract

In this paper we study a focused crawler driven by the deep semantic analysis provided by Conceptual Dependency (CD) theory. We test in practice the use of CD scripts as a way of defining topics (queries) in a focused crawler, and its robustness in evaluating real text structures extracted from HTML documents. To benchmark its efficiency against classical approaches, we complement human evaluation with an evaluation of the result set based on its internal similarity, computed using Latent Semantic Analysis (LSA). The measurements lead us to conclude that CD theory is well suited to evaluating the similarity of HTML documents to a specific query, as it achieves high precision as measured through human evaluation. At the same time, we observe drawbacks of LSA used in the same context. © 2012 Springer-Verlag.
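
As an illustration of what "internal similarity using LSA" can mean, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the term weighting, the tokenization, or the number of latent dimensions, so the raw term counts, whitespace tokenization, the parameter `k`, and the function name `lsa_internal_similarity` are all assumptions made for this example. It projects each document of a result set into a truncated SVD space and reports the mean pairwise cosine similarity.

```python
import numpy as np

def lsa_internal_similarity(docs, k=2):
    """Mean pairwise cosine similarity of docs in a k-dimensional LSA space.

    Assumptions (not from the paper): raw term counts, whitespace
    tokenization, and at least two non-empty documents.
    """
    if len(docs) < 2:
        raise ValueError("need at least two documents")

    # Build the vocabulary and a term-document count matrix (terms x docs).
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0

    # Truncated SVD: keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one row per document

    # Cosine similarity between every pair of document vectors.
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero
    unit = doc_vecs / norms
    sims = unit @ unit.T

    # Average over distinct pairs only (upper triangle, no diagonal).
    iu = np.triu_indices(len(docs), k=1)
    return float(sims[iu].mean())
```

A score near 1 indicates that the documents in the result set occupy a similar region of the latent space; a score near 0 indicates a topically mixed set. Such a score measures only how alike the retrieved documents are to one another, not whether they match the query, which is one reason it may disagree with human judgments of precision.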

Cite

Dorosz, K., & Korzycki, M. (2012). Latent semantic analysis evaluation of conceptual dependency driven focused crawling. In Communications in Computer and Information Science (Vol. 287 CCIS, pp. 77–84). https://doi.org/10.1007/978-3-642-30721-8_8
