With the Internet growing exponentially, topic-specific web crawlers are becoming increasingly popular in web data mining. How to order unvisited URLs has been studied in depth. We present the notion of a concept similarity context graph and propose a novel approach to topic-specific web crawling that calculates the prediction score of each unvisited URL from the similarity of concepts in Formal Concept Analysis (FCA), while improving retrieval precision and recall. We first build a concept lattice from the visited pages, extract from the lattice the core concepts that reflect the user's query topic, and then construct the concept similarity context graph based on the semantic similarities between the core concepts and the other concepts. © 2008 Springer-Verlag Berlin Heidelberg.
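The pipeline described above (concept lattice, core concepts, similarity-based scoring) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy page/term context, the choice of Jaccard similarity over concept intents, the rule that a core concept is one whose intent contains all topic terms, and every function name are assumptions made for illustration only. The abstract does not specify the similarity measure or how the context graph is layered.

```python
from itertools import combinations

# Hypothetical formal context: visited page -> set of terms found on it.
context = {
    "p1": {"crawler", "web", "topic"},
    "p2": {"crawler", "web"},
    "p3": {"lattice", "concept"},
    "p4": {"web", "concept"},
}
ALL_TERMS = frozenset().union(*context.values())


def intent_of(pages):
    """Terms shared by every page in `pages` (all terms if `pages` is empty)."""
    sets = [context[p] for p in pages]
    return frozenset(set.intersection(*sets)) if sets else ALL_TERMS


def extent_of(terms):
    """Pages that contain every term in `terms`."""
    return frozenset(p for p, t in context.items() if terms <= t)


def formal_concepts():
    """Naive enumeration: close every subset of pages (exponential, toy only)."""
    concepts = set()
    pages = list(context)
    for r in range(len(pages) + 1):
        for subset in combinations(pages, r):
            intent = intent_of(subset)
            concepts.add((extent_of(intent), intent))
    return concepts


def core_concepts(concepts, topic_terms):
    """Assumed rule: non-empty concepts whose intent covers the topic terms."""
    return [c for c in concepts if c[0] and topic_terms <= c[1]]


def jaccard(a, b):
    """Assumed similarity measure between two concept intents."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def prediction_score(page, cores):
    """Score links found on `page` by its best similarity to any core concept."""
    page_intent = frozenset(context[page])
    return max((jaccard(page_intent, intent) for _, intent in cores), default=0.0)


concepts = formal_concepts()
cores = core_concepts(concepts, frozenset({"crawler", "web"}))
# Pages whose outgoing links should be crawled first come first.
ranking = sorted(context, key=lambda p: prediction_score(p, cores), reverse=True)
```

In this sketch an unvisited URL inherits the score of the visited page that links to it, so the crawl frontier would be ordered by `ranking`. A fuller treatment would build the layered context graph from the similarity values rather than taking a single maximum.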
Yang, Y., Du, Y., Sun, J., & Hai, Y. (2008). A topic-specific web crawler with concept similarity context graph based on FCA. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5227 LNAI, pp. 840–847). https://doi.org/10.1007/978-3-540-85984-0_101