A novel focused crawler based on cell-like membrane computing optimization algorithm

  • Liu W
  • Du Y

Abstract

In many research works, the topical priorities of unvisited hyperlinks are computed by linearly combining the topic-relevance similarities of various texts with corresponding weighting factors. However, these weighting factors are set from personal experience, so poorly chosen values can cause the computed priorities of unvisited hyperlinks to deviate seriously. To solve this problem, this paper proposes a novel focused crawler that applies a cell-like membrane computing optimization algorithm (CMCFC). The CMCFC treats the set of all weighting factors, which correspond to the contribution degrees of the similarities of the various texts, as one object, and uses evolution rules and communication rules in membranes to obtain the optimal object, that is, the optimal weighting factors that minimize the root mean square (RMS) error of the hyperlink priorities. It then linearly combines the optimal weighting factors with the corresponding topical similarities of the various texts, computed using a Vector Space Model (VSM), to obtain the priorities of unvisited hyperlinks. With more accurate priorities for unvisited URLs, the CMCFC guides the crawler to collect higher-quality web pages. The experimental results indicate that the proposed method improves the performance of focused crawlers by determining the weighting factors intelligently. In conclusion, the proposed approach is effective and significant for focused crawlers. © 2013 Elsevier B.V.
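
The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenizer, the raw term-frequency vectors, and the idea of measuring RMS error against reference priorities are all assumptions made here for illustration, and the membrane computing search itself is not shown. The sketch only covers the parts the abstract names: a VSM cosine similarity, a linear combination of similarities and weights, and an RMS error that an optimizer could minimize over candidate weights.

```python
import math
from collections import Counter


def cosine_similarity(text, topic):
    """VSM cosine similarity between raw term-frequency vectors (assumed weighting)."""
    a = Counter(text.lower().split())
    b = Counter(topic.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_priority(similarities, weights):
    """Linear combination of per-text similarities and their weights."""
    if len(similarities) != len(weights):
        raise ValueError("one weight is needed per similarity")
    return sum(w * s for w, s in zip(weights, similarities))


def rms_error(weights, samples):
    """RMS error of predicted priorities over (similarities, reference_priority) pairs.

    The reference priorities are a hypothetical stand-in for whatever ground
    truth the optimizer is scored against; the abstract does not specify it.
    """
    squared = [(link_priority(sims, weights) - ref) ** 2 for sims, ref in samples]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    topic = "focused web crawler"
    # Hypothetical texts associated with one unvisited hyperlink.
    texts = ["focused crawler survey", "download page", "web crawler design notes"]
    sims = [cosine_similarity(t, topic) for t in texts]
    weights = [0.5, 0.2, 0.3]  # hand-picked here; the paper's point is to optimize these
    print(link_priority(sims, weights))
```

Under these assumptions, an optimizer such as the one the paper proposes would treat the `weights` list as a single candidate object and search for the candidate with the lowest `rms_error`.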

Author-supplied keywords

  • Focused crawler
  • Membrane computing
  • Optimization algorithm
  • VSM
