TF-IDF schemes are widely used to generate feature vectors for documents. However, these schemes were designed to characterize a single document in isolation. To characterize Web pages with TF-IDF, the feature vector of a page should therefore also reflect the contents of the pages linked to it via hyperlinks. In this paper, we propose three methods of generating feature vectors for linked documents such as Web pages. To verify the effectiveness of the proposed methods, we compare them with current search engines and evaluate their retrieval accuracy using recall-precision curves.
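To make the idea concrete, the following is a minimal sketch of one way a page's TF-IDF vector could be made to reflect its out-linked pages. It is not one of the three methods proposed in the paper, whose details the abstract does not give. The blending rule (adding the averaged vectors of out-linked pages scaled by a weight `alpha`), the function names, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Compute plain TF-IDF vectors; docs maps page id -> list of tokens."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency: count each term once per page
    vectors = {}
    for page, tokens in docs.items():
        tf = Counter(tokens)
        total = len(tokens) or 1
        vectors[page] = {t: (c / total) * math.log(n / df[t]) for t, c in tf.items()}
    return vectors


def blend_with_outlinks(vectors, outlinks, alpha=0.5):
    """Hypothetical blending rule (an assumption, not the paper's method):
    add alpha times the mean vector of a page's out-linked pages."""
    blended = {}
    for page, vec in vectors.items():
        targets = [q for q in outlinks.get(page, []) if q in vectors]
        new_vec = dict(vec)
        for q in targets:
            for term, w in vectors[q].items():
                new_vec[term] = new_vec.get(term, 0.0) + alpha * w / len(targets)
        blended[page] = new_vec
    return blended


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Toy data (invented for illustration): page "a" links to page "b".
docs = {
    "a": ["search", "engine"],
    "b": ["vector", "space", "model"],
    "c": ["search", "ranking"],
}
outlinks = {"a": ["b"]}

vectors = tfidf_vectors(docs)
blended = blend_with_outlinks(vectors, outlinks, alpha=0.5)
```

In this toy example, page "a" shares no terms with page "b", so their plain TF-IDF vectors have zero cosine similarity. After blending, the vector for "a" gains nonzero weights for the terms of "b", so a query about "vector space model" could also match the page that links to it.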
CITATION STYLE
Sugiyama, K., Hatano, K., Yoshikawa, M., & Uemura, S. (2002). A method of improving feature vector for web pages reflecting the contents of their out-linked pages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2453, pp. 891–901). Springer Verlag. https://doi.org/10.1007/3-540-46146-9_88