Machine learning approach for homepage finding task

Abstract

This paper describes new machine learning approaches for predicting the correct homepage in response to a user’s homepage finding query. The method has two phases. In the first phase, a decision tree is generated to predict whether a URL is a homepage URL. The decision tree is then used to filter out non-homepages from the web pages returned by a standard vector space information retrieval system. In the second phase, logistic regression is used to combine multiple sources of evidence about the homepages that remain after the first phase, in order to predict which homepage is most relevant to the user’s query. A set of 100 queries is used to train the logistic regression model, and a separate set of 145 testing queries is used to evaluate the derived model. Our results show that about 84% of the testing queries had the correct homepage returned within the top 10 pages. By comparison, without any machine learning, only 59% of the testing queries had the correct homepage returned within the top 10 hits, which indicates that the machine learning approaches are effective.
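The two-phase pipeline and the top-10 evaluation described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the hand-written URL rules in `looks_like_homepage` merely stand in for the learned decision tree, the two-element feature vectors are hypothetical evidence scores, and the weights and bias are made-up values rather than fitted coefficients.

```python
import math
from urllib.parse import urlparse


def looks_like_homepage(url: str) -> bool:
    """Phase 1 stand-in: hand-written rules in place of a learned decision tree.

    Assumption: shallow paths that end in a directory or an index-style
    file are treated as homepages. The paper's actual tree is not shown here.
    """
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    if not segments:
        return True  # bare host, e.g. http://example.org/
    last = segments[-1].lower()
    is_index_file = last.startswith(("index.", "default.", "home."))
    return len(segments) <= 2 and (path.endswith("/") or is_index_file)


def logistic_score(features, weights, bias):
    """Phase 2: combine evidence scores into a relevance probability."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def rank_homepages(candidates, weights, bias, k=10):
    """Filter retrieved pages with phase 1, then rank survivors with phase 2.

    candidates: list of (url, feature_vector) pairs from a retrieval system.
    """
    kept = [(url, f) for url, f in candidates if looks_like_homepage(url)]
    ranked = sorted(
        kept,
        key=lambda c: logistic_score(c[1], weights, bias),
        reverse=True,
    )
    return [url for url, _ in ranked[:k]]


def success_at_k(results, answers, k=10):
    """Fraction of queries whose correct homepage appears in the top k."""
    hits = sum(1 for q, urls in results.items() if answers[q] in urls[:k])
    return hits / len(results)


# Hypothetical retrieval output for one query: (url, [evidence_1, evidence_2]).
candidates = [
    ("http://example.org/docs/2001/report.html", [0.9, 0.1]),
    ("http://example.org/", [0.6, 0.8]),
    ("http://example.org/~smith/", [0.4, 0.2]),
]
WEIGHTS, BIAS = [1.0, 2.0], -1.0  # made-up values, not fitted coefficients

top = rank_homepages(candidates, WEIGHTS, BIAS)
rate = success_at_k({"q1": top}, {"q1": "http://example.org/"})
```

In this toy run the deep report page is dropped by the filter before ranking, so it cannot outrank a true homepage even though its first evidence score is the highest. In a real system the weights would be fitted on the training queries, and `success_at_k` would be computed over the held-out testing queries.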

Cite

Xi, W., Fox, E. A., Tan, R. P., & Shu, J. (2002). Machine learning approach for homepage finding task. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2476, pp. 145–159). Springer Verlag. https://doi.org/10.1007/3-540-45735-6_14
