Chen et al. use standard relevance feedback procedures as a benchmark for testing three machine learning approaches to retrieval. ID3, using a training set of relevant and non-relevant documents, computes an entropy value for each attribute and then builds a decision tree in which the attribute at each node is chosen for its ability to reduce entropy. The genetic algorithm, in this adaptation, treats documents as individuals whose fitness is a similarity measure against the training set; fitness determines the probability of selection into the new population, to which mutation operators are then applied. Simulated annealing, a modification of the GA technique, applies a random mutation point to each document to generate a new candidate configuration, accepting those where the similarity measure increases. All three algorithms improve similarity scores, but not significantly more than relevance feedback does. When only more complex queries with larger answer sets are considered, however, the more sophisticated algorithms perform significantly better than relevance feedback. In a small search experiment, genetic algorithms outperformed the other methods in both precision and simulated recall. A review of the terms chosen suggests that relevance feedback did not identify the most crucial concepts, that ID3 over-generalized, and that the other two methods struck a good balance between these extremes.
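The genetic-algorithm adaptation described above can be sketched roughly as follows: candidate documents are binary term vectors, fitness is similarity to the relevant training documents, selection is fitness-proportional, and mutation flips individual term bits. All function names, the similarity measure (Jaccard), and the parameter values here are illustrative assumptions, not the authors' implementation.

```python
import random

def jaccard(a, b):
    """Similarity between two binary term vectors."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0

def fitness(doc, relevant_docs):
    """Average similarity of a candidate to the relevant training set."""
    return sum(jaccard(doc, r) for r in relevant_docs) / len(relevant_docs)

def evolve(population, relevant_docs, generations=50, mutation_rate=0.02):
    """Evolve a population of binary term vectors toward the training set."""
    for _ in range(generations):
        # Fitness-proportional selection; a tiny epsilon avoids a
        # degenerate all-zero weight vector.
        weights = [fitness(d, relevant_docs) + 1e-9 for d in population]
        population = random.choices(population, weights=weights,
                                    k=len(population))
        # Bit-flip mutation on each selected candidate.
        population = [
            [bit ^ (random.random() < mutation_rate) for bit in doc]
            for doc in population
        ]
    return max(population, key=lambda d: fitness(d, relevant_docs))
```

The simulated-annealing variant the summary describes would differ mainly in the update rule: a single random mutation point is applied to each document to produce one candidate configuration, which is kept only if its similarity score increases, rather than sampling a whole new population by fitness-weighted selection.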