This paper describes a new method for determining characteristic terms in texts by weighting them with extended PageRank calculations. The method also clusters the semantic term relations it finds and assigns each term a level of specificity, which makes it possible to distinguish general terms from specific ones and to differentiate between terms of different semantic orientations within the same specificity level. The experiments show which terms can be used for the automatic retrieval of semantically similar documents from large corpora such as the World Wide Web through automatic query formulation. Selecting query terms of different specificity levels is also a useful instrument in interactive document retrieval for expressing the intended similarity of the documents to be found. An added advantage of this method is that it does not rely on third-party datasets and works on single texts.
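To illustrate the general idea of PageRank-based term weighting on a single text, the following minimal sketch may help. It is not the authors' extended calculation, which the abstract does not specify. It assumes a plain weighted PageRank over a sentence-level co-occurrence graph, and the function names `cooccurrence_graph`, `pagerank`, and `top_terms`, the damping value of 0.85, and the naive tokenization (no stop-word removal, no stemming) are all illustrative assumptions.

```python
import re
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(text):
    """Link every pair of distinct terms that share a sentence (assumed graph model)."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in re.split(r"[.!?]+", text.lower()):
        terms = sorted(set(re.findall(r"[a-z]+", sentence)))
        for a, b in combinations(terms, 2):
            # Undirected edge; the weight counts shared sentences.
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain weighted PageRank by power iteration (not the paper's extended variant)."""
    nodes = list(graph)
    n = len(nodes)
    if n == 0:
        return {}
    rank = {t: 1.0 / n for t in nodes}
    # Every node in the graph has at least one edge, so these totals are positive.
    out_weight = {t: sum(graph[t].values()) for t in nodes}
    for _ in range(iterations):
        new_rank = {}
        for t in nodes:
            incoming = sum(rank[u] * w / out_weight[u] for u, w in graph[t].items())
            new_rank[t] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


def top_terms(text, k=3):
    """Return the k highest-ranked terms as candidate query words."""
    ranks = pagerank(cooccurrence_graph(text))
    return sorted(ranks, key=ranks.get, reverse=True)[:k]
```

Calling `top_terms(document_text, k=3)` on a single document would yield candidate search words without any external corpus, which matches the single-text setting described above. The clustering step and the specificity levels are not covered by this sketch.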
Kubek, M., & Unger, H. (2011). Search word extraction using extended pagerank calculations. Studies in Computational Intelligence, 391, 325–337. https://doi.org/10.1007/978-3-642-24806-1_25