Semantic marking method for non-text documents of website based on their context in hypertext clustering

Abstract

Initial indexing and structuring of information on the Internet are prerequisites for effective search, that is, for finding the information that best matches a user's query. Most existing approaches rely on text-based processing methods, which are time-consuming. The hyperlink structure of the web offers an alternative, but websites also contain information in non-text formats (images, videos, PDF files, etc.). Such documents are intended primarily for human perception rather than for automated processing. In this article, we propose a method that addresses this problem by semantically marking non-text documents based on their context in hypertext clustering. We also develop an approach to context-independent semantic clustering of a website using web-analytics information; it exploits the internal hypertext structure and user behavior statistics and does not require full-text content analysis. To this end, we represent the hypertext structure of the site as a graph and apply flow simulation algorithms to cluster its pages. We then describe each cluster semantically with a set of keywords. Non-text documents are connected by hyperlinks to some of the web clusters, so we take the keywords extracted for the related cluster as the document's semantic marking. We tested the proposed method on the website sstu.ru.
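
The pipeline described in the abstract (link graph, flow simulation clustering, keyword description, marking of non-text documents) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the flow simulation algorithm, so a Markov Cluster (MCL) style iteration is assumed here; edge weights stand in for hypothetical click counts from web analytics; and all page paths, page terms, and document links are invented sample data rather than real sstu.ru content.

```python
from collections import Counter

def mcl(nodes, weights, inflation=2.0, iterations=30):
    """MCL-style flow simulation (an assumed choice of algorithm)."""
    n = len(nodes)
    idx = {name: i for i, name in enumerate(nodes)}
    m = [[0.0] * n for _ in range(n)]
    for (a, b), w in weights.items():
        m[idx[a]][idx[b]] += w
        m[idx[b]][idx[a]] += w
    for i in range(n):
        m[i][i] = 1.0  # self-loops keep every column non-empty

    def normalise(mat):
        # Make each column sum to 1 (column-stochastic flow matrix).
        for j in range(n):
            total = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= total
        return mat

    m = normalise(m)
    for _ in range(iterations):
        # Expansion: flow spreads along two-step paths.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: strong flows are boosted, weak flows decay.
        m = normalise([[v ** inflation for v in row] for row in m])

    # Pages that still exchange flow belong to the same cluster (union-find).
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i in range(n):
        for j in range(n):
            if m[i][j] > 1e-6:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), set()).add(nodes[i])
    return [frozenset(g) for g in groups.values()]

def describe(clusters, page_terms, top_k=3):
    """Describe each cluster by its most frequent page terms."""
    keywords = {}
    for cluster in clusters:
        counts = Counter(t for page in sorted(cluster) for t in page_terms[page])
        keywords[cluster] = [t for t, _ in counts.most_common(top_k)]
    return keywords

def mark_documents(doc_links, clusters, keywords):
    """Give each non-text document the keywords of the clusters linking to it."""
    marking = {}
    for doc, pages in doc_links.items():
        tags = []
        for cluster in clusters:
            if cluster & set(pages):
                tags.extend(t for t in keywords[cluster] if t not in tags)
        marking[doc] = tags
    return marking

# Hypothetical sample data: two tightly linked sections joined by one weak link.
nodes = ["/admission", "/admission/bachelor", "/admission/master",
         "/science", "/science/labs", "/science/grants"]
weights = {  # hypothetical click counts between pages
    ("/admission", "/admission/bachelor"): 10,
    ("/admission", "/admission/master"): 10,
    ("/admission/bachelor", "/admission/master"): 10,
    ("/science", "/science/labs"): 10,
    ("/science", "/science/grants"): 10,
    ("/science/labs", "/science/grants"): 10,
    ("/admission", "/science"): 1,
}
page_terms = {  # hypothetical short terms per page, e.g. taken from titles
    "/admission": ["admission", "applicants"],
    "/admission/bachelor": ["admission", "bachelor"],
    "/admission/master": ["admission", "master"],
    "/science": ["research", "science"],
    "/science/labs": ["research", "labs"],
    "/science/grants": ["research", "grants"],
}
doc_links = {  # non-text document -> pages that link to it
    "/files/rules.pdf": ["/admission/bachelor"],
    "/img/lab.jpg": ["/science/labs"],
}

clusters = mcl(nodes, weights)
keywords = describe(clusters, page_terms)
marking = mark_documents(doc_links, clusters, keywords)
print(marking)
```

In this sketch the keywords come from short per-page terms rather than full-text analysis, in keeping with the abstract's stated constraint; how the authors actually obtain cluster keywords is not specified in the abstract.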

Papshev, S., Sytnik, A., Melnikova, N., & Bogomolov, A. (2019). Semantic marking method for non-text documents of website based on their context in hypertext clustering. In Studies in Systems, Decision and Control (Vol. 199, pp. 313–323). Springer International Publishing. https://doi.org/10.1007/978-3-030-12072-6_26
