A supervised learning to rank approach for dependency based concept extraction and repository based boosting for domain text indexing

U. K. Naadan; T. V. Geetha; U. Kanimozhi; D. Manjula; R. Viswapriya; C. Karthik

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 10859 LNCS 428-436

DOI: 10.1007/978-3-319-91947-8_44

1 Citation

2 Readers

Abstract

In conventional information retrieval systems, keywords extracted from documents are indexed and used for retrieval. Because the same information can be expressed with different keywords, retrieving relevant documents is hindered. Concept-based indexing and retrieval, which identifies semantically similar documents, overcomes this problem by mapping document phrases to a domain repository. This paper explores the problem of extracting and ranking concepts, i.e. key phrases, from domain-oriented text. Concepts are ranked not only on statistical and cue-phrase features but also on the dependency relations in which the candidate concept occurs. For each candidate, a vector is formed from the phrase weight and the dependency relations. The features used to score the phrases, to re-rank them, and to weight the vector corresponding to each candidate are the cue features (presence in the title or abstract), the C-value for multi-word phrases, the frequency of occurrence, and the type of dependency relation. The ranking process uses RankingSVM to rank the candidate concepts based on their feature vectors. In addition, to make the ranking domain sensitive and to determine the domain relevance of the candidate concepts, they are fully or partially matched against the domain repository. Based on the depth of the concept and the presence of a parent and siblings, domain-relevant concepts are boosted up the order. The results indicate that the dependency-based context vector and the domain repository substantially improve key phrase extraction compared with other methods.
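
As a rough illustration of two ingredients named in the abstract, the sketch below computes a C-value for multi-word candidates and applies a repository-based boost. It is not the authors' implementation. The C-value follows the commonly used nested-term formulation. The toy repository, the frequencies, the partial-match rule (one phrase contained in the other), the match strengths, and the `alpha`, `beta`, `gamma` weights are all assumptions made for the example. The RankingSVM step is not reproduced: `score` stands for whatever ranking score a learned model would assign.

```python
import math

# Hypothetical candidate phrases with their corpus frequencies.
FREQS = {
    "support vector machine": 4,
    "vector machine": 6,
    "information retrieval": 5,
}

# Hypothetical toy domain repository: concept -> parent (None marks the root).
REPO_PARENT = {
    "machine learning": None,
    "supervised learning": "machine learning",
    "support vector machine": "supervised learning",
    "decision tree": "supervised learning",
}


def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous token run inside a strictly longer phrase."""
    l, s = longer.split(), shorter.split()
    return len(l) > len(s) and any(
        l[i:i + len(s)] == s for i in range(len(l) - len(s) + 1)
    )


def c_value(phrase, freqs):
    """C-value: log2(length) * (frequency minus the mean frequency of longer phrases nesting it)."""
    nesting = [f for p, f in freqs.items() if contains(p, phrase)]
    f = freqs[phrase]
    if nesting:
        f -= sum(nesting) / len(nesting)
    return math.log2(len(phrase.split())) * f


def depth(concept):
    """Number of ancestors of a concept in the toy repository."""
    d = 0
    while REPO_PARENT[concept] is not None:
        concept = REPO_PARENT[concept]
        d += 1
    return d


def has_siblings(concept):
    """True if another concept shares the same (non-root) parent."""
    parent = REPO_PARENT[concept]
    return parent is not None and any(
        p == parent and c != concept for c, p in REPO_PARENT.items()
    )


def match(candidate):
    """Return (concept, strength): 1.0 for a full match, 0.5 for a partial one (assumed values)."""
    if candidate in REPO_PARENT:
        return candidate, 1.0
    for concept in REPO_PARENT:
        if contains(concept, candidate) or contains(candidate, concept):
            return concept, 0.5
    return None, 0.0


def boosted(score, candidate, alpha=0.1, beta=0.2, gamma=0.1):
    """Scale a ranking score by depth, parent, and sibling evidence (assumed weights)."""
    concept, strength = match(candidate)
    if concept is None:
        return score  # no repository evidence, so the score is left unchanged
    bonus = (
        alpha * depth(concept)
        + beta * (REPO_PARENT[concept] is not None)
        + gamma * has_siblings(concept)
    )
    return score * (1 + strength * bonus)


if __name__ == "__main__":
    for phrase in FREQS:
        base = c_value(phrase, FREQS)
        print(phrase, round(base, 3), round(boosted(base, phrase), 3))
```

In this toy setup, a candidate that fully matches a deep repository concept with a parent and siblings receives the largest boost, a partial match receives half of it, and a candidate absent from the repository keeps its original score.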

Citation (APA)

Naadan, U. K., Geetha, T. V., Kanimozhi, U., Manjula, D., Viswapriya, R., & Karthik, C. (2018). A supervised learning to rank approach for dependency based concept extraction and repository based boosting for domain text indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10859 LNCS, pp. 428–436). Springer Verlag. https://doi.org/10.1007/978-3-319-91947-8_44
