Conventional tools for automatic metadata creation mostly extract named entities or patterns from texts and annotate them with information about persons, locations, dates, and so on. However, this kind of entity type information is often too primitive for more advanced intelligent applications such as concept-based search. Here, we try to generate semantically-deep metadata with limited human intervention. The main idea behind our approach is to use Web mining and categorization techniques to create thematic metadata. The proposed approach, comprises of three computational modules: feature extraction, HCQF (hier-concept query formulation) and text instance categorization. The feature extraction module sends the name of text instances to Web search engines, and the returned highly-ranked search-result pages are used to describe them.
Mendeley saves you time finding and organizing research
Choose a citation style from the tabs below