A Semantics Enabled Intelligent Semi-structured Document Processor

Abstract

In recent years, the number of semi-structured documents available electronically has increased dramatically. Semi-structured documents are usually difficult to reuse because they lack explicit metadata. To enable integration and retrieval over semi-structured documents, their essential aspects should be described explicitly by metadata. Such metadata can be assigned to documents, and can represent part of their information content, using various information extraction (IE) techniques. This paper also provides a flexible user interaction mechanism to achieve better performance with fewer training sample documents. In semantic view extraction, we improve the rule learning procedure by using similarity-based rule induction. Experimental results show that our approach can significantly outperform most existing wrapper methods. We make use of the semantics that reside in the document's logical structure to help find relations between semantic entities. After the documents are semantically annotated, TIPSI allows them to be indexed with respect to the extracted text entities. To answer a query, TIPSI applies semantic restrictions over the entities in the knowledge base (KB). © Springer-Verlag Berlin Heidelberg 2014.

Cite

Zhang, K., Li, J. Z., Hong, M. C., Yan, X. D., & Song, Q. (2014). A Semantics Enabled Intelligent Semi-structured Document Processor. In Communications in Computer and Information Science (Vol. 426 CCIS, pp. 328–344). Springer Verlag. https://doi.org/10.1007/978-3-662-43908-1_41
