A Semantics Enabled Intelligent Semi-structured Document Processor

Kuo Zhang; Juan Zi Li; Ming Cai Hong; Xue Dong Yan; Qiang Song

Conference Proceedings

Communications in Computer and Information Science (2014) 426 CCIS 328-344

DOI: 10.1007/978-3-662-43908-1_41

3 Citations

4 Readers

Abstract

In recent years, the number of semi-structured documents available electronically has increased dramatically. Semi-structured documents are usually difficult to reuse because they lack explicit metadata. To enable integration and retrieval over semi-structured documents, the essential aspects of the documents should be described explicitly by metadata. The metadata can be assigned to documents, and can present part of their information content, using various information extraction (IE) techniques. This paper also provides a flexible user interaction mechanism to achieve better performance with fewer training sample documents. In semantic view extraction, we improve the rule learning procedure by using similarity-based rule induction. Experimental results show that our approach can significantly outperform most of the existing wrapper methods. We make use of the semantics that reside in the document's logical structure to help find relations between semantic entities. After semantic annotation of the documents, TIPSI allows them to be indexed with respect to the extracted text entities. To answer a query, TIPSI applies semantic restrictions over the entities in the knowledge base (KB). © Springer-Verlag Berlin Heidelberg 2014.

Cite

Zhang, K., Li, J. Z., Hong, M. C., Yan, X. D., & Song, Q. (2014). A Semantics Enabled Intelligent Semi-structured Document Processor. In Communications in Computer and Information Science (Vol. 426 CCIS, pp. 328–344). Springer Verlag. https://doi.org/10.1007/978-3-662-43908-1_41
