Information extraction using XPath

Masashi Okada; Naohiro Ishii; Ippei Torii

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6278 LNAI(PART 3) 104-112

DOI: 10.1007/978-3-642-15393-8_13

Citations: 1

Readers: 2

Abstract

To improve document classification accuracy, it is important to characterize not only words but also the relations among them. Classification from this point of view requires a different approach to document analysis. This paper first develops a simple XPath-based method for finding a pattern tree as an embedded sub-tree of an XML data tree. This method applies to searching XML documents for characteristic words and their relations. The second problem is to determine which words and relations occur in the XML documents, that is, to find the most frequent patterns, often called the most frequent sub-trees in the XML domain. This problem is also solved simply by applying XPath. © 2010 Springer-Verlag.
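As a rough illustration of the idea in the abstract, the sketch below checks whether a pattern tree occurs as an embedded sub-tree by mapping each pattern edge to the XPath descendant step `.//tag`. It is not the authors' algorithm. The nested-tuple pattern format, the helper names `find_embedded`, `_matches`, and `support`, and the sample document are all assumptions made for this example. It also simplifies the usual definition: sibling order is ignored, and two pattern nodes may map to the same data node.

```python
import xml.etree.ElementTree as ET


def _matches(elem, children):
    # Every child pattern must be found somewhere below `elem`.
    # ".//tag" is the XPath descendant step (ancestor-descendant relation).
    return all(
        any(_matches(d, sub) for d in elem.findall(f".//{tag}"))
        for tag, sub in children
    )


def find_embedded(root, pattern):
    """Return elements of `root` at which `pattern` is embedded.

    `pattern` is a hypothetical nested tuple: (tag, [child_patterns]).
    """
    tag, children = pattern
    # root.iter(tag) visits root itself and all of its descendants.
    return [e for e in root.iter(tag) if _matches(e, children)]


def support(docs, pattern):
    """Count the documents that contain `pattern` at least once."""
    return sum(1 for d in docs if find_embedded(d, pattern))


# Hypothetical sample document, invented for this sketch.
doc = ET.fromstring(
    "<article>"
    "<section><title>XPath</title><para><term>tree</term></para></section>"
    "<section><title>Other</title></section>"
    "</article>"
)

# Pattern: a <section> with a <title> and a <term> anywhere beneath it.
pattern = ("section", [("title", []), ("term", [])])

matches = find_embedded(doc, pattern)
print(len(matches))  # prints 1: only the first section contains a <term>
```

Counting `support` over a document collection for each candidate pattern is one simple way to rank patterns by frequency. A full frequent sub-tree miner would also need to generate the candidate patterns, which this sketch leaves out.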

Cite

APA

Okada, M., Ishii, N., & Torii, I. (2010). Information extraction using XPath. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6278 LNAI, pp. 104–112). https://doi.org/10.1007/978-3-642-15393-8_13
