The traditional Vector Space Model (VSM) cannot represent both the structure and the content of XML documents. This paper introduces a novel method that represents XML documents in a Tensor Space Model (TSM) and then uses that representation for clustering. Empirical analysis shows that the proposed method scales to large datasets and that the factorized matrices it produces improve cluster quality, because they enrich the document representation with both structure and content information. © 2011 Springer-Verlag.
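To make the idea of a tensor representation concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not specify how the tensor is built or factorized, so this sketch assumes a third-order count tensor (document × structural path × term) and uses a truncated SVD of the document-mode unfolding as a stand-in for the decomposition. The function names `build_tensor` and `document_factors` and the toy data are hypothetical.

```python
import numpy as np

def build_tensor(docs, paths, terms):
    """Count tensor of shape (n_docs, n_paths, n_terms).

    docs is a list of documents, each a list of (path, term) pairs,
    i.e. a term together with the structural path it occurs under.
    """
    path_index = {p: i for i, p in enumerate(paths)}
    term_index = {t: i for i, t in enumerate(terms)}
    tensor = np.zeros((len(docs), len(paths), len(terms)))
    for d, pairs in enumerate(docs):
        for path, term in pairs:
            tensor[d, path_index[path], term_index[term]] += 1.0
    return tensor

def document_factors(tensor, rank):
    """Low-rank document factors (assumed stand-in for the paper's factorization).

    Unfolds the tensor along the document mode (one row per document)
    and keeps the top `rank` singular directions.
    """
    unfolded = tensor.reshape(tensor.shape[0], -1)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank] * s[:rank]

# Toy corpus: documents 0 and 1 share structure and content;
# document 2 shares the term "xml" but under a different path.
docs = [
    [("article/title", "xml"), ("article/body", "clustering")],
    [("article/title", "xml"), ("article/body", "clustering"),
     ("article/body", "tensor")],
    [("book/author", "smith"), ("book/title", "xml")],
]
paths = sorted({p for doc in docs for p, _ in doc})
terms = sorted({t for doc in docs for _, t in doc})

tensor = build_tensor(docs, paths, terms)
factors = document_factors(tensor, rank=2)

# Distances between documents in the factor space.
dist_01 = np.linalg.norm(factors[0] - factors[1])
dist_02 = np.linalg.norm(factors[0] - factors[2])
```

In this toy example a term-only VSM would see the shared term "xml" as overlap between documents 0 and 2, whereas the path-aware tensor keeps them apart. The rows of `factors` could then be passed to any standard clustering algorithm, such as k-means.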
Kutty, S., Nayak, R., & Li, Y. (2011). XML documents clustering using a tensor space model. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6634 LNAI, pp. 488–499). Springer Verlag. https://doi.org/10.1007/978-3-642-20841-6_40