Semi-structured Information Retrieval (SIR) allows users to narrow their search down to the element level. Since queries and XML documents can both be seen as hierarchically nested elements, we consider that their structural proximity can be evaluated through the similarity of their trees. Our approach combines content and structure scores, the latter being based on tree edit distance (the minimal cost of the operations needed to turn one tree into another). We use the tree structure to propagate and combine both measures. Moreover, to reduce time and space complexity, we summarize the document tree structure. We experimented with various tree summary techniques as well as our original model on the SSCAS task of the INEX 2005 campaign. Results showed that our approach outperforms state-of-the-art approaches. © 2011 Springer-Verlag Berlin Heidelberg.
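To make the notion of tree edit distance concrete, the following is a minimal sketch of the classic recursive formulation for ordered, labeled trees. It is not the authors' implementation: the unit costs for insertion, deletion, and relabeling, the tuple-based tree encoding, and the names `tree`, `forest_distance`, and `tree_edit_distance` are all assumptions made for illustration. The abstract does not state the cost model used in the paper.

```python
from functools import lru_cache

# Hypothetical encoding: a tree is (label, children), children is a tuple of trees.
def tree(label, *children):
    return (label, tuple(children))

# Assumption: every edit operation costs 1 (the paper's costs may differ).
INSERT_COST = 1
DELETE_COST = 1
RELABEL_COST = 1

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests (tuples of trees)."""
    if not f and not g:
        return 0
    if not f:
        # Insert the rightmost root of g; its children take its place.
        _, kids = g[-1]
        return forest_distance(f, g[:-1] + kids) + INSERT_COST
    if not g:
        # Delete the rightmost root of f; its children take its place.
        _, kids = f[-1]
        return forest_distance(f[:-1] + kids, g) + DELETE_COST
    (label_f, kids_f), (label_g, kids_g) = f[-1], g[-1]
    delete = forest_distance(f[:-1] + kids_f, g) + DELETE_COST
    insert = forest_distance(f, g[:-1] + kids_g) + INSERT_COST
    # Match the two rightmost roots: compare their children and the rest.
    match = (
        forest_distance(kids_f, kids_g)
        + forest_distance(f[:-1], g[:-1])
        + (0 if label_f == label_g else RELABEL_COST)
    )
    return min(delete, insert, match)

def tree_edit_distance(t1, t2):
    """Minimal total cost of turning tree t1 into tree t2."""
    return forest_distance((t1,), (t2,))

# A structural query and a document path that has one extra element, "body".
query = tree("article", tree("sec", tree("p")))
doc = tree("article", tree("body", tree("sec", tree("p"))))
print(tree_edit_distance(query, doc))  # 1: a single insertion of "body"
```

This memoized recursion is only practical for small trees. Its cost grows quickly with tree size, which is the kind of complexity that summarizing the document tree structure is meant to reduce.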
CITATION STYLE
Laitang, C., Boughanem, M., & Pinel-Sauvagnat, K. (2011). XML information retrieval through tree edit distance and structural summaries. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7097 LNCS, pp. 73–83). https://doi.org/10.1007/978-3-642-25631-8_7