A natural and multi-layered approach to detect changes in tree-based textual documents

Angelo Di Iorio; Michele Schirinzi; Fabio Vitali; Carlo Marchetti

Conference Proceedings

Lecture Notes in Business Information Processing (2009), Vol. 24 LNBIP, pp. 90–101

DOI: 10.1007/978-3-642-01347-8_8

Abstract

Several efficient and very powerful algorithms exist for detecting changes in tree-based textual documents, such as those encoded in XML. An important aspect is still underestimated in their design and implementation: the quality of the output, in terms of readability, clarity and accuracy for human users. This requirement is particularly relevant when diff-ing literary documents, such as books, articles, reviews, acts, and so on. This paper introduces the concept of 'naturalness' in diff-ing tree-based textual documents, and discusses a new extensible set of changes which can and should be detected. A naturalness-based algorithm is presented, as well as its application for diff-ing XML-encoded legislative documents. The algorithm, called JNDiff, proved to detect significantly better matchings (since new operations are recognized) and to be very efficient. © 2009 Springer Berlin Heidelberg.

Cite

Di Iorio, A., Schirinzi, M., Vitali, F., & Marchetti, C. (2009). A natural and multi-layered approach to detect changes in tree-based textual documents. In Lecture Notes in Business Information Processing (Vol. 24 LNBIP, pp. 90–101). Springer Verlag. https://doi.org/10.1007/978-3-642-01347-8_8
