A natural and multi-layered approach to detect changes in tree-based textual documents

Abstract

Several efficient and powerful algorithms exist for detecting changes in tree-based textual documents, such as those encoded in XML. One important aspect is still underestimated in their design and implementation: the quality of the output, in terms of readability, clarity and accuracy for human users. This requirement is particularly relevant when diff-ing literary documents, such as books, articles, reviews, acts, and so on. This paper introduces the concept of 'naturalness' in diff-ing tree-based textual documents, and discusses a new, extensible set of changes that can and should be detected. A naturalness-based algorithm is presented, together with its application to diff-ing XML-encoded legislative documents. The algorithm, called JNDiff, proved to detect significantly better matchings (since new operations are recognized) and to be very efficient. © 2009 Springer Berlin Heidelberg.

Cite

Di Iorio, A., Schirinzi, M., Vitali, F., & Marchetti, C. (2009). A natural and multi-layered approach to detect changes in tree-based textual documents. In Lecture Notes in Business Information Processing (Vol. 24 LNBIP, pp. 90–101). Springer Verlag. https://doi.org/10.1007/978-3-642-01347-8_8
