A time series model of the writing process

Zeev Volkovich

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 9729, pp. 128–142 (2016)

DOI: 10.1007/978-3-319-41920-6_10

Abstract

The huge number of anonymous documents on the Internet motivates the study of new methods for authorship recognition. The principal weakness of existing methods in this area is that they assess the similarity of text styles without regard to the surrounding text. This paper proposes a novel mathematical model of the writing process that aims to quantify this dependency. A text is divided into a series of sequential sub-documents, each represented by a term histogram. The proximity of two histograms is estimated with a simple probability distance. To characterize the writing style, a new feature is introduced: the mean distance between the current sub-document and a number of earlier ones. The empirical distribution of this feature over the whole document specifies the writing style. Dissimilar distributions therefore indicate different writing styles, while coinciding distributions imply the same style. Numerical experiments demonstrate the high potential of the proposed approach.
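The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not name the probability distance, the sub-document size, the number of earlier sub-documents, or the test used to compare distributions. The sketch assumes whitespace tokenization, total variation distance between histograms, and a two-sample Kolmogorov-Smirnov statistic for comparing the empirical distributions. All function names and the parameters `chunk_size` and `lookback` are hypothetical.

```python
from collections import Counter


def split_chunks(tokens, size):
    """Split a token list into consecutive, non-overlapping chunks of `size` tokens.
    A trailing remainder shorter than `size` is dropped."""
    return [tokens[i:i + size] for i in range(0, len(tokens) - size + 1, size)]


def histogram(chunk):
    """Term histogram of a chunk, normalized to relative frequencies."""
    counts = Counter(chunk)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def tv_distance(p, q):
    """Total variation distance between two histograms.
    Assumption: stands in for the 'simple probability distance' of the abstract."""
    terms = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in terms)


def mean_distance_series(text, chunk_size=50, lookback=3):
    """For each sub-document, the mean distance to its `lookback` predecessors.
    The first `lookback` sub-documents have too few predecessors and are skipped."""
    tokens = text.lower().split()  # assumption: naive whitespace tokenization
    hists = [histogram(c) for c in split_chunks(tokens, chunk_size)]
    series = []
    for i in range(lookback, len(hists)):
        earlier = hists[i - lookback:i]
        series.append(sum(tv_distance(hists[i], h) for h in earlier) / lookback)
    return series


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical CDFs of two non-empty samples.
    Assumption: one possible way to compare the empirical distributions."""
    points = sorted(set(a) | set(b))

    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)
```

Under these assumptions, two documents would be compared by computing `mean_distance_series` for each and passing the two resulting series to `ks_statistic`; a value near 0 suggests similar styles, and a value near 1 suggests different ones. Any decision threshold would have to be calibrated on real data.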

Cite

Volkovich, Z. (2016). A time series model of the writing process. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9729, pp. 128–142). Springer Verlag. https://doi.org/10.1007/978-3-319-41920-6_10
