The paper presents a novel methodology intended to distinguish between real and artificially generated manuscripts. The approach employs inherent differences between the human and artificially generated wring styles. Taking into account the nature of the generation process, we suggest that the human style is essentially more “diverse” and “rich” in comparison with an artificial one. In order to assess dissimilarities between fake and real papers, a distance between writing styles is evaluated via the dynamic dissimilarity methodology. From this standpoint, the generated papers are much similar in their own style and significantly differ from the human written documents. A set of fake documents is captured as the training data so that a real document is expected to appear as an outlier in relation to this collection. Thus, we analyze the proposed task in the context of the one-class classification using a one-class SVM approach compared with a clustering base procedure. The provided numerical experiments demonstrate very high ability of the proposed methodology to recognize artificially generated papers.
CITATION STYLE
Avros, R., & Volkovich, Z. (2018). Detection of computer-generated papers using one-class SVM and cluster approaches. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10935 LNAI, pp. 42–55). Springer Verlag. https://doi.org/10.1007/978-3-319-96133-0_4
Mendeley helps you to discover research relevant for your work.