Identification of authorship of Ukrainian-language texts of journalistic style using neural networks

Abstract

This paper addresses the development of an effective method for identifying the authorship of texts, using publications by well-known Ukrainian journalists as material. Most existing methods require text preprocessing, which adds cost to solving the problem; when the set of candidate authors can be kept small, such preprocessing is often excessive. A further drawback of existing approaches is that the vast majority were applied to texts in other languages and did not take the peculiarities of Ukrainian into account. The authors therefore set out to develop an approach that identifies the author of a Ukrainian-language text without preprocessing and with high accuracy, and to establish which types of artificial neural networks give the minimum error for Ukrainian publicists. The developed method uses a feedforward multilayer perceptron trained with supervised learning, HashingVectorizer for vectorization, and the Adam optimizer. Even with a small number of training iterations (4–5), the network achieves fairly high accuracy in identifying the authorship of journalistic texts and a fairly small error. Over 1,000 text fragments by three Ukrainian authors were used. In the experiments, for texts of at least 500 characters the accuracy reaches 91%, and the maximum number of training iterations does not exceed 15. These results are attributed primarily to the choice of vectorization method at the preparatory stage and to the structure of the artificial neural network.
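
The components named in the abstract (HashingVectorizer, a feedforward multilayer perceptron, the Adam optimizer, raw text without preprocessing) can be combined roughly as in the sketch below. This is an illustration built on scikit-learn, not the authors' implementation. The toy sentences, the author labels, the character n-gram settings, the hidden layer size, the number of hash features, and the helper names `split_into_fragments` and `build_authorship_model` are all assumptions. Only the 500-character fragment length and the cap of 15 training iterations come from the abstract, and treating scikit-learn's `max_iter` (epochs for the Adam solver) as the paper's "iterations" is also an assumption.

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline


def split_into_fragments(text, size=500):
    """Cut raw text into consecutive fragments of exactly `size` characters.

    A trailing piece shorter than `size` is dropped, so every fragment
    meets the 500-character minimum mentioned in the abstract.
    """
    return [text[i:i + size] for i in range(0, len(text) - size + 1, size)]


def build_authorship_model(n_features=2 ** 12, hidden=(64,), max_iter=15, seed=0):
    """Hashing vectorizer feeding a small MLP trained with Adam.

    All hyperparameters here are illustrative assumptions, not values
    reported in the paper.
    """
    vectorizer = HashingVectorizer(
        analyzer="char_wb",      # assumption: character n-grams, no stemming or stop words
        ngram_range=(2, 4),
        n_features=n_features,   # kept small so the dense weight matrix stays cheap
    )
    classifier = MLPClassifier(
        hidden_layer_sizes=hidden,
        solver="adam",
        max_iter=max_iter,       # epochs for the Adam solver; may warn about non-convergence
        random_state=seed,
    )
    return make_pipeline(vectorizer, classifier)


# Toy placeholder corpus (hypothetical sentences, NOT the paper's data set).
texts = [
    "Сьогодні парламент ухвалив новий закон про освіту.",
    "Уряд обіцяє провести реформу до кінця року.",
    "Економіка країни зростає повільніше, ніж очікувалося.",
    "Ціни на пальне знову пішли вгору.",
    "Футбольна команда виграла важливий матч.",
    "Тренер задоволений грою молодих футболістів.",
]
labels = ["author_a", "author_a", "author_b", "author_b", "author_c", "author_c"]

model = build_authorship_model()
model.fit(texts, labels)            # raw strings go in; no separate preprocessing step
predictions = model.predict(texts)  # one predicted author label per input text
```

In a realistic setting, each author's articles would first be passed through `split_into_fragments`, and accuracy would be measured on held-out fragments rather than on the training texts as in this toy run.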

Cite

Lupei, M., Mitsa, A., Repariuk, V., & Sharkan, V. (2020). Identification of authorship of Ukrainian-language texts of journalistic style using neural networks. Eastern-European Journal of Enterprise Technologies, 1(2–103), 30–36. https://doi.org/10.15587/1729-4061.2020.195041