Rich style embedding for intrinsic plagiarism detection

Oumaima Hourrane; El Habib Benlahmer

Journal Article, Open Access

International Journal of Advanced Computer Science and Applications (2019) 10(11) 646-651

DOI: 10.14569/IJACSA.2019.0101185

3 Citations

11 Readers

Abstract

Stylometry plays an important role in intrinsic plagiarism detection, where the goal is to identify potential plagiarism by analyzing a document for undeclared changes in writing style. The purpose of this paper is to study the interaction between syntactic structures, attention mechanisms, and contextualized word embeddings, as well as their effectiveness for plagiarism detection. Accordingly, we propose a new style embedding that combines syntactic trees and the pre-trained Multi-Task Deep Neural Network (MT-DNN). Additionally, we use attention mechanisms to sum the embeddings, and experiment with both a Bidirectional Long Short-Term Memory (BiLSTM) network and a Convolutional Neural Network (CNN) with max-pooling for sentence encoding. Our model is evaluated on two sub-tasks, style change detection and style breach detection, and compared with two baseline detectors based on classic stylometric features.
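The abstract mentions using attention to sum embeddings into a single representation. As a rough illustration of that general idea only, the sketch below computes an attention-weighted sum of token vectors in plain Python. It is not the authors' model: the function names, the dot-product scoring, and the fixed `query` vector are assumptions made for this example, and a real system would learn the query and operate on MT-DNN outputs rather than toy lists.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings: List[Vector], query: Vector) -> Vector:
    """Attention-weighted sum of embeddings (hypothetical helper).

    Each embedding is scored by its dot product with `query`; the
    softmax of those scores gives the weights used in the sum.
    """
    scores = [sum(e_d * q_d for e_d, q_d in zip(emb, query)) for emb in embeddings]
    weights = softmax(scores)
    dim = len(query)
    return [
        sum(w * emb[d] for w, emb in zip(weights, embeddings))
        for d in range(dim)
    ]


# Toy usage: three 2-dimensional "token embeddings" pooled into one vector.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentence_vector = attention_pool(tokens, query=[0.5, 0.5])
print(sentence_vector)
```

With an all-zero query every token receives the same weight, so the result reduces to a plain average; a non-zero query shifts weight toward the tokens it aligns with.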

Cite (APA)

Hourrane, O., & Benlahmer, E. H. (2019). Rich style embedding for intrinsic plagiarism detection. International Journal of Advanced Computer Science and Applications, 10(11), 646–651. https://doi.org/10.14569/IJACSA.2019.0101185

Readers' Seniority

PhD / Post grad / Masters / Doc: 3 (75%)

Researcher: 1 (25%)

Readers' Discipline

Computer Science: 3 (50%)

Engineering: 3 (50%)
