The widespread use of deception in written content has motivated the need for methods to automatically profile and identify deceivers. In particular, the identification of deception based on demographic data such as gender, age, and religion has become important because of ethical and security concerns. Previous work on deception detection has studied the role of gender using statistical approaches and domain-specific data. This work explores gender detection in open-domain truths and lies using a machine learning approach. First, we collect a deception dataset consisting of truths and lies from male and female participants. Second, we extract a large feature set consisting of n-grams, shallow and deep syntactic features, semantic features derived from a psycholinguistic lexicon, and features derived from readability metrics. Third, we build deception classifiers able to predict participants’ gender with classification accuracies ranging from 60% to 70%. In addition, we present an analysis of differences in the linguistic style used by deceivers given their reported gender.
Pérez-Rosas, V., & Mihalcea, R. (2014). Gender differences in deceivers writing style. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8856, 163–174. https://doi.org/10.1007/978-3-319-13647-9_17