Privacy-Preserving Text Labelling Through Crowdsourcing

Giannis Haralabopoulos; Mercedes Torres Torres; Ioannis Anagnostopoulos; Derek McAuley

Conference Proceedings

IFIP Advances in Information and Communication Technology (2021) 628 431-445

DOI: 10.1007/978-3-030-79157-5_35

Abstract

The extensive use of online social media has highlighted the importance of privacy in the digital space. As more scientists analyse the data created on these platforms, privacy concerns have extended to data usage within academia. Although text analysis is a well-documented topic in academic literature with a multitude of applications, ensuring the privacy of user-generated content has been overlooked. In an effort to reduce the exposure of online users’ information, we propose a privacy-preserving text labelling method for varying applications, based on crowdsourcing. We transform text with different levels of privacy and analyse the effectiveness of the transformation with regard to label correlation. To demonstrate the adaptive nature of our approach, we also employ a TF/IDF filtering transformation. Our results suggest that total privacy can be implemented in labelling while retaining the annotational diversity and subjectivity of traditional labelling. The privacy-preserving labelling, with the use of the NRC lexicon, demonstrates an average 0.11 mean Spearman’s rho correlation, boosted to 0.124 with TF/IDF filtering.

Cite

Haralabopoulos, G., Torres, M. T., Anagnostopoulos, I., & McAuley, D. (2021). Privacy-Preserving Text Labelling Through Crowdsourcing. In IFIP Advances in Information and Communication Technology (Vol. 628, pp. 431–445). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-79157-5_35
