Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking

Abstract

Online escort advertisement websites are widely used for advertising victims of human trafficking. Domain experts agree that advertising multiple people in the same ad is a strong indicator of trafficking. Thus, extracting person names from the text of these ads can provide valuable clues for further analysis. However, Named-Entity Recognition (NER) on escort ads is challenging because the text can be noisy and colloquial, and it often lacks proper grammar and punctuation. Most existing state-of-the-art NER models fail to demonstrate satisfactory performance on this task. In this paper, we propose NEAT (Name Extraction Against Trafficking) for extracting person names. It effectively combines classic rule-based and dictionary extractors with a contextualized language model to capture ambiguous names (e.g., penny, hazel) and adapts to adversarial changes in the text by expanding its dictionary. NEAT shows a 19% improvement on average in the F1 score for name extraction compared to the previous state of the art on two domain-specific datasets.
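
To make the idea of combining a dictionary extractor with a rule-based one more concrete, here is a minimal Python sketch. It is not NEAT's implementation: the dictionary, the list of ambiguous names, the introduction pattern, and the function name `extract_names` are all hypothetical, and a simple "introduction phrase" rule stands in for the contextualized language model that the abstract describes for disambiguating names such as penny or hazel.

```python
import re

# Hypothetical toy resources for illustration only (not the paper's data).
NAME_DICTIONARY = {"amy", "bella", "penny", "hazel"}
# Dictionary names that are also ordinary words; they need a contextual cue.
AMBIGUOUS_NAMES = {"penny", "hazel"}

# Assumed rule: an introduction phrase followed by a single word.
INTRO_PATTERN = re.compile(
    r"\b(?:my name is|i am|i'm|call me|ask for)\s+([a-z]+)",
    re.IGNORECASE,
)


def extract_names(text):
    """Return the sorted, lowercased names found in `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    # Words that directly follow an introduction phrase.
    introduced = {m.group(1).lower() for m in INTRO_PATTERN.finditer(text)}

    names = set()
    for token in tokens:
        if token not in NAME_DICTIONARY:
            continue
        if token in AMBIGUOUS_NAMES and token not in introduced:
            # Ambiguous word without a supporting cue: skip it.
            continue
        names.add(token)
    return sorted(names)


if __name__ == "__main__":
    print(extract_names("hi i'm Penny and my friend Amy is here too"))
    print(extract_names("found a penny on hazel street"))
```

In this sketch an unambiguous dictionary name is accepted anywhere in the text, while an ambiguous one is accepted only when the rule supplies context. A real system would replace the hand-written cue with a learned, context-aware model and would also grow the dictionary over time, as the abstract describes.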

Li, Y., Nair, P., Pelrine, K., & Rabbany, R. (2022). Extracting Person Names from User Generated Text: Named-Entity Recognition for Combating Human Trafficking. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 2854–2868). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-acl.225
