Mind your language: Abuse and offense detection for code-switched languages

Abstract

In multilingual societies such as those of the Indian subcontinent, code-switched languages are popular and convenient for users. In this paper, we study offense and abuse detection in the code-switched pair of Hindi and English (i.e., Hinglish), the most widely spoken such pair. The task is difficult because Hinglish has no fixed grammar, vocabulary, semantics, or spelling. We apply transfer learning and build an LSTM-based model for hate speech classification. This model outperforms the current best models, establishing itself as the state of the art in the unexplored domain of Hinglish offensive text classification. We also release our model and the trained embeddings for research purposes.

Cite

Kapoor, R., Shah, R. R., Kumar, Y., Kumaraguru, P., Rajput, K., & Zimmermann, R. (2019). Mind your language: Abuse and offense detection for code-switched languages. In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (pp. 9951–9952). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33019951
