PyArabic: A Python package for Arabic text

  • Zerrouki T

Abstract

Text is the most common form of information representation, so text processing and manipulation rely on recurring routines and functions, and massive amounts of text are processed every day. With the advent of artificial intelligence and recent advances in machine learning and deep learning, natural language processing has become a critical domain. PyArabic is a collection of modules that provide basic functionality for manipulating Arabic texts, phrases, words, numbers, and letters. It primarily provides preprocessing tools such as normalization, tokenization, diacritics removal, number conversion, and transliteration. For years, researchers and developers working on machine learning algorithms for natural language processing have used the library for Arabic text preprocessing and cleaning, and it is becoming increasingly important for machine learning work.
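To make the preprocessing steps named in the abstract concrete, here is a minimal sketch of diacritics removal, elongation (tatweel) removal, alef normalization, and tokenization. It uses only the Python standard library and does not call PyArabic; the function names (`remove_diacritics`, `remove_tatweel`, `normalize_alef`, `preprocess`) are hypothetical and chosen for illustration, and the character ranges are a simplifying assumption rather than the library's actual rules.

```python
import re

# Assumption: "diacritics" here means the marks from fathatan (U+064B)
# through sukun (U+0652). A real library may cover more marks.
DIACRITICS = re.compile("[\u064B-\u0652]")

# Tatweel (kashida), the elongation character U+0640.
TATWEEL = "\u0640"

# Alef with madda above, hamza above, and hamza below,
# all mapped to bare alef (U+0627) in this sketch.
ALEF_VARIANTS = re.compile("[\u0622\u0623\u0625]")
BARE_ALEF = "\u0627"


def remove_diacritics(text: str) -> str:
    """Drop Arabic short-vowel and related marks (hypothetical helper)."""
    return DIACRITICS.sub("", text)


def remove_tatweel(text: str) -> str:
    """Drop the elongation character (hypothetical helper)."""
    return text.replace(TATWEEL, "")


def normalize_alef(text: str) -> str:
    """Map hamza/madda alef forms to bare alef (hypothetical helper)."""
    return ALEF_VARIANTS.sub(BARE_ALEF, text)


def preprocess(text: str) -> list[str]:
    """Clean the text, then split it into word tokens."""
    cleaned = normalize_alef(remove_tatweel(remove_diacritics(text)))
    # \w+ matches runs of letters and digits, including Arabic letters.
    return re.findall(r"\w+", cleaned)


if __name__ == "__main__":
    # A two-word greeting written with diacritics, given as escapes
    # so the example does not depend on editor bidi rendering.
    sample = "\u0623\u064E\u0647\u0652\u0644\u0627\u064B \u0628\u0650\u0643\u064F\u0645"
    print(preprocess(sample))
```

Applying the cleaning steps before tokenization keeps the tokenizer simple, because combining marks and tatweel would otherwise split or distort words. For real projects, the library's own functions should be preferred over this sketch.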

Cite

Zerrouki, T. (2023). PyArabic: A Python package for Arabic text. Journal of Open Source Software, 8(84), 4886. https://doi.org/10.21105/joss.04886
