Abstract
People describe the world, tell stories, share ideas, and connect with one another in more than 6,900 languages. The information available on the Internet may appear unlimited. Throughout history, electrical and computer experts have built tools such as the telephone, the telegraph, and the Internet router that help people communicate. Software that translates between languages is one such tool. The first step in translating a text is identifying its language. In this research, a program that automatically identifies the language of a text was designed and tested for English, French, and German, based on the letters of the text (the frequency, self-information, and entropy of certain chosen letters). When applied to randomly selected text files, the program successfully detected these languages. The detection program was written in C++.
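The three letter statistics named in the abstract can be illustrated with a minimal C++ sketch. It is not the authors' program: the abstract does not say which letters were chosen or how the statistics are turned into a language decision, so the code below only computes the quantities themselves. The function names (`letterFrequencies`, `selfInformation`, `entropy`) are hypothetical, and restricting the count to the 26 unaccented letters a-z is an assumption made for brevity.

```cpp
#include <array>
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <string>

// Relative frequency of each letter a-z in `text`, ignoring case.
// Assumption: only unaccented ASCII letters are counted; everything else
// (digits, punctuation, accented characters) is skipped.
// Returns all zeros if the text contains no letters.
std::array<double, 26> letterFrequencies(const std::string& text) {
    std::array<double, 26> freq{};
    std::size_t total = 0;
    for (unsigned char c : text) {
        if (c < 128 && std::isalpha(c)) {
            ++freq[std::tolower(c) - 'a'];
            ++total;
        }
    }
    if (total > 0) {
        for (double& f : freq) f /= static_cast<double>(total);
    }
    return freq;
}

// Self-information of an outcome with probability p, in bits: -log2(p).
double selfInformation(double p) {
    assert(p > 0.0 && p <= 1.0);
    return -std::log2(p);
}

// Shannon entropy of the letter distribution, in bits per letter:
// the sum of p * selfInformation(p) over letters that actually occur.
double entropy(const std::array<double, 26>& freq) {
    double h = 0.0;
    for (double p : freq) {
        if (p > 0.0) h += p * selfInformation(p);
    }
    return h;
}
```

A caller would pass the contents of a text file to `letterFrequencies` and then compare the resulting frequencies, or the self-information and entropy derived from them, against reference values for each candidate language. How that comparison is made in the published program is not described in the abstract.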
Cite
Abbas, R. H., & Kareem, F. A. E. A. (2019). Text Language Identification Using Letters (Frequency, Self-information, and Entropy) Analysis for English, French, and German Languages. Journal of Southwest Jiaotong University, 54(4). https://doi.org/10.35741/issn.0258-2724.54.4.21