Language Detection System for Documents Using N-Grams (Sistem Deteksi Bahasa pada Dokumen menggunakan N-Gram)

  • Zaman B
  • Hariyanti E
  • Purwanti E

Abstract

Language detection over a very large document collection can improve the performance of an information retrieval system. One popular method for language detection is N-grams, which are sequences of n characters taken from a string. This research developed an N-gram-based language detection system that classifies documents as Indonesian or English. The work proceeded in three phases: creating a profile for each language, testing the system, and evaluating it. Fifty documents were used to create the language profiles (25 Indonesian and 25 English), and sixty documents were used for system testing. System performance was evaluated using the F-measure. In testing, the F-measures obtained for unigrams, bigrams, and trigrams were 0.933, 0.917, and 0.933, respectively.
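The abstract does not say how the language profiles are compared, so the following is only a minimal sketch of one common approach: ranked character n-gram profiles scored with an out-of-place distance. The function names, the `top_k` cutoff of 300, and the toy training sentences are illustrative assumptions, not details taken from the paper. The `f_measure` helper uses the standard harmonic mean of precision and recall.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams of a lowercased string."""
    text = " ".join(text.lower().split())  # collapse runs of whitespace
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_profile(documents, n, top_k=300):
    """Map the top_k most frequent n-grams to their rank (0 = most frequent).

    top_k=300 is an assumed cutoff, not a value from the paper.
    """
    counts = Counter()
    for doc in documents:
        counts.update(char_ngrams(doc, n))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return {gram: rank for rank, (gram, _) in enumerate(ranked)}


def out_of_place(doc_profile, lang_profile, penalty):
    """Sum of rank differences; n-grams absent from lang_profile cost `penalty`."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else penalty
        for gram, rank in doc_profile.items()
    )


def detect(text, profiles, n, top_k=300):
    """Return the language whose profile is closest to the text's profile."""
    doc_profile = build_profile([text], n, top_k)
    return min(
        profiles,
        key=lambda lang: out_of_place(doc_profile, profiles[lang], top_k),
    )


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when undefined."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy training data (hypothetical; the paper used 25 documents per language).
training = {
    "id": ["saya sedang membaca buku di perpustakaan"],
    "en": ["i am reading a book in the library"],
}
profiles = {lang: build_profile(docs, n=2) for lang, docs in training.items()}
```

Calling `detect(some_text, profiles, n=2)` returns the key of the closest profile. Changing `n` to 1 or 3 gives the unigram and trigram variants, provided the profiles are rebuilt with the same `n`.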

Cite

Zaman, B., Hariyanti, E., & Purwanti, E. (2015). Sistem Deteksi Bahasa pada Dokumen menggunakan N-Gram. MULTINETICS, 1(2), 21. https://doi.org/10.32722/vol1.no2.2015.pp21-26
