CIC-FBK approach to native language identification

Abstract

We present the CIC-FBK system, which took part in the Native Language Identification (NLI) Shared Task 2017. Our approach combines features commonly used in previous NLI research (word n-grams, lemma n-grams, part-of-speech n-grams, and function words) with recently introduced character n-grams from misspelled words and with features that are novel in this task: typed character n-grams and syntactic n-grams of words and of syntactic relation tags. We use the log-entropy weighting scheme and perform classification with the Support Vector Machines (SVM) algorithm. Our system achieved a macro-averaged F1-score of 0.8808 and shared first rank in the NLI Shared Task 2017 scoring.
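
To make the log-entropy weighting named in the abstract concrete, here is a minimal sketch. It assumes the commonly used formulation, in which a feature's local weight is log(1 + tf) and its global weight is 1 plus its normalized entropy across documents. The abstract does not spell out the formula, so the system's actual implementation may differ. The function names `char_ngrams` and `log_entropy_weights` are hypothetical, and plain character trigrams stand in for the richer feature set described above.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def log_entropy_weights(docs_features):
    """Weight features with an assumed, standard log-entropy scheme.

    docs_features: list of feature lists, one list per document.
    Returns one dict per document mapping feature -> weight.
    """
    n_docs = len(docs_features)
    counts = [Counter(feats) for feats in docs_features]

    # Total frequency of each feature over the whole collection.
    global_freq = Counter()
    for c in counts:
        global_freq.update(c)

    # Global weight: 1 + sum_j p_ij * log(p_ij) / log(n_docs),
    # where p_ij = tf_ij / gf_i. It is 1 for a feature confined to
    # one document and 0 for a feature spread evenly over all of them.
    global_weight = {}
    for feat, gf in global_freq.items():
        if n_docs < 2:
            global_weight[feat] = 1.0  # entropy is undefined for one doc
            continue
        entropy_sum = 0.0
        for c in counts:
            tf = c.get(feat, 0)
            if tf:
                p = tf / gf
                entropy_sum += p * math.log(p)
        global_weight[feat] = 1.0 + entropy_sum / math.log(n_docs)

    # Final weight: global weight times the local weight log(1 + tf).
    return [
        {feat: global_weight[feat] * math.log(1 + tf) for feat, tf in c.items()}
        for c in counts
    ]
```

Under this formulation, features that occur evenly in every document are pushed toward zero, while features concentrated in few documents keep most of their log-scaled frequency. The resulting per-document weight dictionaries could then be vectorized and passed to an SVM classifier.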

Markov, I., Chen, L., Strapparava, C., & Sidorov, G. (2017). Cic-fbk approach to native language identification. In EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop (pp. 374–381). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-5042
