CIC-FBK approach to native language identification

Ilia Markov; Lingzhen Chen; Carlo Strapparava; Grigori Sidorov

Conference Proceedings, Open Access

EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop (2017) 374-381

DOI: 10.18653/v1/w17-5042

Abstract

We present the CIC-FBK system, which took part in the Native Language Identification (NLI) Shared Task 2017. Our approach combines features commonly used in previous NLI research (word n-grams, lemma n-grams, part-of-speech n-grams, and function words) with recently introduced character n-grams from misspelled words and with features that are novel to this task: typed character n-grams and syntactic n-grams of words and of syntactic relation tags. We use the log-entropy weighting scheme and perform classification with a Support Vector Machine (SVM). Our system achieved a macro-averaged F1-score of 0.8808 and shared first rank in the NLI Shared Task 2017 scoring.
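The abstract names log-entropy weighting without giving its formula. The sketch below uses the commonly cited definition (local weight log(1 + tf), global weight 1 + Σ p·log p / log N, with p = tf / gf) applied to plain character n-grams. It is an illustration under that assumption, not the authors' implementation: the function names `char_ngrams` and `log_entropy_weights` and the toy documents are hypothetical, and the feature extraction is far simpler than the typed and syntactic n-grams the system uses.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a text (hypothetical helper)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def log_entropy_weights(doc_counts):
    """Apply log-entropy weighting to a list of per-document term counts.

    local weight:  log(1 + tf_ij)
    global weight: g_i = 1 + sum_j p_ij * log(p_ij) / log(N),  p_ij = tf_ij / gf_i
    where gf_i is the total count of term i and N is the number of documents.
    """
    n_docs = len(doc_counts)

    # Total frequency of each term across the whole collection.
    global_freq = Counter()
    for counts in doc_counts:
        global_freq.update(counts)

    # Accumulate sum_j p_ij * log(p_ij) for every term.
    entropy_sum = defaultdict(float)
    for counts in doc_counts:
        for term, tf in counts.items():
            p = tf / global_freq[term]
            entropy_sum[term] += p * math.log(p)

    # With a single document log(N) is 0, so fall back to 1 to avoid dividing by zero.
    log_n = math.log(n_docs) if n_docs > 1 else 1.0
    global_weight = {t: 1.0 + entropy_sum[t] / log_n for t in global_freq}

    return [
        {t: global_weight[t] * math.log(1 + tf) for t, tf in counts.items()}
        for counts in doc_counts
    ]


# Toy usage: two short "documents" represented by character bigrams.
docs = ["abc", "abd"]
weighted = log_entropy_weights([char_ngrams(d, n=2) for d in docs])
```

Under this definition, an n-gram spread evenly over all documents gets a global weight of 0, while one confined to a single document keeps its full local weight. In a full pipeline the weighted vectors would then be passed to an SVM classifier; that step is omitted here.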

Citation (APA)

Markov, I., Chen, L., Strapparava, C., & Sidorov, G. (2017). CIC-FBK approach to native language identification. In EMNLP 2017 - 12th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2017 - Proceedings of the Workshop (pp. 374–381). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-5042