Assisting European Portuguese teaching: Linguistic features extraction and automatic readability classifier

Abstract

This paper describes two automatic systems for European Portuguese texts: a linguistic feature extractor and a text readability classifier. Its main goal is to assist in selecting adequate reading materials to support the teaching of Portuguese, especially as a second language. To extract features from texts, the system uses several Natural Language Processing (NLP) tools. Currently, 52 features are extracted, covering parts of speech (POS), syllables, words, chunks and phrases, and averages and frequencies, among others. A classifier was built using these features and a corpus previously annotated with readability levels, adopting the official five-level classification standard for Portuguese as a Second Language. In a five-level scenario (A1 to C1), the best-performing learning algorithm (LogitBoost) achieved an accuracy of 75.11% with a root mean square error (RMSE) of 0.269. In a three-level scenario (A, B and C), the best-performing learning algorithm (C4.5 grafted) achieved an accuracy of 81.44% with an RMSE of 0.346.
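
As a rough illustration of how the five-level and three-level scenarios relate, the sketch below groups five-level labels into three bands and computes accuracy on both label sets. It is not the authors' method: the abstract reports a separate best-performing classifier for each scenario, and the grouping by first letter (A1 and A2 into A, B1 and B2 into B, C1 into C), the helper names `collapse_level` and `accuracy`, and the toy labels are all assumptions made for this example.

```python
from typing import Sequence

# Assumed five-level label set (A1 to C1, as named in the abstract).
FIVE_LEVELS = ("A1", "A2", "B1", "B2", "C1")


def collapse_level(level: str) -> str:
    """Map a five-level label such as 'B2' to its band ('B').

    Grouping by first letter is an assumption for illustration.
    """
    if level not in FIVE_LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    return level[0]


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(predicted) or len(gold) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Hypothetical toy labels, not data from the paper.
gold = ["A1", "A2", "B1", "B2", "C1"]
pred = ["A2", "A2", "B2", "B2", "C1"]

five_level_acc = accuracy(gold, pred)
three_level_acc = accuracy(
    [collapse_level(g) for g in gold],
    [collapse_level(p) for p in pred],
)
```

On these toy labels, the two errors in the five-level setting both fall within the same band, so they disappear once the labels are grouped. This is one reason a coarser scenario can be expected to score higher on accuracy.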

Cite

Curto, P., Mamede, N., & Baptista, J. (2016). Assisting European Portuguese teaching: Linguistic features extraction and automatic readability classifier. In Communications in Computer and Information Science (Vol. 583, pp. 81–96). Springer Verlag. https://doi.org/10.1007/978-3-319-29585-5_5
