This paper presents a study in a field that has been little explored for the Portuguese language: modality and its automatic tagging. Our main goal was to find a set of attributes for building automatic taggers that outperform the bag-of-words (bow) approach. Performance was measured using precision, recall and F1. Because the field is relatively unexplored, the study covers the creation of the corpus (composed of eleven verbs), the use of a parser to extract syntactic and semantic information from the sentences, and a machine learning approach to identify modality values. Based on three different sets of attributes (from the trigger itself, from the trigger's path in the parse tree, and from the context), the system creates a tagger for each verb, achieving an improvement in F1 for almost every verb when compared to the traditional bow approach.
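To make the evaluation measures concrete, the following is a minimal sketch of how precision, recall and F1 can be computed for a single modal value from gold and predicted tags. It is not the authors' evaluation code: the function name `precision_recall_f1` and the example labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    gold: Sequence[str], predicted: Sequence[str], label: str
) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 for one label.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    # True positives: tagged with the label and correct.
    tp = sum(1 for g, p in pairs if g == label and p == label)
    # False positives: tagged with the label but the gold tag differs.
    fp = sum(1 for g, p in pairs if g != label and p == label)
    # False negatives: gold tag is the label but the tagger missed it.
    fn = sum(1 for g, p in pairs if g == label and p != label)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative labels only (assumed, not the paper's data).
gold_tags = ["epistemic", "deontic", "epistemic", "deontic"]
predicted_tags = ["epistemic", "epistemic", "epistemic", "deontic"]
p, r, f1 = precision_recall_f1(gold_tags, predicted_tags, "epistemic")
```

In this toy case the tagger finds both gold "epistemic" occurrences but also mislabels one other token, so recall is perfect while precision is lower, and F1 falls between the two.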
CITATION STYLE
Sequeira, J., Gonçalves, T., Quaresma, P., Mendes, A., & Hendrickx, I. (2018). Using syntactic and semantic features for classifying modal values in the Portuguese language. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9624 LNCS, pp. 362–373). Springer Verlag. https://doi.org/10.1007/978-3-319-75487-1_28