The semantic annotation of corpora has an important role to play in ensuring that sentences occurring in natural language texts are correctly understood based on their intended context. Two examples of lexical semantic units that contribute to this knowledge are word senses – which allow words with multiple meanings to be understood based on the context in which they are used – and named entities – which can be disambiguated and linked back to the specific encyclopedic resources that describe them. In this paper, we describe the construction of lexical semanticallyannotated corpora for Portuguese, annotated with both word senses linked to senses in a Portuguese wordnet and named entities linked to Portuguese Wikipedia entries using DBpedia. The result is a goldstandard lexical semantically-annotated resource that is useful in supporting the training and evaluation of tools for the disambiguation of these lexical units in Portuguese.
CITATION STYLE
Neale, S., Pereira, R. V., Silva, J., & Branco, A. (2016). Lexical semantics annotation for enriched Portuguese corpora. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9727, pp. 296–305). Springer Verlag. https://doi.org/10.1007/978-3-319-41552-9_30
Mendeley helps you to discover research relevant for your work.