Multi-word terms pose many challenges in Natural Language Processing (NLP) because of their structural ambiguity. Although the structural disambiguation of multi-word expressions, also known as bracketing, has been widely studied, no definitive solution has yet been found. Linguists, terminologists, and translators must also deal with bracketing problems, but they generally have to resolve them without advanced NLP systems. This paper describes a series of manual steps for the bracketing of multi-word terms (MWTs) based on their linguistic properties and recent advances in NLP. Based on an analysis of 100 three- and four-term combinations, a set of criteria for MWT bracketing was devised and arranged in a step-by-step protocol according to frequency and reliability. A case study illustrating the procedure is also presented.
CITATION
Cabezas-García, M., & León-Araúz, P. (2019). On the structural disambiguation of multi-word terms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11755 LNAI, pp. 46–60). Springer. https://doi.org/10.1007/978-3-030-30135-4_4