A sequence labeling approach to deriving word variants

Abstract

This paper describes a learning-based approach to automatically deriving word variant forms through suffixation. We cast the task as sequence labeling: the learner decides, for a given word, when to preserve, delete, substitute, or add a letter in order to form a new word. The learner's features are based on the characters, phonetics, and hyphenation positions of the given word. To make the system robust to the different variants that can arise from a single root word, we generate multiple variant hypotheses for each word from the sequence labeler's predictions. We then filter out ill-formed predictions and build clusters of word variants: a word and its predicted variants are merged with another word and its predicted variants whenever the two groups share a word. Our results show that this learning-based approach is feasible for the task and warrants further exploration.
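The two mechanical steps in the abstract, applying per-letter edit labels and merging groups that share a word, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the paper's actual implementation: the label encoding (`P`, `D`, `S:<text>`, `A:<text>`), the function names `apply_labels` and `merge_clusters`, and the use of union-find for merging are all hypothetical. The learner, its features, and the filtering of ill-formed predictions are not modeled.

```python
from typing import Dict, Iterable, List, Set


def apply_labels(word: str, labels: List[str]) -> str:
    """Build a variant from `word` using one edit label per letter.

    Hypothetical label encoding (not taken from the paper):
      "P"        preserve the letter
      "D"        delete the letter
      "S:<text>" substitute the letter with <text>
      "A:<text>" preserve the letter and add <text> after it
    """
    if len(word) != len(labels):
        raise ValueError("need exactly one label per letter")
    out = []
    for letter, label in zip(word, labels):
        op, _, arg = label.partition(":")
        if op == "P":
            out.append(letter)
        elif op == "D":
            continue
        elif op == "S":
            out.append(arg)
        elif op == "A":
            out.append(letter + arg)
        else:
            raise ValueError(f"unknown label: {label!r}")
    return "".join(out)


def merge_clusters(groups: Iterable[Set[str]]) -> List[List[str]]:
    """Merge groups of words that share at least one word (union-find)."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for group in groups:
        members = list(group)
        for w in members:
            find(w)  # register every word, including singletons
        for w in members[1:]:
            root_a, root_b = find(members[0]), find(w)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters: Dict[str, Set[str]] = {}
    for w in list(parent):
        clusters.setdefault(find(w), set()).add(w)
    # Sort for a deterministic result.
    return sorted(sorted(c) for c in clusters.values())


# Each group is a word together with its (already filtered) predicted variants.
variant = apply_labels("happy", ["P", "P", "P", "P", "S:iness"])
merged = merge_clusters([
    {"create", "creation"},
    {"creation", "creative"},
    {"act", "action"},
])
```

Here `variant` is `"happiness"`, and `merged` places `create`, `creation`, and `creative` in a single cluster because the first two groups share `creation`, while `act` and `action` form a separate cluster.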

Cite

D’Souza, J. (2015). A sequence labeling approach to deriving word variants. In Proceedings of the National Conference on Artificial Intelligence (Vol. 6, pp. 4152–4153). AI Access Foundation. https://doi.org/10.1609/aaai.v29i1.9745
