Arbitrariness of the sign-the notion that the forms of words are unrelated to their meanings-is an underlying assumption of many linguistic theories. Two lines of research have recently challenged this assumption, but they produce differing characterizations of non-arbitrariness in language. Behavioral and corpus studies have confirmed the validity of localized form-meaning patterns manifested in limited subsets of the lexicon. Meanwhile, global (lexicon-wide) statistical analyses instead find diffuse form-meaning system-aticity across the lexicon as a whole. We bridge the gap with an approach that can detect both local and global form-meaning systematicity in language. In the kernel regression formulation we introduce, form-meaning relationships can be used to predict words' distributional semantic vectors from their forms. Furthermore, we introduce a novel metric learning algorithm that can learn weighted edit distances that minimize kernel regression error. Our results suggest that the English lexicon exhibits far more global form-meaning systematicity than previously discovered, and that much of this systematicity is focused in localized form-meaning patterns.
CITATION STYLE
Gutiérrez, E. D., Levy, R., & Bergen, B. K. (2016). Finding non-arbitrary form-meaning systematicity using string-metric learning for kernel regression. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers (Vol. 4, pp. 2379–2388). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p16-1225
Mendeley helps you to discover research relevant for your work.