String similarity is most often measured by weighted or unweighted edit distance d(x, y). Ristad and Yianilos (1998) defined stochastic edit distance, a probability distribution p(y | x) whose parameters can be trained from data. We generalize this so that the probability of choosing each edit operation can depend on contextual features. We show how to construct and train a probabilistic finite-state transducer that computes our stochastic contextual edit distance. To illustrate the improvement from conditioning on context, we model typos found in social media text. © 2014 Association for Computational Linguistics.
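To make the idea of a probability distribution p(y | x) over edits concrete, here is a minimal sketch of a context-independent stochastic edit model, computed with a forward dynamic program. It is an illustration only, not the model or the transducer construction of the paper: the function name `stochastic_edit_prob`, the parameters `p_ins`, `p_del` and `p_copy`, and the choice to spread insertion and substitution mass uniformly over the alphabet are all assumptions made here. In a contextual model, these fixed numbers would instead be functions of features of the surrounding input and output.

```python
def stochastic_edit_prob(x, y, alphabet, p_ins=0.1, p_del=0.1, p_copy=0.7):
    """Return p(y | x) under a toy, context-independent stochastic edit model.

    Hypothetical parameterization (an assumption of this sketch):
      - with input remaining, choose INSERT (p_ins), DELETE (p_del),
        COPY the next input symbol (p_copy), or SUBSTITUTE it by a
        different symbol (the leftover mass p_sub);
      - at the end of the input, choose INSERT (p_ins) or STOP (1 - p_ins).
    Inserted and substituted symbols are drawn uniformly from the alphabet.
    """
    k = len(alphabet)
    p_sub = 1.0 - p_ins - p_del - p_copy
    if k < 2 or p_sub < 0.0:
        raise ValueError("need at least 2 symbols and probabilities summing to <= 1")
    if any(c not in alphabet for c in y):
        return 0.0  # the model can only emit symbols from the alphabet

    ins_each = p_ins / k          # probability of inserting one specific symbol
    sub_each = p_sub / (k - 1)    # probability of one specific non-copy substitution
    p_stop = 1.0 - p_ins

    n, m = len(x), len(y)
    # alpha[i][j]: total probability of all edit sequences that have
    # consumed x[:i] and emitted y[:j].
    alpha = [[0.0] * (m + 1) for _ in range(n + 1)]
    alpha[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = 0.0
            if j > 0:  # last edit inserted y[j-1]
                total += alpha[i][j - 1] * ins_each
            if i > 0:  # last edit deleted x[i-1]
                total += alpha[i - 1][j] * p_del
            if i > 0 and j > 0:  # last edit copied or substituted x[i-1]
                same = x[i - 1] == y[j - 1]
                total += alpha[i - 1][j - 1] * (p_copy if same else sub_each)
            alpha[i][j] = total
    # Every complete edit sequence ends with an explicit STOP.
    return alpha[n][m] * p_stop
```

Because the choices available in each state sum to one, the values p(y | x) sum to one over all output strings y, which is what distinguishes this from an ordinary edit distance d(x, y). For example, `stochastic_edit_prob("ab", "ab", "ab")` should come out larger than `stochastic_edit_prob("ab", "ba", "ab")`, since the first is dominated by two copies and the second requires substitutions or an insert-delete pair.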
Cotterell, R., Peng, N., & Eisner, J. (2014). Stochastic contextual edit distance and probabilistic FSTs. In 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014 - Proceedings of the Conference (Vol. 2, pp. 625–630). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p14-2102