Abstract
Weighted finite-state automata (WFSAs) are commonly used in NLP. Failure transitions are a useful extension for compactly representing backoffs or interpolation in n-gram models and CRFs, which are special cases of WFSAs. The pathsum in ordinary acyclic WFSAs is efficiently computed by the backward algorithm in time O(|E|), where E is the set of transitions. However, this algorithm does not allow failure transitions, and preprocessing the WFSA to eliminate failure transitions could greatly increase |E|. We extend the backward algorithm to handle failure transitions directly. Our approach is efficient when the average state has outgoing arcs for only a small fraction s ≪ 1 of the alphabet Σ. We propose an algorithm for general acyclic WFSAs which runs in O(|E| + s|Σ||Q||Tmax| log |Σ|), where Q is the set of states and |Tmax| is the size of the largest connected component of failure transitions. When the failure transition topology satisfies a condition exemplified by CRFs, the |Tmax| factor can be dropped, and when the weight semiring is a ring, the log |Σ| factor can be dropped. In the latter case (ring-weighted acyclic WFSAs), we also give an alternative algorithm with complexity O(|E| + |Σ||Q| min(1, s|πmax|)), where |πmax| is the size of the longest failure path.
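For orientation, the O(|E|) baseline mentioned above can be sketched as follows. This is a minimal illustration of the standard backward algorithm for an ordinary acyclic WFSA without failure transitions, not the extended algorithm proposed in the paper. The function name `backward_pathsum`, the arc tuple layout, the use of the real (sum-product) semiring, and the assumption that states are numbered in topological order are all choices made for this example.

```python
from collections import defaultdict


def backward_pathsum(num_states, arcs, initial, final):
    """Pathsum of an acyclic WFSA over the real semiring (illustrative sketch).

    Assumptions made for this example only:
      - states are integers 0..num_states-1, numbered in topological order,
        so every arc (src, symbol, dst, weight) satisfies src < dst;
      - `initial` and `final` map states to initial / final weights;
      - there are no failure transitions.
    """
    # Group outgoing arcs by source state; symbols do not affect the pathsum.
    outgoing = defaultdict(list)
    for src, _symbol, dst, weight in arcs:
        outgoing[src].append((dst, weight))

    # beta[q] = total weight of all paths from q to a final state.
    beta = [0.0] * num_states
    for q in reversed(range(num_states)):
        total = final.get(q, 0.0)
        for dst, weight in outgoing[q]:
            total += weight * beta[dst]  # each arc is visited exactly once
        beta[q] = total

    # Combine with initial weights to obtain the pathsum.
    return sum(weight * beta[q] for q, weight in initial.items())


# Tiny example: two paths from state 0 to state 2.
example_arcs = [(0, "a", 1, 0.5), (0, "b", 2, 0.5), (1, "a", 2, 2.0)]
example_pathsum = backward_pathsum(3, example_arcs, {0: 1.0}, {2: 1.0})
```

Each arc is touched once, which gives the O(|E|) running time. Expanding failure transitions into ordinary arcs before running this routine is the preprocessing step that the abstract notes could greatly increase |E|.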
Cite
Svete, A., Dayan, B., Vieira, T., Cotterell, R., & Eisner, J. (2022). Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 8289–8305). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.567