Historical text poses numerous challenges for contemporary natural language processing techniques. In particular, its lack of consistent orthographic conventions creates difficulties for any system that relies on a fixed lexicon accessed by orthographic form, such as information retrieval systems, part-of-speech taggers, simple word stemmers, or more sophisticated morphological analyzers.
Jurish, B. (2010). More than Words: Using Token Context to Improve Canonicalization of Historical German. Journal for Language Technology and Computational Linguistics, 25(1), 23–39. https://doi.org/10.21248/jlcl.25.2010.127