This paper introduces an open-source Java-package called German Language Processing for Lucene (glp4lucene). Although it was originally developed to work with German texts, it is to a large degree language independent. It aims at facilitating four language processing steps for working with non-English texts and Apache Lucene/Solr: lemmatizing words, weighting terms based on their part-of-speech, adding synonyms and decompounding nouns, without the necessity of a thorough understanding of natural language processing.
CITATION STYLE
Entrup, B. (2015). (German) language processing for Lucene. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9103, pp. 390–394). Springer Verlag. https://doi.org/10.1007/978-3-319-19581-0_35
Mendeley helps you to discover research relevant for your work.