Abstract
In this paper, we propose a new method for automatically recognizing domain-specific terms in a monolingual corpus. Most domain-specific terms are compound nouns, and these are what we aim to extract. Our method is based on single-noun statistics calculated from single-noun bigrams. Specifically, we focus on how many nouns adjoin each single noun and on the frequency of each compound noun and single noun; we call this the FLR method. We evaluate these methods experimentally on the NTCIR1 TMREC test collection. The results show that the FLR method performs best when fewer than 1,400 or more than 12,000 of the highest-ranked term candidates are taken into account.
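The abstract does not give the scoring formula, so the following is only a minimal sketch of one common reading of an FLR-style score, not the authors' implementation. It assumes that each candidate term is already segmented into a tuple of single nouns, that the left and right statistics of a noun are the total frequencies of the bigrams in which it appears as the second or first element, and that the score is the candidate's frequency multiplied by the geometric mean of the smoothed left and right statistics of its nouns. The function name `flr_scores` and the input format are hypothetical.

```python
from collections import Counter
from math import prod


def flr_scores(candidates):
    """Score candidate terms with an FLR-style measure (illustrative sketch).

    candidates: list of tuples of single nouns, one tuple per occurrence
                of a candidate term in the corpus (assumed input format).
    Returns a dict mapping each distinct candidate to its score.
    """
    term_freq = Counter(candidates)

    # Assumed statistics: for each single noun, how often another noun
    # appears directly to its left or right inside a candidate term.
    left = Counter()
    right = Counter()
    for term, freq in term_freq.items():
        for first, second in zip(term, term[1:]):
            right[first] += freq   # `second` adjoins `first` on the right
            left[second] += freq   # `first` adjoins `second` on the left

    scores = {}
    for term, freq in term_freq.items():
        n = len(term)
        # Geometric mean of the add-one smoothed left/right statistics.
        lr = prod((left[w] + 1) * (right[w] + 1) for w in term) ** (1 / (2 * n))
        # Multiply by the candidate's own frequency (the "F" in FLR).
        scores[term] = freq * lr
    return scores


if __name__ == "__main__":
    occurrences = [
        ("term", "extraction"),
        ("term", "extraction"),
        ("information", "extraction"),
        ("term",),
    ]
    ranked = sorted(flr_scores(occurrences).items(), key=lambda kv: -kv[1])
    for term, score in ranked:
        print(" ".join(term), round(score, 3))
```

Sorting candidates by this score and keeping the top N gives a ranked list of the kind the evaluation above refers to. The add-one smoothing is there so that a noun with no left or right neighbour does not force the whole product to zero.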
Cite
NAKAGAWA, H., YUMOTO, H., & MORI, T. (2003). Term Extraction Based on Occurrence and Concatenation Frequency. Journal of Natural Language Processing, 10(1), 27–45. https://doi.org/10.5715/jnlp.10.27