This chapter presents a model for knowledge extraction from documents written in natural language. The model relies on a clear distinction between a conceptual level, which models the domain knowledge, and a lexical level, which represents the domain vocabulary. An advanced stochastic model (which mixes, in a novel way, two well-known approaches) stores the mapping between these levels, taking into account the linguistic context of words. This stochastic model is then used to disambiguate the words of documents during the indexing phase. The engine supports simple keyword-based queries as well as natural language queries. The system can also extend the domain knowledge by means of a production-rule engine. Validation tests indicate that the system extracts concepts with good accuracy, even when the training set is small. © 2012 Springer-Verlag Berlin Heidelberg.
CITATION STYLE
Sbattella, L., & Tedesco, R. (2012). Knowledge extraction from natural language processing. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 7200 LNCS, 193–219. https://doi.org/10.1007/978-3-642-31739-2_10