Introducing a new scalable data-as-a-service cloud platform for enriching traditional text mining techniques by integrating ontology modelling and natural language processing

Alexey Cheptsov; Axel Tenschert; Paul Schmidt; Birte Glimm; Mauricio Matthesius; Thorsten Liebig

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8182 62-74

DOI: 10.1007/978-3-642-54370-8_6

4 Citations

27 Readers

Abstract

A good deal of the digital data produced in academia, commerce and industry consists of raw, unstructured text, such as Word documents, Excel tables, emails and web pages, much of it written in natural language. An important analytical task in many scientific and technological domains is to retrieve information from such text data in order to gain deeper insight into its content and to extract useful knowledge and facts, often not explicitly stated, about a particular domain of interest. The major challenge is the size, structural complexity and update frequency of the analysed text collections (i.e., the ‘big data’ aspect), which make traditional analysis techniques and tools impossible to use. We introduce an innovative approach to analysing unstructured text data that improves on traditional data mining techniques by adopting algorithms from ontological domain modelling, natural language processing and machine learning. The approach is designed with parallelism in mind, which allows for high performance on large-scale Cloud computing infrastructures.

Cite

Cheptsov, A., Tenschert, A., Schmidt, P., Glimm, B., Matthesius, M., & Liebig, T. (2014). Introducing a new scalable data-as-a-service cloud platform for enriching traditional text mining techniques by integrating ontology modelling and natural language processing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8182, pp. 62–74). Springer Verlag. https://doi.org/10.1007/978-3-642-54370-8_6
