Towards geological knowledge discovery using vector-based semantic similarity

Abstract

Large organisations and corporations routinely produce a steady stream of reports of various kinds. Apart from archiving them and occasionally retrieving some, very little is done to exploit these massive resources for knowledge discovery. This under-utilised unstructured data, written in natural-language text, is often referred to as part of the “dark data”. Recent success in learning distributed representations of words in vector spaces, and in particular the similarity and analogy queries that the learned word vectors enable, is driving a paradigm shift from “document retrieval” to “knowledge retrieval”. In this paper, we investigate how representation learning of words affects entity query results on a large domain corpus of geological survey reports. Extensive similarity tests and analogy queries were performed. The results demonstrate the necessity of training domain-specific word embeddings: pre-trained embeddings are good at capturing morphological relations but are inadequate for domain-specific semantic relations. Carrying out entity extraction prior to training the word embeddings further improves the quality of analogy query results. The framework developed in this paper can also be readily applied to other domain-specific corpora.
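
As a rough illustration of the two query types the abstract mentions, the sketch below runs a similarity query and an analogy query over a handful of hand-made vectors. Everything in it is an assumption made for the example: the vocabulary, the three-dimensional vectors, and the helper names `cosine`, `most_similar`, and `analogy` are hypothetical and are not taken from the paper or from any trained embedding model. The analogy uses the common vector-offset formulation (b - a + c), which may differ from the exact procedure the authors used.

```python
import math

# Toy vectors invented purely for illustration; a real system would load
# embeddings trained on the domain corpus instead.
VECTORS = {
    "gold": [1.0, 0.0, 0.0],
    "iron": [0.0, 1.0, 0.0],
    "quartz_vein": [1.0, 0.0, 1.0],
    "banded_iron_formation": [0.0, 1.0, 1.0],
    "basalt": [0.2, 0.2, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, topn=3):
    """Similarity query: rank all other words by cosine similarity to `word`."""
    query = VECTORS[word]
    scored = [
        (other, cosine(query, vec))
        for other, vec in VECTORS.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


def analogy(a, b, c):
    """Analogy query "a is to b as c is to ?" via the vector offset b - a + c."""
    target = [vb - va + vc for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    # Exclude the three query words so the answer is a new term.
    candidates = {w: v for w, v in VECTORS.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))


if __name__ == "__main__":
    print(most_similar("gold", topn=2))
    print(analogy("gold", "quartz_vein", "iron"))
```

With real embeddings, the quality of both queries depends on the training corpus and on how multi-word entities are tokenised, which is the effect the abstract reports.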

Cite

Enkhsaikhan, M., Liu, W., Holden, E. J., & Duuring, P. (2018). Towards geological knowledge discovery using vector-based semantic similarity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11323 LNAI, pp. 224–237). Springer Verlag. https://doi.org/10.1007/978-3-030-05090-0_20
