Supervised semantic relation mining from linguistically noisy text documents

  • Giannone C
  • Basili R
  • Naggar P
 et al. 

Abstract

In this paper, we present models for mining relations between named entities in text that can deal with data highly affected by linguistic noise. Our models are made robust by: (a) the exploitation of state-of-the-art statistical algorithms such as support vector machines (SVMs), along with effective and versatile pattern mining methods, e.g. word sequence kernels; (b) the design of specific features capable of capturing long-distance relationships; and (c) the use of prior domain knowledge in the form of ontological constraints, e.g. bounds on the type of relation arguments given by the semantic categories of the involved entities. This last property keeps the training data required by SVMs small and consequently lowers the system design costs. We empirically tested our hybrid model in the very complex domain of business intelligence, where the textual data consist of reports on investigations into criminal enterprises based on police interrogation reports, electronic eavesdropping and wiretaps. The target relations are typically established between entities as they are mentioned in these information sources. The experiments on mining such relations show that our approach, even with small training data, is robust to non-conventional language such as dialects, jargon expressions or coded words typically contained in such text.
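As a rough illustration of two ingredients named in the abstract, the sketch below computes a gap-weighted word sequence kernel by explicit enumeration and filters candidate relations by the semantic types of their arguments. It is not the authors' implementation: the function names, the decay parameter, the entity types and the relation labels (`works_for`, `located_in`) are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def subsequence_features(tokens, n, decay):
    """Map each length-n word subsequence (gaps allowed) to its summed weight.

    Each occurrence is weighted by decay ** span, where span is the number of
    positions between its first and last word, inclusive. Assumes n >= 1.
    """
    features = defaultdict(float)
    for idx in combinations(range(len(tokens)), n):
        span = idx[-1] - idx[0] + 1  # window covered, gaps included
        features[tuple(tokens[i] for i in idx)] += decay ** span
    return features


def word_sequence_kernel(s, t, n=2, decay=0.5):
    """Inner product of the two gap-weighted subsequence feature maps."""
    fs = subsequence_features(s, n, decay)
    ft = subsequence_features(t, n, decay)
    return sum(w * ft[f] for f, w in fs.items() if f in ft)


# Hypothetical ontological constraints: admissible (type, type) argument
# pairs for each relation. These labels are illustrative only.
ALLOWED_ARGUMENTS = {
    "works_for": {("PERSON", "ORGANIZATION")},
    "located_in": {("PERSON", "PLACE"), ("ORGANIZATION", "PLACE")},
}


def admissible_relations(type_a, type_b):
    """Return the relations whose argument types admit this entity pair."""
    return sorted(
        rel for rel, pairs in ALLOWED_ARGUMENTS.items()
        if (type_a, type_b) in pairs
    )
```

Under these assumptions, the type filter would discard inadmissible entity pairs before classification, and a matrix of kernel values between the remaining candidates could be passed to an SVM that accepts a precomputed kernel. Explicit enumeration is only practical for short contexts; a dynamic-programming formulation would be needed for longer ones.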
