Issues and methodology for template design for information extraction

  • Onyshkevych B

Abstract

The goal of Information Extraction tasks is to identify, categorize, classify, relate, and normalize specific information of interest found in free text, and to make that information available to a back-end database, data fusion, or other application. A data structure referred to as a template is typically used for capturing such information, particularly in cases where the amount and complexity of information is substantial. The design of the template for such an application (or exercise) thus defines the task itself and therefore crucially affects the success of the Information Extraction attempt. This paper discusses template structure and methodological issues that arise in the template design process, within the context of a discussion of the design process itself; it is based on the template design process for TIPSTER/MUC5 and certain subsequent Information Extraction exercises. The first section of this paper addresses the selection of the appropriate data representation (text annotation vs. flat template representation vs. object-oriented template). The second section outlines a set of high-level design considerations (desiderata) that have emerged; these desiderata feed into the discussion of design elements and a procedural review of the design process (design iterations, use of linguistic analysis tools, etc.).
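
To make the three data representations named in the abstract concrete, the following is a minimal Python sketch of how one extracted fact might look under each of them. It is only an illustration under stated assumptions: the example sentence, the slot names (event_type, partner_1, partner_2), the label ORGANIZATION, the helper annotate, and the classes Organization and JointVenture are all hypothetical and are not taken from the paper or from the TIPSTER/MUC5 template definitions.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical example sentence, invented for illustration only.
TEXT = "Acme Corp. formed a joint venture with Beta Ltd."


def annotate(text: str, phrase: str, label: str) -> dict:
    """Return a labelled character span for the first occurrence of phrase."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in text")
    return {"start": start, "end": start + len(phrase), "label": label}


# 1. Text annotation: labelled spans that point back into the source text.
annotations = [
    annotate(TEXT, "Acme Corp.", "ORGANIZATION"),
    annotate(TEXT, "Beta Ltd.", "ORGANIZATION"),
]

# 2. Flat template: one record with a fixed set of named slots.
flat_template = {
    "event_type": "JOINT_VENTURE",
    "partner_1": "Acme Corp.",
    "partner_2": "Beta Ltd.",
}


# 3. Object-oriented template: separate objects linked by references,
#    so one entity can be shared by several relations.
@dataclass
class Organization:
    name: str


@dataclass
class JointVenture:
    partners: List[Organization] = field(default_factory=list)


acme = Organization("Acme Corp.")
beta = Organization("Beta Ltd.")
venture = JointVenture(partners=[acme, beta])
```

In this sketch the annotation form keeps the link to the exact text span, the flat form is the simplest to load into a database table, and the object-oriented form lets the same Organization object be referenced from more than one event. Which trade-off matters most depends on the application that consumes the extracted data.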

Authors

  • Boyan Onyshkevych
