Issues and methodology for template design for information extraction

Boyan Onyshkevych

Conference Proceedings, Open Access

DOI: 10.3115/1075812.1075848

Abstract

The goal of Information Extraction tasks is to identify, categorize, classify, relate, and normalize specific information of interest found in free text, and to make that information available to a back-end database, data fusion, or other application. A data structure referred to as a template is typically used for capturing such information, particularly in cases where the amount and complexity of information are substantial. The design of the template for such an application (or exercise) thus defines the task itself and therefore crucially affects the success of the Information Extraction attempt. This paper discusses template structure and methodological issues that arise in the template design process, within the context of a discussion of the design process itself; it is based on the template design process for TIPSTER/MUC5 and certain subsequent Information Extraction exercises. The first section of this paper addresses the selection of the appropriate data representation (text annotation vs. flat template representation vs. object-oriented template). The second section outlines a set of high-level design considerations (desiderata) that have emerged; these desiderata feed into the discussion of design elements and a procedural review of the design process (design iterations, use of linguistic analysis tools, etc.).
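The three data representations named in the abstract can be contrasted with a small sketch. This is only an illustration under stated assumptions: the example sentence, the slot names (`EVENT_TYPE`, `PARTNER_1`, `PARTNER_2`), and the classes `Annotation`, `Entity`, and `TieUp` are hypothetical and are not taken from the paper or from the actual TIPSTER/MUC5 template definitions.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical example sentence, invented for illustration.
TEXT = "Acme Corp. formed a joint venture with Bolt Ltd."


# 1. Text annotation: labeled character spans over the source text.
@dataclass
class Annotation:
    start: int
    end: int
    label: str


def annotate(text: str, phrase: str, label: str) -> Annotation:
    """Build a span annotation for the first occurrence of phrase."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), label)


annotations: List[Annotation] = [
    annotate(TEXT, "Acme Corp.", "ORGANIZATION"),
    annotate(TEXT, "Bolt Ltd.", "ORGANIZATION"),
]

# 2. Flat template: one record with a fixed set of slots.
#    Slot names are assumptions made for this sketch.
flat_template: Dict[str, str] = {
    "EVENT_TYPE": "JOINT_VENTURE",
    "PARTNER_1": "Acme Corp.",
    "PARTNER_2": "Bolt Ltd.",
}


# 3. Object-oriented template: separate objects linked by references,
#    so one entity can be shared by several relations.
@dataclass
class Entity:
    name: str
    entity_type: str


@dataclass
class TieUp:
    partners: List[Entity]


acme = Entity("Acme Corp.", "COMPANY")
bolt = Entity("Bolt Ltd.", "COMPANY")
tie_up = TieUp(partners=[acme, bolt])

if __name__ == "__main__":
    for a in annotations:
        print(a.label, repr(TEXT[a.start:a.end]))
    print(flat_template)
    print([p.name for p in tie_up.partners])
```

In this sketch the flat record needs a new numbered slot for every additional partner, while the object-oriented form simply appends another `Entity` to the list; the annotation form stays tied to character offsets in the source text.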

Cite

Onyshkevych, B. (1994). Issues and methodology for template design for information extraction (p. 171). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1075812.1075848
