A machine learning approach to information extraction

Alberto Téllez-Valero; Manuel Montes-y-Gómez; Luis Villaseñor-Pineda

Conference Proceedings

Lecture Notes in Computer Science (2005) 3406 539-547

DOI: 10.1007/978-3-540-30586-6_58

16 Citations

64 Readers

Abstract

Information extraction applies natural language processing to automatically extract the essential details from text documents. A major disadvantage of current approaches is their intrinsic dependence on the application domain and the target language. Several machine learning techniques have been applied to make information extraction systems more portable. This paper describes a general method for building an information extraction system using regular expressions along with supervised learning algorithms. In this method, the extraction decisions are led by a set of classifiers instead of sophisticated linguistic analyses. The paper also presents a system called TOPO that extracts information related to natural disasters from newspaper articles in Spanish. Experimental results with this system indicate that the proposed method can be a practical solution for building information extraction systems, reaching an F-measure as high as 72%. © Springer-Verlag Berlin Heidelberg 2005.
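The general idea in the abstract (regular expressions propose candidate text segments, and supervised classifiers decide which candidates to extract) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the number-matching pattern, the two-word context window, the `dead`/`other` labels, the toy English sentences, and the small naive Bayes classifier. The abstract does not state which patterns, features, or learning algorithms TOPO actually uses.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical candidate pattern: every integer is a candidate segment.
CANDIDATE = re.compile(r"\b\d+\b")


def candidates(text, window=2):
    """Return (value, context_features) for every regex match in text."""
    found = []
    for m in CANDIDATE.finditer(text):
        left = text[:m.start()].lower().split()[-window:]
        right = text[m.end():].lower().split()[:window]
        features = ["L:" + w for w in left] + ["R:" + w for w in right]
        found.append((m.group(), features))
    return found


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative)."""

    def __init__(self):
        self.label_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        for features, label in examples:
            self.label_counts[label] += 1
            for f in features:
                self.feature_counts[label][f] += 1
                self.vocab.add(f)
        return self

    def predict(self, features):
        total = sum(self.label_counts.values())

        def score(label):
            n = sum(self.feature_counts[label].values())
            s = math.log(self.label_counts[label] / total)
            for f in features:
                count = self.feature_counts[label][f]
                s += math.log((count + 1) / (n + len(self.vocab)))
            return s

        return max(self.label_counts, key=score)


# Invented toy training data: one number per sentence, labeled by hand.
TRAINING = [
    ("The earthquake left 12 people dead in the city", "dead"),
    ("At least 30 people dead after the storm", "dead"),
    ("The storm destroyed 40 houses on Monday", "other"),
    ("The river rose 3 meters on Tuesday", "other"),
]


def train(data):
    """Build training examples from the regex candidates and fit a model."""
    examples = []
    for sentence, label in data:
        for _value, features in candidates(sentence):
            examples.append((features, label))
    return NaiveBayes().fit(examples)


def extract(model, text):
    """Keep only the candidates that the classifier does not label 'other'."""
    results = []
    for value, features in candidates(text):
        label = model.predict(features)
        if label != "other":
            results.append((value, label))
    return results


model = train(TRAINING)
print(extract(model, "The hurricane left 7 people dead"))
```

In this sketch the only language-specific parts are the pattern and the labeled examples, which is one way to read the abstract's portability argument: moving to another domain or language would mean supplying new patterns and training data rather than new linguistic analyses.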

Cite

CITATION STYLE

APA

Téllez-Valero, A., Montes-y-Gómez, M., & Villaseñor-Pineda, L. (2005). A machine learning approach to information extraction. In Lecture Notes in Computer Science (Vol. 3406, pp. 539–547). Springer Verlag. https://doi.org/10.1007/978-3-540-30586-6_58
