LeNER-Br: A Dataset for Named Entity Recognition in Brazilian Legal Text

Pedro Henrique Luz de Araujo; Teófilo E. de Campos; Renato R.R. de Oliveira; Matheus Stauffer; Samuel Couto; Paulo Bermejo

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11122 LNAI 313-323

DOI: 10.1007/978-3-319-99722-3_32

45 Citations

67 Readers

Abstract

Named entity recognition systems have the untapped potential to extract information from legal documents, which can improve information retrieval and decision-making processes. In this paper, a dataset for named entity recognition in Brazilian legal documents is presented. Unlike other Portuguese language datasets, this dataset is composed entirely of legal documents. In addition to tags for persons, locations, time entities and organizations, the dataset contains specific tags for law and legal case entities. To establish a set of baseline results, we first performed experiments on another Portuguese dataset: Paramopama. This evaluation demonstrates that LSTM-CRF gives results that are significantly better than those previously reported. We then retrained LSTM-CRF on our dataset and obtained F1 scores of 97.04% and 88.82% for Legislation and Legal case entities, respectively. These results show the viability of the proposed dataset for legal applications.
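
To make the per-entity F1 figures in the abstract concrete, the following is a minimal sketch of how an F1 score per entity type can be computed from tag sequences. It rests on assumptions not stated in the abstract: that annotations use a BIO scheme, that scoring is entity-level with exact span match (the abstract does not say whether its scores are token-level or entity-level), and that the tag names `LEGISLACAO` and `JURISPRUDENCIA` stand in for the Legislation and Legal case classes. The function names are hypothetical and this is not the authors' evaluation code.

```python
from collections import defaultdict


def extract_spans(tags):
    """Return a set of (entity_type, start, end) spans from a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if not continues:
            if etype is not None:
                spans.add((etype, start, i))
                start, etype = None, None
            if tag.startswith("B-") or tag.startswith("I-"):
                # Assumption: a stray I- tag without a matching B- opens a new span.
                start, etype = i, tag[2:]
    return spans


def per_type_f1(gold_tags, pred_tags):
    """Entity-level F1 per type; a prediction counts only on exact span match."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    counts = defaultdict(lambda: [0, 0, 0])  # true positives, predicted, gold
    for span in pred:
        counts[span[0]][1] += 1
        if span in gold:
            counts[span[0]][0] += 1
    for span in gold:
        counts[span[0]][2] += 1
    scores = {}
    for etype, (tp, n_pred, n_gold) in counts.items():
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_gold if n_gold else 0.0
        total = precision + recall
        scores[etype] = 2 * precision * recall / total if total else 0.0
    return scores


# Toy example with hypothetical tag names (not taken from the abstract).
gold = ["B-LEGISLACAO", "I-LEGISLACAO", "O", "B-JURISPRUDENCIA", "I-JURISPRUDENCIA", "O"]
pred = ["B-LEGISLACAO", "I-LEGISLACAO", "O", "B-JURISPRUDENCIA", "O", "O"]
scores = per_type_f1(gold, pred)
```

In this toy case the Legislation span matches exactly, while the predicted Legal case span is one token short and therefore counts as a miss under exact matching. A token-level score would instead give partial credit for the overlapping token.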

Cite (APA)

Luz de Araujo, P. H., de Campos, T. E., de Oliveira, R. R. R., Stauffer, M., Couto, S., & Bermejo, P. (2018). LeNER-Br: A Dataset for Named Entity Recognition in Brazilian Legal Text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11122 LNAI, pp. 313–323). Springer Verlag. https://doi.org/10.1007/978-3-319-99722-3_32
