Bete: A Brazilian Portuguese Dataset for Named Entity Recognition and Relation Extraction in the Diabetes Healthcare Domain

Lucas Pavanelli; Yohan Bonescki Gumiel; Thiago Ferreira; Adriana Pagano; Eduardo Laber

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2023) 14197 LNAI 256-267

DOI: 10.1007/978-3-031-45392-2_17

Abstract

The biomedical NLP community has seen great advances in dataset development mostly for the English language, which has hindered progress in the field, as other languages are still underrepresented. This study introduces a dataset of Brazilian Portuguese annotated for named entity recognition and relation extraction in the healthcare domain. We compiled and annotated a corpus of health professionals’ responses to frequently asked questions in online healthcare forums on diabetes. We measured inter-annotator agreement and conducted initial experiments using up-to-date methods to recognize entities and extract relations, such as BERT-based ones. Data, models, and results are publicly available at https://github.com/pavalucas/Bete.

Citation (APA)

Pavanelli, L., Gumiel, Y. B., Ferreira, T., Pagano, A., & Laber, E. (2023). Bete: A Brazilian Portuguese Dataset for Named Entity Recognition and Relation Extraction in the Diabetes Healthcare Domain. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14197 LNAI, pp. 256–267). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-45392-2_17
